pdf to word

PDF conversion toolkit featuring AI layout analysis and OCR. Converts PDFs to Word/Docx, Markdown, JSON, PPT, CSV, HTML, and XML for seamless LLM data proces...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
by ComPDF @youna12345
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the implementation: the script wraps the ComPDFKitConversion SDK to convert PDFs/images into multiple formats. Required components (SDK package, license.xml, documentai.model) are coherent with AI layout/OCR functionality.
Instruction Scope
SKILL.md and the script instruct the agent to download two remote artifacts from download.compdf.com (license.xml and a ~525MB documentai.model) and to read the <key> field from license.xml for local SDK license verification. The skill does not request unrelated files or credentials, but it does perform network downloads on first run which users should expect and consent to.
Install Mechanism
There is no automated install spec in the registry; the SKILL.md requires the user to pip install ComPDFKitConversion. That is reasonable for a Python wrapper, but installing arbitrary pip packages has supply-chain risk — verify the package origin (PyPI vs private index), package maintainer, and integrity before running.
Credentials
The skill declares no required environment variables or credentials. It offers one optional override (COMPDF_DOCUMENT_AI_MODEL) to point to a local model file. No unrelated secrets or cloud credentials are requested.
Persistence & Privilege
The skill is not always-enabled and uses normal invocation. It does not request system-wide configuration changes or other skills' credentials. Its automatic behavior is limited to downloading vendor-provided license/model files into its own scripts/ directory.
Assessment
This skill appears to do what it says. Before installing, note the following:

  1. By default it downloads a vendor license file and a large AI model (~525 MB) from download.compdf.com on first run; expect network activity and disk use.
  2. It expects you to install a proprietary Python package (ComPDFKitConversion) via pip; verify the package source and trustworthiness (PyPI, maintainer).
  3. The downloaded license.xml contains a license key that the script reads locally for SDK verification. The script does not appear to exfiltrate that key. To avoid network downloads, pre-place license.xml and documentai.model in the scripts/ directory; alternatively, COMPDF_DOCUMENT_AI_MODEL can point to a local model file.
  4. Review the ComPDF terms and privacy policy (links are provided in License.txt) if you will process sensitive documents.

For extra assurance, inspect the installed ComPDFKitConversion package or run the script in an isolated environment before using it on production data.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0
latestvk9754btjhap85tav8f7ymx8285837vkr

Runtime requirements

📑 Clawdis

SKILL.md

pdf to word

Purpose

  • Wraps the ComPDFKitConversion Python SDK into a reusable local conversion workflow, supporting PDF / image to Word, PPT, Excel, HTML, RTF, Image, TXT, JSON, Markdown, and CSV (10 output formats in total).

Agent Skills Standard Compatibility

  • This Skill uses an Anthropic Agent Skills-compatible directory structure: pdf-to-word-docx/.
  • The entry point is SKILL.md; helper scripts are placed in scripts/.
  • The document uses $ARGUMENTS and ${CLAUDE_SKILL_DIR} conventions for distribution and execution in Claude Code / Agent Skills-compatible environments.

Input / Output

  • Input: The target format (word/excel/ppt/html/rtf/image/txt/json/markdown/csv), the PDF or image path, and the output path are passed via Skill arguments or the command line. An optional PDF password and conversion parameters may also be provided.
  • Supported input file types:
    • PDF files (.pdf)
    • Image files (.jpg/.jpeg/.png/.bmp/.tif/.tiff/.webp/.jp2/.gif/.tga)
  • Output: A file in the corresponding format (.docx, .pptx, .xlsx, .html, .rtf, image, .txt, .json, .md, .csv), or a clear error message.

Prerequisites

  • Supports Windows and macOS.
  • The conversion SDK must be installed first:
    pip install ComPDFKitConversion
    
  • On first run, the script automatically downloads license.xml from the ComPDF server and caches it in the scripts/ directory:
    https://download.compdf.com/skills/license/license.xml
    
  • The script reads the <key>...</key> field from license.xml and uses that key for LibraryManager.license_verify(...) authentication — it does not pass the XML file path directly to the SDK.
  • To use a custom license, place your own license.xml in the scripts/ directory; the script will use it directly without downloading.
  • During SDK initialization, the resource directory is always set to the directory containing pdf-to-word-docx.py, i.e., the scripts/ directory itself.
  • When --enable-ocr or --enable-ai-layout is active (the latter is enabled by default), the Skill also requires scripts/documentai.model. If the file does not exist, the script automatically downloads it from:
    https://download.compdf.com/skills/model/documentai.model
    
  • To reuse an existing model file, you can override the default model path via an environment variable:
    export COMPDF_DOCUMENT_AI_MODEL="/path/to/documentai.model"
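
The key-extraction step described above can be sketched with the standard library. This is a minimal sketch, not the script's actual code: the helper name read_license_key is hypothetical, and because the surrounding layout of license.xml is not documented here, it simply looks for the first <key> element at any depth (XML namespaces are not handled).

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def read_license_key(license_path: Path) -> str:
    """Return the text of the first <key> element in license.xml.

    Hypothetical helper: the real file layout is an assumption, so the
    lookup accepts <key> as the root element or anywhere below it.
    """
    root = ET.parse(license_path).getroot()
    node = root if root.tag == "key" else root.find(".//key")
    key = (node.text or "").strip() if node is not None else ""
    if not key:
        # Mirrors the documented behavior: a missing or empty <key>
        # must stop the run before SDK authentication is attempted.
        raise ValueError(f"{license_path} has no non-empty <key> field")
    return key
```

The returned string is what would then be handed to LibraryManager.license_verify(...); the exact arguments of that call are not shown in this document, so they are left out here.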
    

Workflow

  1. Confirm the Python package is installed:
    python -m pip show ComPDFKitConversion
    
  2. The script automatically downloads license.xml on first run; the scripts/ directory is used directly as the SDK resource path.
  3. In Agent Skills / Claude Code environments, prefer using the Skill's built-in script path variable:
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" ppt input.pdf output.pptx
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" excel input.pdf output.xlsx
    
  4. For more control, append common parameters:
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" excel input.pdf output.xlsx --page-ranges "1-3,5" --excel-all-content --excel-worksheet-option for-page
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx --enable-ocr --page-layout-mode flow
    
  5. On startup, the script ensures scripts/license.xml exists (downloading it automatically from the ComPDF server if missing), reads the <key> field for SDK authentication, and uses the scripts/ directory as the resource path.
  6. If --enable-ocr or --enable-ai-layout (enabled by default) is active, the script checks whether scripts/documentai.model exists; if not, it downloads the file automatically before initializing the Document AI model.
  7. Check the return code; if it is not SUCCESS, handle license, password, resource, model, or input file issues according to the error name.
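
For callers that drive the script from Python rather than a shell, the workflow above might be wrapped as follows. This is a sketch under assumptions: build_command and run_conversion are hypothetical helpers, the skill directory is passed in explicitly, and the script is assumed to signal failure through a non-zero exit status with the error name on stderr (the text above only says to check the return code).

```python
import subprocess
import sys
from pathlib import Path


def build_command(skill_dir, fmt, input_path, output_path, extra_args=()):
    """Assemble the argv list for scripts/pdf-to-word-docx.py."""
    script = Path(skill_dir) / "scripts" / "pdf-to-word-docx.py"
    return [sys.executable, str(script), fmt, str(input_path),
            str(output_path), *extra_args]


def run_conversion(skill_dir, fmt, input_path, output_path, extra_args=()):
    """Run one conversion and raise with the script's stderr on failure.

    Assumption: a failed conversion is reported through a non-zero exit
    status, with the error name written to stderr.
    """
    cmd = build_command(skill_dir, fmt, input_path, output_path, extra_args)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"conversion failed: {result.stderr.strip()}")
    return result
```

Passing the arguments as a list avoids shell quoting issues with values such as --page-ranges "1-3,5".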

documentai.model Download Optimization

  • The script preferentially uses the model file pointed to by COMPDF_DOCUMENT_AI_MODEL.
  • The default model path is scripts/documentai.model.
  • During automatic download, the file is first written to documentai.model.part and then atomically renamed to the final file upon success, preventing partial file corruption.
  • On download failure, the script retries automatically with back-off intervals of 2s / 5s / 10s.
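
The download behavior listed above (environment override, .part file, atomic rename, 2s / 5s / 10s back-off) could look roughly like this. It is an illustrative sketch rather than the script's implementation: the function names are hypothetical, and treating the three intervals as "one initial attempt plus three retries" is an assumption.

```python
import os
import time
import urllib.request
from pathlib import Path

BACKOFF_SECONDS = (2, 5, 10)  # intervals quoted in the text above


def resolve_model_path(scripts_dir):
    """Prefer COMPDF_DOCUMENT_AI_MODEL, else scripts/documentai.model."""
    override = os.environ.get("COMPDF_DOCUMENT_AI_MODEL")
    return Path(override) if override else Path(scripts_dir) / "documentai.model"


def fetch_to_file(url, dest):
    """Stream url into dest in 1 MiB chunks (default fetcher)."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        while chunk := response.read(1 << 20):
            out.write(chunk)


def download_atomically(url, target, fetch=fetch_to_file, sleep=time.sleep):
    """Download to '<target>.part', then rename into place on success."""
    target = Path(target)
    part = target.with_name(target.name + ".part")
    # Assumed policy: one initial attempt plus one retry per interval.
    for delay in (*BACKOFF_SECONDS, None):
        try:
            fetch(url, part)
            os.replace(part, target)  # atomic when both paths share a filesystem
            return target
        except OSError:  # urllib's URLError is a subclass of OSError
            part.unlink(missing_ok=True)
            if delay is None:
                raise
            sleep(delay)
```

Because the final file only appears through os.replace, an interrupted download never leaves a truncated documentai.model behind. The fetch and sleep parameters exist only so the retry logic can be exercised without network access.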

Invoking Directly as a Skill

  • In environments that support Agent Skills, the Skill can be called directly:
    /pdf-to-word-docx word input.pdf output.docx
    /pdf-to-word-docx excel input.pdf output.xlsx --excel-worksheet-option for-page
    
  • When the Skill receives arguments, it passes them through to the script as-is:
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" $ARGUMENTS
    
  • If the environment does not support direct Skill invocation, fall back to a regular command-line call.

Supported Output Formats

  • word → calls CPDFConversion.start_pdf_to_word
  • excel → calls CPDFConversion.start_pdf_to_excel
  • ppt → calls CPDFConversion.start_pdf_to_ppt
  • html → calls CPDFConversion.start_pdf_to_html
  • rtf → calls CPDFConversion.start_pdf_to_rtf
  • image → calls CPDFConversion.start_pdf_to_image
  • txt → calls CPDFConversion.start_pdf_to_txt
  • json → calls CPDFConversion.start_pdf_to_json
  • markdown → calls CPDFConversion.start_pdf_to_markdown
  • csv → reuses CPDFConversion.start_pdf_to_excel with table/Excel parameters to produce CSV-friendly output
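
The mapping above lends itself to a small dispatch table. The sketch below is one possible shape, with the method names copied from the list; resolve_converter is a hypothetical helper, and nothing is imported from the SDK because the call signatures of the start_pdf_to_* methods are not given in this document.

```python
# Target format -> CPDFConversion method name, as listed above.
FORMAT_TO_METHOD = {
    "word": "start_pdf_to_word",
    "excel": "start_pdf_to_excel",
    "ppt": "start_pdf_to_ppt",
    "html": "start_pdf_to_html",
    "rtf": "start_pdf_to_rtf",
    "image": "start_pdf_to_image",
    "txt": "start_pdf_to_txt",
    "json": "start_pdf_to_json",
    "markdown": "start_pdf_to_markdown",
    "csv": "start_pdf_to_excel",  # CSV reuses the Excel entry point
}


def resolve_converter(conversion_cls, fmt):
    """Look up the conversion callable for fmt on conversion_cls.

    conversion_cls is expected to be CPDFConversion (or a stand-in
    exposing the same method names).
    """
    try:
        method_name = FORMAT_TO_METHOD[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
    return getattr(conversion_cls, method_name)
```

Ten formats map onto nine distinct methods, since csv and excel share start_pdf_to_excel and differ only in the options passed.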

Input Source Types

  • The script supports PDF and image as input sources. The SDK's start_pdf_to_* interfaces natively accept image files with no pre-processing required.
  • By default, the script auto-detects the input type from the file extension:
    • .pdf → pdf
    • .png/.jpg/.jpeg/.bmp/.tif/.tiff/.gif/.webp/.tga → image
  • You can also specify the source type explicitly:
    python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.png output.docx --source-type image
    
  • image -> * and pdf -> * share the same set of CPDFConversion.start_pdf_to_* interfaces; only the input file type differs.
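
Extension-based detection as described above can be sketched in a few lines. The function name detect_source_type is hypothetical, and raising an error for unrecognized extensions is an assumption, since the document does not say what the script does in that case.

```python
from pathlib import Path

# Extensions from the auto-detection list above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif",
                    ".tiff", ".gif", ".webp", ".tga"}


def detect_source_type(input_path, source_type="auto"):
    """Return 'pdf' or 'image'; an explicit --source-type value wins."""
    if source_type != "auto":
        return source_type
    suffix = Path(input_path).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    # Assumed behavior: fail loudly rather than guess.
    raise ValueError(f"cannot infer source type from {suffix!r}")
```

For example, detect_source_type("scan.PNG") yields "image", while detect_source_type("scan.dat", "image") trusts the explicit value.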

Smart Defaults

The script automatically adjusts certain parameters based on the input source and output format to reduce manual configuration:

| Trigger | Automatic Behavior | User-Overridable | Description |
| --- | --- | --- | --- |
| Input source is an image (auto-detected or explicit --source-type image) | Automatically enables --enable-ocr | No (--enable-ocr uses store_true; there is no --no-enable-ocr) | Text in images must be extracted via OCR; without OCR, output will contain only images and no text |
| Output format is HTML (format = html) | Automatically sets --page-layout-mode to box (box layout) | Yes; passing --page-layout-mode flow explicitly overrides this | Box layout better preserves the original formatting in HTML; specify flow explicitly if flow layout is needed |

When triggered, the script prints a notice to stderr, for example:

Auto-enabled OCR for image input.
Auto-set page layout mode to BOX for HTML output.
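
The two rules above can be expressed as a small pure function. This is a sketch with a hypothetical name (apply_smart_defaults); it assumes the parser leaves page_layout_mode as None when --page-layout-mode is not passed, which is how "explicitly passed" is distinguished from "defaulted" here.

```python
import sys


def apply_smart_defaults(source_type, fmt, enable_ocr=False,
                         page_layout_mode=None):
    """Return (enable_ocr, page_layout_mode) after the automatic rules.

    page_layout_mode=None means the user did not pass --page-layout-mode.
    """
    if source_type == "image" and not enable_ocr:
        enable_ocr = True
        print("Auto-enabled OCR for image input.", file=sys.stderr)
    if fmt == "html" and page_layout_mode is None:
        page_layout_mode = "box"
        print("Auto-set page layout mode to BOX for HTML output.",
              file=sys.stderr)
    return enable_ocr, page_layout_mode
```

An explicit --page-layout-mode flow passes through unchanged, while OCR for image input cannot be switched off because no negative flag exists.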

All Parameters

Positional Parameters

| Parameter | Description |
| --- | --- |
| format | Target format: word/excel/ppt/html/rtf/image/txt/json/markdown/csv |
| input_pdf | Input file path (PDF or image) |
| output_path | Output file path |

General Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --source-type | Option | auto | Input source type: auto/pdf/image |
| --password | String | "" | PDF open password |
| --page-ranges | String | None | Page range, e.g. 1-3,5 |
| --font-name | String | "" | Output font name |

Layout Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --enable-ai-layout | Boolean | True | AI layout analysis (disable with --no-enable-ai-layout) |
| --page-layout-mode | Option | SDK default flow (auto-switched to box for HTML output) | Page layout: box (box layout) / flow (flow layout) |

Content Retention Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --contain-image | Boolean | True | Retain images (disable with --no-contain-image) |
| --contain-annotation | Boolean | True | Retain annotations (disable with --no-contain-annotation) |
| --contain-page-background-image | Boolean | True | Retain page background images (disable with --no-contain-page-background-image) |
| --formula-to-image | Boolean | False | Convert formulas to image output |
| --transparent-text | Boolean | False | Preserve transparent text |

Output Control Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --output-document-per-page | Boolean | False | Split output into one document per page |
| --auto-create-folder | Boolean | True | Automatically create output directory (disable with --no-auto-create-folder) |

OCR Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --enable-ocr | Boolean | False (auto-enabled for image input) | Enable OCR |
| --ocr-option | Option | SDK default all | OCR scope: invalid-character/scan-page/invalid-character-and-scan-page/all |
| --ocr-language | Multi-select | auto | OCR language(s); multiple languages can be specified simultaneously. Options: auto/chinese/chinese-tra/english/korean/japanese/latin/devanagari/cyrillic/arabic/tamil/telugu/kannada/thai/greek/eslav |

Excel-Specific Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --excel-all-content | Boolean | False | Include all content in Excel output |
| --excel-csv-format | Boolean | False | Output Excel result in CSV format |
| --excel-worksheet-option | Option | SDK default for-table | Worksheet split strategy: for-table/for-page/for-document |

JSON-Specific Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --json-contain-table | Boolean | True | Include table data in JSON output (disable with --no-json-contain-table) |

TXT-Specific Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --txt-table-format | Boolean | True | Enable table formatting in TXT output (disable with --no-txt-table-format) |

HTML-Specific Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --html-option | Option | SDK default single-page | HTML output mode: single-page/single-page-with-bookmark/multiple-page/multiple-page-with-bookmark |

Image-Specific Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| --image-type | Option | SDK default jpg | Image output format: jpg/jpeg/jpeg2000/png/bmp/tiff/tga/gif/webp |
| --image-color-mode | Option | SDK default color | Image color mode: color/gray/binary |
| --image-scaling | Float | 1.0 | Image scaling factor |
| --image-path-enhance | Boolean | False | Enable image path enhancement |

Parameter Default Value Rules

  • Parameters that default to True (--enable-ai-layout/--contain-image/--contain-annotation/--contain-page-background-image/--auto-create-folder/--json-contain-table/--txt-table-format) use BooleanOptionalAction; pass --no-xxx to disable.
  • Parameters that default to False (--enable-ocr/--formula-to-image/--transparent-text/--output-document-per-page/--excel-all-content/--excel-csv-format/--image-path-enhance) use store_true; passing the flag enables them.
  • All CLI parameter defaults are fully consistent with the SDK's ConvertOptions() defaults — omitting a parameter is equivalent to using the SDK's original default value.
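
The two flag styles described above correspond to standard argparse actions. The sketch below shows one representative of each, plus the multi-value --ocr-language option; it is a reduced illustration, not the script's full parser, and the list-valued default for --ocr-language is an assumption. argparse.BooleanOptionalAction requires Python 3.9 or later.

```python
import argparse


def build_parser():
    """Reduced parser showing the three option styles used by the CLI."""
    parser = argparse.ArgumentParser(prog="pdf-to-word-docx.py")
    parser.add_argument("format")
    parser.add_argument("input_pdf")
    parser.add_argument("output_path")
    # Default-True option: BooleanOptionalAction also generates the
    # matching --no-enable-ai-layout flag.
    parser.add_argument("--enable-ai-layout",
                        action=argparse.BooleanOptionalAction, default=True)
    # Default-False option: store_true, so no --no-enable-ocr exists.
    parser.add_argument("--enable-ocr", action="store_true")
    # Multi-select: one or more language names after the flag.
    parser.add_argument("--ocr-language", nargs="+", default=["auto"])
    return parser
```

With this parser, `word in.pdf out.docx --no-enable-ai-layout --ocr-language chinese english` yields enable_ai_layout=False, enable_ocr=False, and ocr_language=["chinese", "english"].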

Recommended Command Examples

PDF to Word (default parameters, AI layout analysis enabled)

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx

PDF to Word, box layout, no images, no AI layout analysis

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx --no-enable-ai-layout --no-contain-image --page-layout-mode box

PDF to Word, retain annotations and background images (both on by default), one document per page

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx --output-document-per-page

PDF to Excel, include all content and split worksheets by page

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" excel input.pdf output.xlsx --excel-all-content --excel-worksheet-option for-page

PDF to TXT, with table formatting enabled (the default)

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" txt input.pdf output.txt

PDF to HTML, multi-page with bookmarks mode

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" html input.pdf output_dir --html-option multiple-page-with-bookmark

PDF to Image, PNG format, grayscale, 2x scaling

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" image input.pdf output.png --image-type png --image-color-mode gray --image-scaling 2.0

Image to Word (OCR auto-enabled, specify Chinese language)

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.png output.docx --ocr-language chinese

Note: For image input, the script automatically enables OCR — there is no need to pass --enable-ocr manually. To specify an OCR language, --ocr-language can still be used.

PDF with OCR enabled (multiple languages)

python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" word input.pdf output.docx --enable-ocr --ocr-language chinese english japanese

Trial License and Usage Limits

  • The scripts/license.xml auto-downloaded from the ComPDF server is a Trial License, allowing a maximum of 200 conversions.
  • The script uses a SHA-256 fingerprint to detect whether the current License is the default trial key; no usage limit applies when using any other License.
  • After each successful conversion using the trial License, the script prints the current used/remaining count to stderr, for example:
    Trial license: 5/200 conversions used, 195 remaining.
    
  • When the trial limit is reached (200 conversions), the script refuses to convert and prompts the user to purchase a full License:
    Error: Trial license usage limit reached (200 conversions). Please purchase a license at: https://www.compdf.com/contact-sales
    
  • When the trial License has expired (SDK authentication fails), the error message also includes a purchase link.
  • After purchasing a full License, place a custom license.xml containing the new <key> in scripts/ (overwriting the auto-downloaded trial file) — no script modifications or counter file cleanup are required.
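
The trial-limit logic above might be structured as follows. Everything specific here is an assumption made for illustration: the helper names, the plain-text counter file, and the way the trial fingerprint is supplied (the actual fingerprint value and counter storage are not documented). Only the 200-conversion limit and the message formats come from the text.

```python
import hashlib
from pathlib import Path

TRIAL_LIMIT = 200  # maximum conversions under the trial license


def is_trial_key(key, trial_fingerprint):
    """Compare the SHA-256 hex digest of key with the known fingerprint."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest() == trial_fingerprint


def read_count(counter_file):
    """Read the hypothetical usage counter; a missing file counts as 0."""
    counter_file = Path(counter_file)
    return int(counter_file.read_text()) if counter_file.exists() else 0


def ensure_quota(counter_file):
    """Refuse to convert once the trial limit has been reached."""
    if read_count(counter_file) >= TRIAL_LIMIT:
        raise RuntimeError(
            "Trial license usage limit reached (200 conversions). "
            "Please purchase a license at: "
            "https://www.compdf.com/contact-sales")


def record_trial_use(counter_file):
    """Increment the counter after a successful conversion."""
    used = read_count(counter_file) + 1
    Path(counter_file).write_text(str(used))
    return used


def trial_status(used):
    """Format the notice printed to stderr after each conversion."""
    return (f"Trial license: {used}/{TRIAL_LIMIT} conversions used, "
            f"{TRIAL_LIMIT - used} remaining.")
```

Storing only a fingerprint lets the script recognize the trial key without embedding the key itself; any other key skips the counter entirely.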

Confirmed Facts

  • ComPDFKitConversion 3.9.0 has been successfully installed on the local machine.
  • The installed package provides 10 conversion methods including CPDFConversion.start_pdf_to_word/start_pdf_to_ppt/start_pdf_to_excel.
  • LibraryManager provides initialize, license_verify, release, set_document_ai_model, and set_ocr_language.
  • Official documentation confirms support for PDF to Word / Excel / PPT / HTML / RTF / Image / TXT / JSON / Markdown.
  • The SDK's start_pdf_to_* interfaces natively accept image file input (PNG → Word has been verified successfully).
  • enable_ai_layout defaults to True in the SDK; set_document_ai_model() must be called first to load the model before use, otherwise a 0xC0000005 crash will occur.
  • --ocr-language supports specifying multiple languages simultaneously (e.g. --ocr-language chinese english).

Risks / Notes

  • The official requirements page states Python >=3.6 and the demo page states <3.11, yet PyPI currently provides a cp314 wheel. Treat the wheel that actually installs locally as the source of truth, and always verify installation in a fresh environment first.
  • If the script cannot download license.xml from the server (network issue) and no manual file exists in scripts/, or the <key> field is empty, the script cannot complete SDK authentication and cannot perform any real conversions.
  • documentai.model is a large file (approximately 525 MB); there will be a noticeable download delay the first time OCR / AI layout is enabled. Because --enable-ai-layout defaults to True, the model download will be triggered on the very first run.
  • If the runtime environment cannot access https://download.compdf.com/skills/model/documentai.model, place documentai.model in the scripts/ directory in advance.
  • Do not directly apply the initialization patterns from ComPDF SDKs for other languages to the Python package; this Skill is based on the locally verified LibraryManager / CPDFConversion API.

Resource Navigation

  • License file: License.txt
  • Script: scripts/pdf-to-word-docx.py
  • SDK authentication file: scripts/license.xml (auto-downloaded from https://download.compdf.com/skills/license/license.xml if missing)
  • SDK authentication source: the <key> field in license.xml
  • SDK resource path: scripts/
  • OCR / AI layout model: scripts/documentai.model (auto-downloaded if missing)
  • Purchase a full License: https://www.compdf.com/contact-sales
  • Official documentation:
    • https://www.compdf.com/guides/conversion-sdk/python/overview
    • https://www.compdf.com/guides/conversion-sdk/python/pdf-to-word
    • https://www.compdf.com/guides/conversion-sdk/python/pdf-to-excel
    • https://www.compdf.com/guides/conversion-sdk/python/pdf-to-ppt
    • https://www.compdf.com/guides/conversion-sdk/python/apply-license

Acceptance Checklist

  • python -m pip show ComPDFKitConversion shows the installed package
  • Running python "${CLAUDE_SKILL_DIR}/scripts/pdf-to-word-docx.py" --help or an equivalent local command produces normal output
  • The script auto-downloads scripts/license.xml if missing, then extracts the license key from the <key> field for authentication
  • The script uses the scripts/ directory as the SDK resource path
  • The script recognizes all 10 target formats: word/excel/ppt/html/rtf/image/txt/json/markdown/csv
  • The script accepts both PDF and image files (.png/.jpg/.jpeg/.bmp/.tif/.tiff/.gif/.webp/.tga) as input
  • When --enable-ocr or --enable-ai-layout (enabled by default) is active and documentai.model is missing, the script auto-downloads the model
  • When license.xml cannot be obtained (download fails and no manual file exists) or authentication fails, a clear error is output rather than a silent failure
  • The 7 parameters that default to True can be disabled with --no-xxx
  • --ocr-language supports specifying multiple languages simultaneously
  • After a conversion using the trial License, the usage count increments
  • When the trial License reaches 200 conversions, the script refuses to convert and outputs a purchase link
  • When using a non-trial License, no usage limit applies
  • For image input, even if --enable-ocr is not passed, the script automatically enables OCR and prints a notice to stderr
  • For HTML output, even if --page-layout-mode is not passed, the script automatically uses box (box layout) and prints a notice to stderr
  • For HTML output, explicitly passing --page-layout-mode flow overrides the automatic box layout behavior
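
The first checklist item can also be checked from Python without shelling out to pip. A minimal sketch using importlib.metadata (Python 3.8+); the helper name installed_version is hypothetical, and the distribution name is taken from the pip install command in the prerequisites.

```python
from importlib import metadata


def installed_version(package="ComPDFKitConversion"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

A None result means the pip install step still needs to be run in the current environment.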

Distribution Notes

  • This Skill does not depend on any machine-specific absolute paths.
  • When distributing to other users, the following directory structure is sufficient:
    pdf-to-word-docx/
    ├── SKILL.md
    ├── License.txt
    └── scripts/
        └── pdf-to-word-docx.py
    
  • Users place this directory under their own skills root directory and the Skill is ready to use.
  • license.xml is auto-downloaded at runtime; no need to include it in the distribution package.

Common Pitfalls

  • scripts/license.xml is missing and cannot be auto-downloaded (network unavailable or server error): the script will error out before authentication. If you are in an offline environment, place license.xml manually in the scripts/ directory.
  • scripts/license.xml is missing the <key> field or its value is empty: the script will error out before authentication.
  • Resource files required by the SDK are absent from the scripts/ directory: conversion may fail after LibraryManager.initialize().
  • A password-protected PDF is provided without --password: this will trigger PDF_PASSWORD_ERROR.
  • OCR / AI layout is enabled but documentai.model is not present locally and the network is unavailable: the model download will fail; place the file in the scripts/ directory manually in advance.
  • When the Excel output strategy is unclear, prefer passing --excel-worksheet-option explicitly to avoid unexpected result structures.
  • When converting images to other formats, the script already enables OCR automatically; if the output still contains no text, check whether documentai.model is complete and whether the OCR language matches.
  • Once the trial License usage limit is exhausted, a full License must be purchased to continue; purchase link: https://www.compdf.com/contact-sales.
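
For offline environments, several of the pitfalls above can be caught before the script runs. The following preflight check is a hypothetical helper, not part of the Skill; it only verifies that the files named in this document are where the script expects them.

```python
import os
from pathlib import Path


def preflight(scripts_dir, needs_model=True):
    """Return a list of problems; an empty list means ready to run.

    needs_model should be True when OCR or AI layout is active
    (AI layout is enabled by default).
    """
    scripts_dir = Path(scripts_dir)
    problems = []
    if not (scripts_dir / "license.xml").is_file():
        problems.append("license.xml not found in scripts/")
    if needs_model:
        override = os.environ.get("COMPDF_DOCUMENT_AI_MODEL")
        model = Path(override) if override else scripts_dir / "documentai.model"
        if not model.is_file():
            problems.append(f"model file not found: {model}")
    return problems
```

The check does not validate file contents; an empty <key> field or a truncated model would still surface only when the script runs.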

Copyright

This Skill is built on top of the ComPDFKit Conversion SDK.

© 2014-2026 PDF Technologies, Inc., a KDAN Company. All Rights Reserved.

Important: Under the ComPDFKit Terms of Service, distributing the documentation, sample code, or source code of the ComPDFKit Conversion SDK to third parties is prohibited. Please ensure you have obtained a valid ComPDFKit License before using this Skill.
