Install
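openclaw skills install ppx-parse

Parse PDFs and images into Markdown/JSON using the `ppx` CLI. Use when the user asks to OCR scanned PDFs or screenshots, extract tables from PDFs, convert a PDF or image to Markdown, preserve document layout, or inspect parsing output. Also triggers on the Chinese phrases 解析PDF (parse PDF), 图片转文字 (image to text), 扫描件识别 (scanned document recognition), 扫描件转文字 (scanned document to text), 提取表格 (extract tables), PDF转Markdown (PDF to Markdown), 文档解析 (document parsing), OCR识别 (OCR recognition), 识别图片文字 (recognize text in images), 解析图片 (parse image), and 提取文档内容 (extract document content).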
Use the local `ppx` CLI to parse PDFs and images into structured Markdown and JSON.

- Python `>= 3.12`.
- If `ppx` is missing, read `references/troubleshooting.md` and create a virtual environment before installing dependencies.
- Version synchronized from the repository `pyproject.toml` with `scripts/sync_version.py`.

- Check that Python is `>= 3.12`.
- Check the environment with `scripts/check_ppx_env.sh`.
- If `ppx` is missing, create or use a virtual environment and install PPX there.
- Use `--ocr auto` by default.
- Use `--ocr yes` for scanned PDFs or screenshots.
- Use `--ocr no` for native PDFs when OCR causes noise.
- Use `--table auto` by default.
- Use `--table llm` only when the user needs the highest table accuracy and an LLM backend is configured.
- Run `ppx parse <input> -o <output>`.

The output directory contains:

- `doc.md`
- `doc.json`
- `pages/`
- `images/` when figures are extracted

Consult `references/` as needed.

ppx parse report.pdf -o output/
ppx parse scan.pdf --ocr yes -o output/
ppx parse figure.png -o output/
ppx parse report.pdf --pages "1-5,10" -o output/
ppx parse report.pdf --table llm --backend deepseek -o output/
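
The OCR guidance can be sketched as a small shell helper that picks the `--ocr` value from a description of the input. This is only an illustration: the function name `build_ppx_cmd` and the kind labels `scanned` and `native-noisy` are hypothetical and not part of `ppx`. The helper emits only flags that appear in this document, and it prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print a ppx command for a given document kind.
# The kinds ("scanned", "native-noisy", anything else) are illustrative
# labels, not ppx options. Only flags shown in this document are emitted.
build_ppx_cmd() {
    input=$1
    output=$2
    kind=${3:-default}
    case "$kind" in
        scanned)      ocr=yes ;;   # scanned PDFs or screenshots
        native-noisy) ocr=no ;;    # native PDFs where OCR adds noise
        *)            ocr=auto ;;  # everything else
    esac
    printf 'ppx parse %s --ocr %s -o %s\n' "$input" "$ocr" "$output"
}

# Print the command for a scanned PDF.
build_ppx_cmd scan.pdf output/ scanned
```

Printing the command first makes it easy to review the chosen flag before running anything. Paths containing spaces would need quoting before the printed line is executed.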

- Inspect `doc.md`, `doc.json`, or page-level files.
- Read `references/cli-options.md` when choosing parse flags.
- Read `references/backend-config.md` when using DeepSeek, Paddle, or GLM backends.
- Read `references/troubleshooting.md` when PPX is missing, Python is too old, or runtime dependencies fail.
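
The two failure cases named above (PPX missing, Python too old) can be detected with a short preflight check. The sketch below is an assumption-laden illustration and is not the contents of `scripts/check_ppx_env.sh`: the function names `version_ok` and `preflight` are hypothetical, and it assumes the interpreter is available as `python3`.

```shell
#!/bin/sh
# Hypothetical preflight sketch; this is NOT scripts/check_ppx_env.sh.

# Succeeds when a "major.minor[.patch]" version string is at least 3.12.
version_ok() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # drop any patch component
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 12 ]; }
}

# Reports whether ppx is on PATH and Python is new enough.
# Assumes the interpreter is called python3 (an assumption, not from the docs).
preflight() {
    if ! command -v ppx >/dev/null 2>&1; then
        echo "ppx not found: see references/troubleshooting.md"
        return 1
    fi
    pyver=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])') || return 1
    if ! version_ok "$pyver"; then
        echo "Python $pyver is older than 3.12"
        return 1
    fi
    echo "environment looks usable"
}
```

Calling `preflight` before `ppx parse` gives an early, readable failure. When it fails, `references/troubleshooting.md` remains the place to look for the actual fix.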