Zerox
Convert PDFs, DOCX, PPTX, and images to Markdown using zerox with GPT-4o vision, including OCR for scanned documents.
MIT-0 · Free to use, modify, and redistribute. No attribution required.
⭐ 0 · 646 · 1 current installs · 1 all-time installs
by @otacu
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name/description (convert PDFs/DOCX/PPTX/images to Markdown using zerox and GPT-4o vision) matches the included scripts and package dependency on 'zerox'. Requiring node and an API key for an external model gateway (APIYI_API_KEY) is consistent with calling a hosted model provider.
Instruction Scope
The runtime scripts do more than just run a converter: they will read an API key from process.env.APIYI_API_KEY or attempt to read ~/.openclaw/.env if the env var is absent (this config path was not declared in the registry metadata). The scripts upload document content to a remote model provider (via zerox), which means your documents will be transmitted off-machine; this privacy-affecting behavior is not emphasized in SKILL.md. README additionally instructs modifying the zerox package's openAI.js to point to https://api.apiyi.com/v1 — modifying node_modules is unusual and increases risk.
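The lookup order described above can be sketched as follows. This is a minimal illustration, not the skill's actual source: the function name `resolveApiKey` and its return shape are hypothetical, and the dotenv parsing is deliberately simplistic. Only the variable name `APIYI_API_KEY` and the path `~/.openclaw/.env` come from the scan report.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper mirroring the reported lookup order:
// environment variable first, then ~/.openclaw/.env as a fallback.
function resolveApiKey(
  env = process.env,
  envFile = path.join(os.homedir(), ".openclaw", ".env")
) {
  // 1. An explicitly set environment variable wins.
  if (env.APIYI_API_KEY) {
    return { key: env.APIYI_API_KEY, source: "env" };
  }

  // 2. Otherwise try the dotenv-style file; a missing file is not an error.
  let text;
  try {
    text = fs.readFileSync(envFile, "utf8");
  } catch {
    return { key: null, source: null };
  }

  for (const line of text.split(/\r?\n/)) {
    const match = line.match(/^\s*APIYI_API_KEY\s*=\s*(.*?)\s*$/);
    if (!match) continue;
    // Strip one pair of matching surrounding quotes, if present.
    const value = match[1].replace(/^(['"])(.*)\1$/, "$2");
    if (value) return { key: value, source: envFile };
  }
  return { key: null, source: null };
}
```

Returning the `source` alongside the key makes it easy to log which location supplied the credential, which is the detail the registry metadata leaves undeclared.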
Install Mechanism
There is no registry install spec. package.json depends on 'zerox' (npm). README instructs running npm install and editing node_modules to change endpoints. Installing a third-party npm package is a normal step, but the README's suggestion to edit dependency source code (openAI.js) to use a third-party gateway increases the attack surface and is atypical.
Credentials
Only a single credential (APIYI_API_KEY) is required, which is proportionate for a gateway to a model provider. However, the code will read that key either from the environment or from ~/.openclaw/.env (not declared as a required config path). Also, that key will grant the skill the ability to send arbitrary document contents to the remote API, so its scope is sensitive and should be treated as high-privilege for data exfiltration concerns.
Persistence & Privilege
The skill does not set always:true, does not modify other skills or system-wide settings, and only writes logs and output into its own output directory. It does spawn detached background processes and issues macOS notifications (osascript), which are expected for a background converter and are scoped to the skill directory.
What to consider before installing
This skill will send your document contents to a remote model provider (via the zerox library). Before installing or running it, consider:
1) Privacy: do not use this on sensitive documents unless you trust the target API (api.apiyi.com or OpenAI endpoints) and its data-retention policy.
2) API key scope: provide a dedicated key with minimal scope or billing limits; storing it in ~/.openclaw/.env is supported by the script but not declared, so prefer setting the env var explicitly.
3) README asks you to edit node_modules/zerox/openAI.js to point to api.apiyi.com; editing installed package code is unusual and increases risk, so review that file to confirm where requests are sent.
4) Network & dependency risk: installing the 'zerox' npm package will add code that runs network requests; audit dependency versions and source before use.
5) If you need stronger assurance, request the maintainer/source repo, verify the zerox version and its openAI adapter, or run the conversion on a machine without network access to confirm behavior.
These items make the skill coherent but worthy of caution.
Like a lobster shell, security has layers: review code before you run it.
Current version v0.1.0
Runtime requirements
📄 Clawdis
Bins: node
Env: APIYI_API_KEY
Primary env: APIYI_API_KEY
SKILL.md
Zerox Document Converter
Convert various document formats to Markdown using the zerox library and GPT-4o vision.
Supported Formats
- PDF (scanned and text-based)
- Microsoft Word (DOCX)
- Microsoft PowerPoint (PPTX)
- Images (PNG, JPG, etc.)
- And more via OCR
Convert Document (Foreground)
For small files (conversions that finish in under 30 seconds):
node {baseDir}/scripts/convert.mjs <filePath> [outputPath]
Examples
# Convert PDF - saves to {baseDir}/output/document.md by default
node {baseDir}/scripts/convert.mjs "/path/to/document.pdf"
# Convert PDF with custom output path
node {baseDir}/scripts/convert.mjs "/path/to/document.pdf" "/path/to/output.md"
# Convert Word document - saves to {baseDir}/output/document.md
node {baseDir}/scripts/convert.mjs "/path/to/document.docx"
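The default output location used in these examples (input file name with its extension replaced by `.md`, placed under `{baseDir}/output/`) could be derived roughly as below. The function names are hypothetical and this is a sketch of the documented behavior, not the code in `convert.mjs`.

```javascript
const path = require("node:path");

// Hypothetical: /path/to/document.pdf -> <baseDir>/output/document.md
function defaultOutputPath(baseDir, filePath) {
  const stem = path.basename(filePath, path.extname(filePath));
  return path.join(baseDir, "output", `${stem}.md`);
}

// An explicit second argument overrides the default location.
function resolveOutputPath(baseDir, filePath, outputPath) {
  return outputPath
    ? path.resolve(outputPath)
    : defaultOutputPath(baseDir, filePath);
}
```

Note that under this scheme `document.pdf` and `document.docx` map to the same `document.md`, so converting both without a custom output path would overwrite the first result.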
Convert Document (Background)
For large files or scanned PDFs that take minutes:
node {baseDir}/scripts/convert-bg.mjs <filePath> [outputPath]
Features
- Runs conversion in background (no timeout issues)
- Logs progress to {baseDir}/output/convert-bg.log
- Sends macOS notification when complete
- Detached from terminal (safe to close)
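The features above (detached process, log file, macOS notification) map onto standard Node facilities. The following is a sketch under assumptions, not the contents of `convert-bg.mjs`: it assumes the background script re-runs `convert.mjs` as a child process, and the names `planBackgroundJob`, `launch`, and `notifyArgs` and the notification title are hypothetical. The `spawn` options and the `osascript -e 'display notification ...'` form are the standard ones.

```javascript
const { spawn } = require("node:child_process");
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical: describe the child process without starting it.
function planBackgroundJob(baseDir, filePath, outputPath) {
  const args = [path.join(baseDir, "scripts", "convert.mjs"), filePath];
  if (outputPath) args.push(outputPath);
  return {
    command: process.execPath, // the node binary currently running
    args,
    logPath: path.join(baseDir, "output", "convert-bg.log"),
  };
}

// Start the job detached, with stdout and stderr appended to the log file.
function launch(job) {
  fs.mkdirSync(path.dirname(job.logPath), { recursive: true });
  const log = fs.openSync(job.logPath, "a");
  const child = spawn(job.command, job.args, {
    detached: true, // child becomes the leader of its own process group
    stdio: ["ignore", log, log],
  });
  child.unref(); // let the parent exit without waiting for the child
  return child.pid;
}

// Arguments for `osascript` to show a macOS notification.
// JSON.stringify is used as an approximation of AppleScript string quoting.
function notifyArgs(message) {
  return ["-e", `display notification ${JSON.stringify(message)} with title "Zerox"`];
}
```

A caller would run `launch(planBackgroundJob(baseDir, file))` and, once the conversion finishes, `spawn("osascript", notifyArgs("Conversion complete"))`. Splitting the plan from the launch keeps the path logic inspectable without starting a process.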
Examples
# Convert large scanned PDF in background
node {baseDir}/scripts/convert-bg.mjs "/path/to/scanned-document.pdf"
# Monitor progress
tail -f {baseDir}/output/convert-bg.log
Requirements
APIYI_API_KEY: Your OpenAI-compatible API key (environment variable)
Notes
- The conversion uses GPT-4o vision to extract text, so it works even with scanned documents
- Large documents may take some time to process
- Output is plain Markdown text
Files
5 total
