ebook-to-md

v1.0.0

Convert PDF/PNG/JPEG/MOBI/EPUB to Markdown. Uses Baidu OCR only. Use for scanned-PDF-to-Markdown conversion, PDF OCR, image recognition, and ebook-to-Markdown conversion.

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
! Purpose & Capability
The skill's stated purpose (convert scanned PDFs, images, EPUB, and MOBI files to Markdown using Baidu OCR) matches the implementation: the code calls Baidu OCR and a Baidu document parser, and can convert MOBI/EPUB via Calibre. However, the registry metadata declares no required environment variables or primary credential, while SKILL.md and the code clearly require BAIDU_OCR_API_KEY and BAIDU_OCR_SECRET_KEY. This omission makes the metadata inconsistent with the skill's actual requirements.
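The mismatch between declared and actual environment variables can be checked mechanically. The following is a minimal sketch, not the registry's actual scanner: it assumes the skill's scripts read credentials through `os.getenv`, `os.environ.get`, or `os.environ[...]`, and that the registry metadata can be reduced to a plain list of declared names. The function name `undeclared_env_vars` and the sample source are hypothetical.

```python
import re

# Matches os.getenv("NAME"), os.environ.get("NAME"), and os.environ["NAME"].
_ENV_REF = re.compile(
    r"""os\.(?:getenv\(|environ\.get\(|environ\[)\s*["']([A-Z0-9_]+)["']"""
)


def undeclared_env_vars(source: str, declared: list[str]) -> list[str]:
    """Return env var names referenced in `source` but missing from `declared`."""
    referenced = set(_ENV_REF.findall(source))
    return sorted(referenced - set(declared))


# Hypothetical excerpt standing in for the skill's script (an assumption,
# not copied from the real code).
sample_source = '''
import os
api_key = os.getenv("BAIDU_OCR_API_KEY")
secret = os.environ.get("BAIDU_OCR_SECRET_KEY")
'''

# An empty `declared` list mirrors registry metadata that lists no env vars.
print(undeclared_env_vars(sample_source, declared=[]))
```

A non-empty result means the code depends on variables the metadata does not mention, which is the situation this finding describes.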
! Instruction Scope
SKILL.md instructs the agent to run the included script and to set Baidu OCR credentials. The implementation uploads user files (PDFs, images, and ebooks converted to PDF) to Baidu endpoints (OAuth token, OCR endpoint, paddle-vl parser) and downloads parser results and images, so user documents are transmitted to an external service (Baidu). Neither the instructions nor the code gives a strong warning about this privacy and exfiltration risk. The code also fetches image URLs found in parser-generated HTML, which could trigger additional outbound network requests.
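One way to see what those additional requests might touch is to list the hosts that image tags point to before anything is fetched. This is a rough sketch using only the standard library. It assumes the parser output is ordinary HTML with `<img src="...">` tags; the sample HTML, the `image_hosts` helper, and the `cdn.example.com` host are hypothetical and not taken from the skill.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def image_hosts(html: str) -> list[str]:
    """Return the sorted, de-duplicated network hosts that <img> tags point to."""
    collector = _ImgSrcCollector()
    collector.feed(html)
    hosts = {urlparse(src).netloc for src in collector.sources}
    hosts.discard("")  # relative paths and data: URIs have no network host
    return sorted(hosts)


# Hypothetical parser output, for illustration only.
sample_html = (
    '<p>Page 1</p>'
    '<img src="https://cdn.example.com/a.png">'
    '<img src="figures/b.png">'
)
print(image_hosts(sample_html))
```

Reviewing this list, or comparing it against an allowlist, would show whether fetching the images contacts hosts other than the expected service.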
Install Mechanism
No install spec is provided; the skill is instruction-only plus shipped Python scripts. There are no arbitrary remote downloads or package installs beyond normal Python dependencies (requests) and optional Calibre. From an installation and execution provenance perspective, this is lower risk.
! Credentials
Functionally, the skill needs Baidu API credentials (BAIDU_OCR_API_KEY, BAIDU_OCR_SECRET_KEY) to work; SKILL.md documents this, and the tests skip OCR cases if these are unset. The registry metadata, however, lists no required environment variables and declares no primary credential, an inconsistency that could mislead users into installing without realizing a cloud credential is required. No other unrelated secrets are requested.
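Because the metadata does not announce the requirement, a small preflight check can make it explicit before any file is sent anywhere. The sketch below is illustrative and is not the skill's own code; only the two variable names come from the review above, and the `missing_credentials` helper is hypothetical.

```python
import os
from collections.abc import Mapping

# The two variable names are the ones SKILL.md and the code are reported to require.
REQUIRED_VARS = ("BAIDU_OCR_API_KEY", "BAIDU_OCR_SECRET_KEY")


def missing_credentials(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with an explicit mapping, so the real environment is not touched.
example_env = {"BAIDU_OCR_API_KEY": "placeholder-key"}
print(missing_credentials(example_env))
```

Calling `missing_credentials()` with no argument inspects the real process environment. An empty list means both variables are present.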
Persistence & Privilege
The skill does not request persistent/always-on inclusion and does not modify other skills or system-wide settings. It does optionally load a .env file via python-dotenv if present (standard behavior), but this is limited and expected for a script that needs API keys.
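To make the .env behavior concrete, here is a simplified, standard-library stand-in for what loading a .env file does. It is not python-dotenv and does not reproduce its full parsing rules (no `export` prefix, multiline values, or variable expansion). It only mirrors the default of leaving already-set variables untouched. The `load_env_file` name is hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path: str, env=os.environ) -> list[str]:
    """Load KEY=VALUE lines from `path` into `env`; return the names that were set.

    Variables already present in `env` are left untouched.
    """
    file = Path(path)
    if not file.is_file():
        return []  # a missing file is not an error, nothing is loaded
    loaded = []
    for raw in file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop surrounding quotes, if any
        if key and key not in env:
            env[key] = value
            loaded.append(key)
    return loaded
```

Passing a plain dict as `env` shows which keys a given file would introduce without modifying the real environment.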
What to consider before installing
This skill will upload the documents you give it (PDFs, images, converted EPUB/MOBI) to Baidu OCR and document-parser services to produce Markdown. If those documents contain sensitive or private information, do not use this skill unless you are comfortable sending that data to Baidu. Also note a metadata mismatch: the registry lists no required environment variables, but SKILL.md and the code require BAIDU_OCR_API_KEY and BAIDU_OCR_SECRET_KEY, so verify that you supply credentials knowingly. If you need offline processing or stronger privacy guarantees, prefer a tool that does OCR locally (e.g., Tesseract or PaddleOCR run locally), or review the code thoroughly before running. Finally, review any fixtures or tests before you run them: they may try to invoke Calibre, and OCR cases are skipped if the keys are absent.

Like a lobster shell, security has layers — review code before you run it.

latest: vk974g7n784640ep2mc34h3h1fx81r2ce
