Moark Doc Extraction
v1.0.0
Extract and recognize text from documents, including PDF and DOCX files.
by @fchange
MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description, required environment variable (GITEEAI_API_KEY), and the script all point to using Gitee AI document parsing endpoints (ai.gitee.com). The credential requested is consistent with the stated purpose.
Instruction Scope
The SKILL.md instructs the agent to run the bundled script and extract a line starting with 'EXTRACTION_RESULT:'. The script actually prints a line 'EXTRACTION_RESULT:' and then prints the extracted text on subsequent line(s) (i.e., the OCR text is not on the same line as the label). This mismatch could break naive parsers. SKILL.md also suggests displaying the result using a particular markdown/image-like syntax and asks the agent not to add commentary. Additionally, the script accepts either a local file path or a URL and will fetch URLs, which is expected for this use case but means untrusted URLs could cause the runtime to make arbitrary network requests (including to internal endpoints).
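The label/text mismatch described above can be handled by a parser that accepts both layouts. The following is a minimal sketch, not the skill's own code: the function name `parse_extraction_output` is hypothetical, and it assumes everything printed after the label line belongs to the extracted text.

```python
from typing import Optional

LABEL = "EXTRACTION_RESULT:"


def parse_extraction_output(stdout: str, label: str = LABEL) -> Optional[str]:
    """Return the extracted text from the script's stdout, or None if no label is found.

    Handles both 'EXTRACTION_RESULT: <text>' on one line and the label on its
    own line followed by the text on subsequent lines.
    Assumption: all output after the label line is part of the result.
    """
    lines = stdout.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(label):
            same_line = stripped[len(label):].strip()
            rest = "\n".join(lines[i + 1:]).strip()
            if same_line and rest:
                return same_line + "\n" + rest
            return same_line or rest
    return None
```

For example, `parse_extraction_output("EXTRACTION_RESULT:\nHello\nWorld\n")` and `parse_extraction_output("EXTRACTION_RESULT: Hello\nWorld")` both yield the same two-line string, so the caller does not need to know which layout the script used.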
Install Mechanism
No install/download mechanism is provided (instruction-only with a bundled script). Dependencies are standard Python packages (requests, requests-toolbelt) mentioned in the script comments and SKILL.md; nothing is downloaded from unknown or unsafe locations by the installer.
Credentials
Only a single environment variable (GITEEAI_API_KEY) is required and is justified by the use of the Gitee AI API. The script does not read other unrelated env vars or config paths.
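A pre-flight check for the requirements named in this report might look like the sketch below. It is an illustration only: the function name `preflight` is hypothetical, and it checks nothing beyond the GITEEAI_API_KEY variable and the two packages mentioned above (requests, and requests-toolbelt under its import name `requests_toolbelt`).

```python
import importlib.util
import os
from typing import List, Mapping


def preflight(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    # The report lists GITEEAI_API_KEY as the only required credential.
    if not env.get("GITEEAI_API_KEY"):
        problems.append("GITEEAI_API_KEY is not set")
    # find_spec looks a module up without importing it.
    for module in ("requests", "requests_toolbelt"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"missing Python package: {module}")
    return problems
```

Running `preflight()` before invoking the bundled script gives a clear message instead of a mid-run failure.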
Persistence & Privilege
The skill does not request permanent presence or elevated agent privileges (always is false) and does not modify other skills or system-wide agent settings.
Assessment
This skill appears to do exactly what it claims: upload a supplied PDF/DOCX (or fetch a URL) to the Gitee AI async document parse API and return extracted text. Before installing, consider:
1) Confirm you trust the Gitee AI service and are comfortable providing your GITEEAI_API_KEY.
2) Update your agent's output-parsing logic to handle the script's actual output format (the script prints 'EXTRACTION_RESULT:' on one line and the extracted text on following line(s) rather than 'EXTRACTION_RESULT: <text>' on a single line).
3) Be cautious about supplying document URLs from untrusted sources: the script will fetch them, which can reach internal network addresses if the runtime has network access.
4) Note the script requests include_image_base64=true, so images may be included in API responses (potentially large or sensitive).
5) Ensure the environment has the listed Python dependencies available or install them in a controlled environment before use.
Like a lobster shell, security has layers: review code before you run it.
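One way to act on the caution about untrusted document URLs is to screen each URL before handing it to the script. The sketch below is not part of the skill; the function name `is_probably_public_url` is hypothetical, and it assumes that rejecting non-HTTP(S) schemes and hosts that resolve to loopback, private, link-local, or otherwise non-routable addresses is a sufficient first filter. It does not defend against DNS rebinding, because the address can change between this check and the actual fetch.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_probably_public_url(url: str) -> bool:
    """Best-effort check that a URL points at a public HTTP(S) host."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        # Literal IP addresses resolve locally; hostnames trigger a DNS lookup.
        infos = socket.getaddrinfo(parsed.hostname, None)
    except (socket.gaierror, UnicodeError):
        return False
    for info in infos:
        # Drop any IPv6 scope suffix such as '%eth0' before parsing.
        address = ipaddress.ip_address(info[4][0].split("%")[0])
        if (address.is_private or address.is_loopback or address.is_link_local
                or address.is_reserved or address.is_multicast
                or address.is_unspecified):
            return False
    return True
```

A caller would skip or flag any URL for which this returns False, for instance `http://169.254.169.254/`, the link-local address commonly used for cloud metadata services.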
latest vk97c8a63p9qsjkg38p9z373jpx83he70
Runtime requirements
📖 Clawdis
Env: GITEEAI_API_KEY
Primary env: GITEEAI_API_KEY
