MinerU PDF Parser
Security checks across static analysis, malware telemetry, and agentic risk
Overview
The skill appears purpose-aligned for converting user-selected PDFs with MinerU, but users should be comfortable sharing PDFs and a MinerU token with that service.
Install only if you trust MinerU with the PDFs you parse and are comfortable providing a MinerU API token. Use a dedicated output folder, clear results for confidential documents, and be cautious when enabling automatic paper-workflow parsing.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The skill can make MinerU API requests using the user’s token, potentially consuming account quota or accessing account-linked processing results.
The skill clearly requires a MinerU API token and discloses that the token is sent to MinerU, which is expected for this integration but still grants account-level API authority.
You are willing to provide your MinerU API Token (`MINERU_TOKEN`); the token will be sent to https://mineru.net/
Use a dedicated MinerU token if available, keep it out of shared logs, and revoke or rotate it when no longer needed.
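One way to follow this advice is to read the token from the environment and only ever log a masked form. The sketch below is illustrative and not taken from the skill: it assumes the token is supplied in the `MINERU_TOKEN` environment variable named above, and the helper names `load_token` and `mask_token` are hypothetical.

```python
import os


def load_token(env_var: str = "MINERU_TOKEN") -> str:
    """Return the token from the environment, failing loudly if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"{env_var} is not set; export it before parsing")
    return token


def mask_token(token: str, visible: int = 4) -> str:
    """Return a log-safe form that shows only the last `visible` characters."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]
```

Logging `mask_token(load_token())` instead of the raw value keeps the credential out of shared logs while still letting you tell which token was in use.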
Private or sensitive PDFs supplied to the skill will be uploaded to the external MinerU service for processing.
For local-file mode, the script reads the specified local file and uploads its bytes to an upload URL obtained through the MinerU API; this is central to the PDF parsing purpose but means local document contents are shared externally.
with open(fp, "rb") as f:
    r = requests.put(up_url, data=f)
Only parse PDFs you are allowed to share with MinerU, and avoid using automatic workflows on directories or documents containing sensitive material unless that sharing is acceptable.
Downloaded result files may overwrite or add files inside the chosen output directory.
The script downloads a provider-supplied result ZIP and extracts it into the output directory. This is expected for retrieving parsed results, but archive extraction should be treated as a file-write operation from an external source.
r = requests.get(url, stream=True)
...
with zipfile.ZipFile(zip_path, "r") as z:
    z.extractall(out_dir)
Keep MinerU output in a dedicated results directory and avoid pointing extraction at sensitive or shared folders.
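If you want an explicit guard on top of the dedicated directory, the archive can be inspected before anything is written. The following is a minimal sketch, not the skill's code: `safe_extract` is a hypothetical helper that rejects any entry whose resolved path would fall outside the output directory. Python's `zipfile` module already attempts to sanitize such names during extraction, so this check is an additional, stricter layer that fails instead of silently rewriting paths.

```python
import os
import zipfile


def safe_extract(zip_path: str, out_dir: str) -> list[str]:
    """Extract zip_path into out_dir, refusing entries that would land outside it."""
    out_root = os.path.realpath(out_dir)
    with zipfile.ZipFile(zip_path, "r") as z:
        # First pass: validate every entry before writing anything to disk.
        for info in z.infolist():
            target = os.path.realpath(os.path.join(out_root, info.filename))
            # Absolute names and ".." segments resolve outside out_root.
            if os.path.commonpath([out_root, target]) != out_root:
                raise ValueError(f"unsafe path in archive: {info.filename!r}")
        # Second pass: all entries are inside out_root, so extract them.
        z.extractall(out_root)
        return z.namelist()
```

Validating all entries before extracting means a single bad entry aborts the operation without leaving a partially written result directory.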
Converted Markdown may remain on disk after parsing and could later be read by the user or other workflows.
The skill stores parsed document output persistently in the user’s home directory, which is expected for a parser but can retain sensitive extracted text for later use.
Parsed results are saved in the `~/.openclaw/MinerU_Results/` directory.
Delete or protect result directories when parsing confidential PDFs, and treat extracted Markdown from untrusted PDFs as untrusted content.
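Clearing results after a confidential job can be scripted. This is a hedged sketch, not part of the skill: `purge_results` is a hypothetical helper, and the path in the usage comment assumes the default results location quoted above. It deletes the directory recursively, so call it only on a folder dedicated to MinerU output.

```python
import shutil
from pathlib import Path


def purge_results(results_dir: str) -> bool:
    """Recursively delete results_dir; return True if something was removed."""
    path = Path(results_dir).expanduser()
    if not path.is_dir():
        return False  # Nothing to clean up.
    shutil.rmtree(path)
    return True


# Example (assumed default location, run only when the results are no longer needed):
# purge_results("~/.openclaw/MinerU_Results/")
```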
Installing dependencies in the wrong environment or from an untrusted package source could affect the local Python setup.
The skill relies on a user-run, unpinned Python dependency installation rather than a declarative install spec. This is common and proportionate for a small API wrapper, but users should install from a trusted package index.
pip install requests
Install dependencies in a virtual environment from the official Python package index or another trusted mirror.
