文献精读小工具 (Literature Close-Reading Tool)
Advisory
Audited by Static analysis on Apr 30, 2026.
Overview
No suspicious patterns detected.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Text from selected PDFs may leave the local machine and be processed by the configured model provider.
The extracted PDF text is merged into a prompt and sent to the configured OpenAI-compatible LLM provider.
resp = client.chat.completions.create(... messages=[{'role': 'user', 'content': merged_prompt}],)
Review config.json base_url/model/provider settings before use, and use a trusted or internal endpoint for confidential documents.
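One way to act on this recommendation is to check the configured endpoint against an allowlist before any PDF text is sent. The sketch below is illustrative only: the `TRUSTED_HOSTS` set, the function names, and the assumption that `base_url` is a top-level key in config.json are hypothetical and are not taken from the skill.

```python
import json
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts you actually trust.
TRUSTED_HOSTS = {"llm.internal.example", "localhost", "127.0.0.1"}


def is_trusted_base_url(base_url, trusted_hosts=TRUSTED_HOSTS):
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https"):
        return False
    return parsed.hostname in trusted_hosts


def check_config(config_text):
    """Parse config JSON text and check its (assumed top-level) base_url."""
    config = json.loads(config_text)
    return is_trusted_base_url(config.get("base_url", ""))
```

Running a check like this before the first request makes an unexpected endpoint fail early instead of silently receiving document text.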
Full PDFs may be transmitted to the OCR service when OCR mode is turned on.
If PaddleOCR mode is enabled, the raw PDF file is uploaded to the configured OCR job endpoint.
requests.post(job_url, headers=headers, data=data, files={'file': f}, timeout=120)
Keep use_paddleocr disabled unless needed, and only enable it with an OCR endpoint you trust for the documents being processed.
Users must provide service credentials, and storing them directly in config.json could expose them to anyone who can read that file.
The skill uses provider API keys from config.json or environment variables to authenticate OCR/LLM calls.
`api_key` supports two forms: set directly in config.json, or as an environment-variable reference: `${ENV_VAR}`
Prefer environment-variable references, use least-privilege provider keys where available, and avoid committing config files containing keys.
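For illustration, an environment-variable reference of the `${ENV_VAR}` form could be resolved roughly as follows. The skill's own resolution logic is not shown in this review, so the function name `resolve_api_key` and its error behavior are assumptions.

```python
import os
import re

# Matches a value that is exactly one ${NAME} reference.
_ENV_REF = re.compile(r"^\$\{([A-Za-z_][A-Za-z0-9_]*)\}$")


def resolve_api_key(value, environ=os.environ):
    """Return the key itself, or the value of the referenced variable."""
    match = _ENV_REF.match(value.strip())
    if match is None:
        # Not a reference: treat it as a literal key stored in the config.
        return value
    name = match.group(1)
    if not environ.get(name):
        raise KeyError(f"environment variable {name} is not set")
    return environ[name]
```

With this approach config.json holds only the variable name, so the file can be shared or committed without exposing the key.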
The exact dependency versions installed can vary over time.
The skill depends on Python packages installed from package repositories with lower-bound version ranges rather than exact pinned versions.
openai>=1.40.0
requests>=2.31.0
pdfplumber>=0.11.0
Install in a virtual environment and consider pinning or reviewing dependency versions for sensitive environments.
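As a rough sketch of pinning, the versions that actually resolved in a virtual environment can be read back and written as exact `==` lines. The helper name `pinned_requirements` and its `lookup` parameter are hypothetical; only `importlib.metadata.version` from the standard library is relied on.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the dependency list in this finding.
PACKAGES = ["openai", "requests", "pdfplumber"]


def pinned_requirements(packages=PACKAGES, lookup=version):
    """Build 'name==installed_version' lines for the given packages."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={lookup(name)}")
        except PackageNotFoundError:
            # Leave a comment line so the gap is visible in the output.
            lines.append(f"# {name} is not installed in this environment")
    return lines


if __name__ == "__main__":
    print("\n".join(pinned_requirements()))
```

The printed lines can be saved as a pinned requirements file and reviewed before reuse in a sensitive environment.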
Passing a broad folder could process and potentially send more PDFs to model providers than intended.
When a directory is supplied, the pipeline recursively collects all PDF files under that directory.
items.extend(sorted(Path(d).rglob('*.pdf')))
Use specific PDF paths or narrowly scoped folders, especially when working with private document collections.
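A more conservative collection step might default to the top level of a folder and refuse unexpectedly large batches. This is a sketch of that idea, not the skill's behavior: `collect_pdfs`, its `recursive` flag, and the `max_files` cap are assumptions introduced for illustration.

```python
from pathlib import Path


def collect_pdfs(directory, recursive=False, max_files=20):
    """Collect PDFs under a directory, non-recursively unless asked."""
    root = Path(directory)
    pattern = "**/*.pdf" if recursive else "*.pdf"
    found = sorted(p for p in root.glob(pattern) if p.is_file())
    if len(found) > max_files:
        # Stop before anything is sent so the scope can be reviewed.
        raise ValueError(
            f"{len(found)} PDFs found under {root}; limit is {max_files}"
        )
    return found
```

Making recursion opt-in and capping the batch size means a broad folder produces an error to review instead of a large upload.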
