Skill v1.0.0
ClawScan security
发票内容识别 (Invoice Content Recognition) · ClawHub's context-aware review of the artifact, metadata, and declared behavior.
Scanner verdict
Suspicious · Mar 13, 2026, 6:43 AM
- Verdict
- suspicious
- Confidence
- high
- Model
- gpt-5-mini
- Summary
- The skill mostly does what it says (uses Baidu OCR to extract invoice data), but the package metadata does not declare the credentials it requires (BAIDU_API_KEY and BAIDU_SECRET_KEY, read from a .env file), so the manifest and the runtime instructions are inconsistent. Review before installing or providing keys.
- Guidance
- This skill appears to implement what it claims, but there are a few important things to check before installing or running it:
  - Metadata mismatch: The registry lists no required env vars, but the script and SKILL.md both require BAIDU_API_KEY and BAIDU_SECRET_KEY in a .env file. Do not provide credentials until you confirm where and how they will be stored.
  - Data exposure: The script uploads image data to Baidu OCR endpoints (aip.baidubce.com). Invoices contain sensitive personal and financial data; ensure you have authorization and that sending this data to Baidu complies with your privacy, regulatory, and corporate policies.
  - Secrets handling: The included scripts/.env is a template. Avoid committing real keys to source control. Prefer environment-specific secret storage, rotate keys regularly, and use least-privilege API credentials if supported.
  - Run in an isolated environment: Execute the script in a controlled environment (local VM, container, or isolated workspace) so it cannot accidentally read other files. Review the full invoice_ocr_main.py file yourself (or with a security colleague) to confirm there are no unexpected network endpoints or hidden behaviors in the truncated section.
  - Test with non-sensitive samples first: Validate functionality using synthetic or redacted invoices to confirm behavior and outputs before processing real data.
  - Ask the publisher to fix metadata: Request that the skill's registry metadata explicitly list BAIDU_API_KEY and BAIDU_SECRET_KEY as required env vars and describe data flows (which endpoints it contacts). This improves transparency and trust.
  If you need, I can extract and show the remaining truncated portion of invoice_ocr_main.py for a complete review, or produce a short checklist you can use to safely run this skill.
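The guidance above suggests confirming that the script contacts no unexpected network endpoints. A minimal sketch of such a check follows, assuming the only permitted host is the aip.baidubce.com endpoint named in the review. The function name, the allowlist, and the sample text are hypothetical and are not taken from invoice_ocr_main.py.

```python
import re
from urllib.parse import urlparse

# Assumption: the only host the skill should contact is the Baidu OCR
# host named in the review. Adjust the allowlist to your own policy.
ALLOWED_HOSTS = {"aip.baidubce.com"}

# Matches literal http(s) URLs up to whitespace, quotes, or closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)\]]+")


def find_unexpected_hosts(source_text, allowed_hosts=ALLOWED_HOSTS):
    """Return the sorted hosts of literal URLs that are not on the allowlist."""
    hosts = set()
    for match in URL_PATTERN.finditer(source_text):
        host = urlparse(match.group(0)).hostname
        if host and host not in allowed_hosts:
            hosts.add(host)
    return sorted(hosts)


# Made-up sample text standing in for the script's source code.
sample = '''
OCR_URL = "https://aip.baidubce.com/example/path"
OTHER_URL = "http://example.org/upload"
'''
print(find_unexpected_hosts(sample))
```

A static scan like this only finds URLs written out literally; hosts assembled at runtime from string fragments would not be caught, so it complements a manual read of the file rather than replacing it.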
Review Dimensions
- Purpose & Capability
- note: The skill's name/description (VAT invoice OCR via Baidu) aligns with the code and SKILL.md: the script converts PDF/images to JPEG, calls Baidu VAT OCR and falls back to general OCR, and writes Excel results. That capability legitimately requires Baidu API credentials and image-processing libraries. However, the registry metadata claims 'Required env vars: none' while both SKILL.md and the script require BAIDU_API_KEY and BAIDU_SECRET_KEY, a clear metadata omission.
- Instruction Scope
- ok: SKILL.md instructions are narrowly scoped to the stated task: check for .env with Baidu keys (or ask user), install Python deps, render PDF pages, call Baidu OCR, evaluate results, and write an Excel. The instructions explicitly require user-provided credentials and do not instruct broad system reconnaissance. They do instruct installing pip packages and reading the input file and local .env.
- Install Mechanism
- note: No install spec is provided (instruction-only), but a Python script is included. The runtime requires pip installing commonly used libraries (pymupdf, openpyxl, requests, Pillow, python-dotenv). No external arbitrary downloads or obscure installers are used. The absence of an install spec in registry metadata is an operational omission but not an immediate danger.
- Credentials
- concern: The code and SKILL.md require BAIDU_API_KEY and BAIDU_SECRET_KEY (via a .env file), which is appropriate for calling Baidu OCR. The concern is that the skill registry metadata does not declare these required environment variables, so an install could be attempted without the user realizing credentials are necessary. Also note that providing these credentials permits the script to send invoice images (sensitive PII) to Baidu's servers; this is expected for a cloud OCR integration but has privacy/consent implications.
- Persistence & Privilege
- ok: The skill does not request always:true and does not modify other skills or global agent configuration. It runs as a normal, user-invoked/autonomously-invokable skill without elevated or persistent system privileges.
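The mismatch flagged under Purpose & Capability and Credentials can be checked mechanically. The sketch below assumes the registry metadata exposes its declared env vars as a plain list, which is a hypothetical shape. The two variable names come from the review, and the helper names are illustrative only.

```python
import os

# Variable names taken from the review; everything else here is illustrative.
REQUIRED_ENV_VARS = ("BAIDU_API_KEY", "BAIDU_SECRET_KEY")


def undeclared_env_vars(declared, required=REQUIRED_ENV_VARS):
    """Return required variable names missing from the declared metadata list."""
    declared_set = set(declared)
    return [name for name in required if name not in declared_set]


def missing_env_vars(environ=None, required=REQUIRED_ENV_VARS):
    """Return required variable names that are unset or blank in environ."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


# Registry metadata as described in the review: no env vars declared.
print(undeclared_env_vars([]))

# A run where only one key has been supplied (dummy value, not a real key).
print(missing_env_vars({"BAIDU_API_KEY": "dummy"}))
```

Both helpers return only variable names and never the values, so their output can be logged without exposing the keys themselves.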
