Invoice-Recognition
WarnAudited by ClawScan on May 10, 2026.
Overview
The invoice extractor mostly matches its stated purpose, but it includes real-looking hardcoded Baidu API credentials and sends invoice images/PDF content to Baidu OCR.
Review carefully before installing. Use your own Baidu OCR credentials, do not rely on any embedded keys, and rotate/remove any exposed credentials if you maintain this package. Process only invoice files you are authorized to upload to Baidu, preview batch directories with --list, and install dependencies in an isolated environment.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If these are real credentials, they may belong to another account, may be abused by others, or may cause user invoice processing to be attributed to an unknown Baidu application/account.
The setup guide shows concrete-looking Baidu API credentials rather than placeholders, while the static scan also flags exposed secret literals in examples.md and scripts/batch_process.py.
BAIDU_API_KEY=3yrSX2UuhRpzgdiLBD3D1GDr BAIDU_SECRET_KEY=ZP6MY4DF6RR6GQhD66p5xrifSWXk2TZl
Remove all embedded API keys/secrets, rotate any exposed Baidu credentials, and require users to provide their own credentials through config.txt, environment variables, or a secure secret store.
Invoice images may include company names, tax IDs, bank account details, addresses, and transaction amounts that leave the local machine for Baidu OCR processing.
The extractor base64-encodes invoice images/PDF pages and posts them to Baidu OCR, which is central to the stated purpose but sends sensitive invoice content to an external provider.
INVOICE_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/vat_invoice" ... params = {"image": image_data} ... requests.post(url, data=paramsUse only with invoices you are allowed to send to Baidu, review Baidu’s data handling terms, and prefer the preview/list mode before batch processing large directories.
Future installs could resolve to newer dependency versions than the author tested, which can affect reliability or supply-chain exposure.
The install path uses PyPI packages with lower-bound version constraints rather than exact pinned versions or hashes.
requests>=2.28.0 pandas>=2.0.0 openpyxl>=3.1.0 PyMuPDF>=1.23.0 Pillow>=10.0.0
Install in a virtual environment and prefer a reviewed lockfile or pinned dependency versions for production use.
A broad directory selection could upload and export more invoice files than intended.
The skill can recursively process many local invoice files in one run, which is expected for batch invoice extraction but increases the impact of choosing the wrong directory.
Process all invoice files in a directory (recursive) ... Batch processing: Process hundreds of invoices in one command
Run the documented --list preview mode first and use narrow input directories for sensitive invoice batches.
