Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Fapiao Clipper

v1.5.2

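Fapiao Clipper (发票夹子) v1.4: an invoice auto-recognition and reimbursement management tool driven by local large models. Two-level fallback chain: PyMuPDF text extraction (with the cross-line matching fix) → Qwen3-VL vision model. New: cross-line matching fix for seller/buyer fields, date normalization. Features: 8 risk-control verification checks + one-click Excel export + merged PDF.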

Security Scan
VirusTotal: Suspicious
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (local invoice OCR, verification, export) matches the code and files: PDF/OFD handling, PyMuPDF extractor, optional local Ollama Qwen3-VL, SQLite DB, email downloader, blacklist sync and tax-check interactions. Required binary is only python3 and no unrelated cloud credentials are demanded in the skill metadata. The components present are appropriate for the stated purpose.
Instruction Scope
Runtime instructions are limited to cloning the repo, pip installing requirements, configuring config.yaml, and running CLI/web UI commands. The code will read/write files under the user-specified storage path (default ~/Documents/发票夹子) and lets the agent read the SQLite DB. The email watcher will log in to the user's IMAP account, download attachments, extract links from email HTML and follow those links (including re-requesting forms) to retrieve PDFs — behavior needed for 'auto fetch invoices' but it means the skill fetches external URLs and writes downloaded payloads locally. This is within scope but worth noting as an I/O/network surface that can pull arbitrary remote content if present in mail.
Install Mechanism
No automated install spec is embedded in the skill metadata (instruction-only), but SKILL.md / README instruct cloning from the GitHub homepage and pip installing requirements.txt. That is a normal install path. The repo includes executable Python code (not just prose), so installing and running will execute that code locally. No suspicious remote binary downloads or URL-shortener installs are used in the provided install instructions; Docker compose references local services (Ollama) and optional env vars.
Credentials
The skill declares no required env vars in metadata, which aligns with shipping a config file-based tool. Operationally, the tool requires IMAP credentials (username/password) in config to enable mailbox scanning, and may require local Ollama or optional third-party API keys if you choose those providers (config example shows siliconflow.api_key, docker-compose shows DASHSCOPE_API_KEY and OLLAMA_BASE_URL). These credential needs match the features (email scanning, local vision model, optional cloud provider) and are not excessive, but the user must supply them in plaintext config.yaml — treat those credentials as sensitive and protect config file permissions.
Persistence & Privilege
Skill is not force-installed (always: false) and does not request to modify other skills or system-wide agent settings. It stores data locally (SQLite DB, inbox directory, exports) in the user-specified storage path. Allowing the agent to read the DB is intentional for answering invoice queries; autonomous invocation is allowed by platform default but is not combined with additional privileged flags here.
Assessment
This repository appears to implement exactly what it claims: a local invoice OCR and reimbursement helper. Before installing or running it, consider the following:
- Credentials/config: The email watcher expects IMAP username/password in config/config.yaml. These will be stored in plaintext in that file unless you take other measures. Limit file permissions (chmod 600) and keep config out of backups if you don't want credentials stored elsewhere.
- Network I/O: The email component will download attachments and follow links found in email HTML (including re-posting form actions) to retrieve PDFs. This is necessary for auto-download but increases the risk of fetching malicious content embedded in emails. If you enable mail scanning, run it on a trusted machine or in an isolated environment.
- Local services: OCR fallback uses a local Ollama model (Qwen3-VL) or optional cloud providers. If you use a cloud provider (siliconflow etc.), you will need to supply an API key. Review those settings in config.yaml and requirements.txt before enabling.
- Privacy claims: The project advertises 'zero upload'; the code shows downloads from tax.gov for blacklist/verification and clicking the tax bureau check link, so verification likely involves querying public tax-check endpoints. Review verifier.py (not shown fully in the bundle) to confirm it only queries public verification endpoints and does not post invoice contents to third-party services.
- Exposed interfaces: README documents options to expose the Web UI (Tailscale/frp or running in Docker). If you enable remote access, ensure you secure access (VPN/Tailscale, firewall rules) because the Web UI can read the local invoice DB and exports.
- Dependency audit: Inspect requirements.txt and vet dependencies before pip install. Consider installing into a dedicated virtualenv or container.
- Least privilege: If you only need local PDF/image processing (no mail auto-fetch), leave email.enabled=false and run manual scans to reduce network exposure.

If you want deeper analysis, provide the full verifier.py and the complete requirements.txt so I can check whether any dependency or verification code sends invoice data to third-party endpoints beyond the stated tax-check/blacklist lookups.
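The chmod 600 advice for the plaintext config can also be applied from Python. The following is a minimal sketch, assuming a POSIX system; the helper name `restrict_config_permissions` is hypothetical and not part of the project.

```python
import os
import stat
from pathlib import Path


def restrict_config_permissions(path):
    """Set owner-only read/write (0o600) on a file and return the resulting mode bits.

    Hypothetical helper for illustration; on Windows, os.chmod only toggles
    the read-only flag, so this is meaningful on POSIX systems only.
    """
    p = Path(path)
    os.chmod(p, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(p.stat().st_mode)
```

For example, `restrict_config_permissions("config/config.yaml")` could be run once after the credentials are written to the file.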

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🧾 Clawdis
Bins: python3
latest: vk977qvfqbqzet961r2f9q51b8n84c1th
118 downloads
0 stars
6 versions
Updated 1w ago
v1.5.2
MIT-0

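Fapiao Clipper (发票夹子, Invoice Clipper) v1.3

A pure Python CLI tool, usable from OpenClaw, Claude Code, KimiClaw, or any other Agent platform.

v1.3 major update

Architecture simplified to 2 levels (2026-04-03):

  • Level 1: PyMuPDF text extraction (with the cross-line matching fix)
  • Level 2: Qwen3-VL vision model (fallback)
  • Removed GLM-OCR (unstable) and TurboQuant (never enabled)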

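Design philosophy

Invoice → drop it into the folder
      ↓
Extract text from the PDF (two engines available)
      ↓ falls through to level 2 only if the text cannot be read
Vision model (triggered only for scanned documents)
      ↓
Store in the SQLite database
      ↓
Agent reads the database directly to answer questions ← consumes no API tokens at all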

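Two-level recognition chain (v1.3)

Level | Engine | Trigger | Notes
Level 1 | PyMuPDF | Searchable PDF (default) | Millisecond-level, no Java required
Level 2 | Ollama Qwen3-VL | Images / scanned documents | ~6.1 GB memory

Most invoices are handled at level 1, at zero cost.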
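The fallback order above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the function names and the `min_chars` threshold are hypothetical, and the vision step is passed in as a callable rather than wired to Ollama. `pymupdf_text` uses PyMuPDF's `fitz.open` and `page.get_text()`.

```python
from typing import Callable, Tuple


def pymupdf_text(pdf_path: str) -> str:
    """Level 1: pull the embedded text layer out of a searchable PDF."""
    import fitz  # PyMuPDF; imported lazily so the sketch loads without it

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_with_fallback(
    pdf_path: str,
    text_extractor: Callable[[str], str],
    vision_extractor: Callable[[str], str],
    min_chars: int = 20,  # assumed threshold for "the PDF had usable text"
) -> Tuple[str, str]:
    """Return (level, text): try text extraction first, fall back to vision."""
    try:
        text = text_extractor(pdf_path)
    except Exception:
        # An unreadable or corrupt file is treated like a scan with no text.
        text = ""
    if len(text.strip()) >= min_chars:
        return "level1", text
    # Level 2 is only reached when level 1 yields too little text.
    return "level2", vision_extractor(pdf_path)
```

Passing the extractors in as callables keeps the expensive vision model out of the common path and makes the chain easy to exercise with stubs.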

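Database (read directly by the Agent)

Processed invoices are stored in ~/Documents/发票夹子/invoices.db (SQLite).

The Agent can answer natural-language questions by reading the database directly, for example:

  • "Which invoices did I receive this month?"
  • "Are there any invoices older than 365 days?"
  • "Are there any invoices from company XX?"

No extra large-model API calls are needed; the Agent reads the database within its own context.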
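As an illustration of the "older than 365 days" question, here is a minimal sketch using Python's built-in `sqlite3` module. The schema is an assumption: the table name `invoices` and the columns `seller` and `invoice_date` (ISO `YYYY-MM-DD` text) are hypothetical, so check the real `invoices.db` layout before adapting it.

```python
import sqlite3
from datetime import date, timedelta


def find_expired(conn, today, max_age_days=365):
    """Return invoices dated more than max_age_days before `today`.

    Assumes a hypothetical `invoices` table whose `invoice_date` column holds
    ISO dates, so plain string comparison orders them correctly.
    """
    cutoff = (today - timedelta(days=max_age_days)).isoformat()
    return conn.execute(
        "SELECT id, seller, invoice_date FROM invoices "
        "WHERE invoice_date < ? ORDER BY invoice_date",
        (cutoff,),
    ).fetchall()


def demo():
    """Build a throwaway in-memory database and run the query against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices (id INTEGER PRIMARY KEY, seller TEXT, invoice_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO invoices (seller, invoice_date) VALUES (?, ?)",
        [("Acme Co", "2025-01-15"), ("Beta Ltd", "2026-03-10")],
    )
    return find_expired(conn, today=date(2026, 4, 3))
```

Against the real file, the connection would instead be opened on the expanded path of `~/Documents/发票夹子/invoices.db`, ideally read-only.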

Command quick reference

User intent | Command
Scan invoices | python3 {baseDir}/main.py scan
List invoices | python3 {baseDir}/main.py list
Query by date | python3 {baseDir}/main.py query --from 2026-03-01 --to 2026-03-31
Mark as not for reimbursement | python3 {baseDir}/main.py exclude <ID>
Restore for reimbursement | python3 {baseDir}/main.py include <ID>
Export for reimbursement | python3 {baseDir}/main.py export --from 2026-03-01 --to 2026-03-31 --format both
Batch verification | python3 {baseDir}/main.py verify
View problem invoices | python3 {baseDir}/main.py problems
Sync blacklist | python3 {baseDir}/main.py blacklist-sync

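Intent recognition rules

User says | Command executed
"scan invoices" / "organize mailbox" | scan
"this month's invoices" / "list all" | list
"invoices from merchant XX" | query --seller XX
"export for reimbursement" | export --from ... --to ... --format both
"don't reimburse #3" | exclude 3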

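The intent rules above amount to a keyword-to-subcommand mapping. In practice the Agent does this matching itself; the sketch below only illustrates the idea. The keyword lists (in English and Chinese) and the function name `intent_to_args` are hypothetical, the `query --seller` rule is not covered, and the export date range is left for the caller to supply.

```python
import re

# (keywords, subcommand args) pairs; the keywords are illustrative only.
RULES = [
    (("scan", "mailbox", "扫描", "邮箱"), ["scan"]),
    # The --from/--to date range must be appended by the caller.
    (("export", "导出"), ["export", "--format", "both"]),
    (("this month", "list all", "本月", "列出"), ["list"]),
]

EXCLUDE_KEYWORDS = ("don't reimburse", "exclude", "不要报销")


def intent_to_args(utterance):
    """Map a user utterance to main.py arguments, or None if nothing matches."""
    text = utterance.lower()
    # "don't reimburse #3" style requests carry the invoice ID after '#'.
    match = re.search(r"#(\d+)", text)
    if match and any(k in text for k in EXCLUDE_KEYWORDS):
        return ["exclude", match.group(1)]
    for keywords, args in RULES:
        if any(k in text for k in keywords):
            return list(args)  # copy so callers can extend it safely
    return None
```

The returned list can be appended to `["python3", "{baseDir}/main.py"]` once `{baseDir}` has been resolved.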
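Using it on Agent platforms

Zero configuration (recommended for first-time use)

Don't want to edit YAML? Run the interactive wizard and answer a few questions: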

python3 {baseDir}/setup_config.py

Installation

git clone https://github.com/Alan5168/fapiao-clipper.git
cd fapiao-clipper
pip install -r requirements.txt
cp config/config.yaml.template config/config.yaml

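Notes

  • Original files are never deleted; exclude only marks them
  • Invoice validity defaults to 365 days (configurable)
  • With OpenClaw/Claude Code: once level 1 has processed the invoices, the Agent reads the database directly and consumes no API tokens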
