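Fund Monthly Report Information Extraction (基金月报信息提取)
v2.3.0
Extracts information from fund monthly reports. Supports single-file upload and batch processing of a folder. Automatically learns the Excel template, extracts data from the PDF monthly reports, and generates two Excel files (a PDF-information Excel and a user-template Excel).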
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description match the included scripts and reference docs: the Python scripts and references implement PDF text extraction, optional OCR, template learning, and Excel generation. Required tools (pdfplumber, openpyxl, pdf2image, pytesseract) are appropriate for the described functionality.
Instruction Scope
SKILL.md and the references strictly describe reading user-provided PDFs/Excel and writing output Excel files. The code only touches files in user-specified paths or temporary directories; there are no instructions to read system credential files, user profiles, or other unrelated data. The batch-processing behavior (scan a user-specified folder) is documented and consistent with the skill purpose.
Install Mechanism
This is instruction-only (no automated installer). The docs require several Python packages and system binaries (Tesseract, Poppler). Those are reasonable for OCR/PDF processing, but they are system-level dependencies and must be installed manually (not provided by the skill). Users should install them in a virtual environment and ensure the platform supports running tesseract/pdftoppm.
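Because nothing is installed automatically, a quick pre-flight check can confirm the environment before the first run. The sketch below is not part of the skill: it assumes each package's import name equals its PyPI name, and it assumes Poppler is reachable through its `pdftoppm` binary on `PATH`.

```python
import importlib.util
import shutil

# Package and binary names come from the scan report; the mapping from
# PyPI name to import name is an assumption of this sketch.
PYTHON_MODULES = {
    "pdfplumber": "pdfplumber",
    "openpyxl": "openpyxl",
    "pdf2image": "pdf2image",
    "pytesseract": "pytesseract",
}
SYSTEM_BINARIES = ["tesseract", "pdftoppm"]


def missing_dependencies(modules=PYTHON_MODULES, binaries=SYSTEM_BINARIES):
    """Return (missing_python_packages, missing_binaries) without importing anything."""
    missing_modules = [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]
    missing_binaries = [name for name in binaries if shutil.which(name) is None]
    return missing_modules, missing_binaries


if __name__ == "__main__":
    packages, tools = missing_dependencies()
    print("Missing Python packages:", packages or "none")
    print("Missing system binaries:", tools or "none")
```

Running this inside the virtual environment you intend to use shows which pieces still need manual installation.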
Credentials
The skill requests no environment variables, no credentials, and no config paths. All file I/O is limited to user-supplied folders and temporary dirs. There are no unrelated secrets or external service tokens requested.
Persistence & Privilege
The skill does not request permanent presence or elevated privileges (always=false). It writes output files to user-specified locations (or a documented default remote output path), and learning state is described as ephemeral/in-memory. No system-wide configuration changes are performed.
Assessment
This skill appears coherent and implements local PDF→Excel extraction. Before using it: (1) install Python deps in a virtualenv and install system packages (tesseract, poppler) as documented; (2) do not point the skill at system or sensitive folders; only provide folders containing the monthly reports you want processed; (3) run it in an isolated/test environment first to confirm OCR accuracy and template mapping (OCR can misread chart text); (4) verify generated Excel files for correctness; (5) note the skill may call local binaries (tesseract, pdftoppm) when OCR is used, but it makes no network calls and does not request credentials. Finally, the SKILL.md reference version differs slightly from registry metadata; this is benign, but you may want to confirm you have the intended version.
Like a lobster shell, security has layers: review code before you run it.
latest
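Fund Monthly Report Information Extraction
In one sentence: upload an Excel template and PDF monthly reports, and the AI extracts the data and generates a new Excel file in the template's format.
Input
| Type | Description | Required |
|---|---|---|
| Excel template | User-defined Excel format | Optional |
| PDF files | Fund monthly report PDFs (text, chart, and scanned versions supported) | Required |
Two ways to upload:
- Upload files one by one: after sending all files, say "好了" ("done") or "开始处理" ("start processing")
- Batch-process a folder: provide the folder path and the AI scans and processes it automatically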
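The folder scan in batch mode can be pictured with a few lines of `pathlib`. This is a sketch under assumptions, not the skill's actual code: the function name `find_pdfs` is hypothetical, and searching subfolders recursively is a choice made here that the documentation does not confirm.

```python
from pathlib import Path


def find_pdfs(folder):
    """Return all PDF files under `folder`, sorted for a stable processing order."""
    root = Path(folder).expanduser().resolve()
    if not root.is_dir():
        raise NotADirectoryError(f"Not a folder: {root}")
    # Match the extension case-insensitively so "REPORT.PDF" is picked up too.
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

Rejecting anything that is not a directory up front keeps the scan limited to the folder the user actually named, in line with the scan report's advice to point the skill only at report folders.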
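Output
The AI automatically generates two Excel files:
| File | Description |
|---|---|
| PDF-information Excel | AI-defined format; each fund gets its own sheet, and reports for the same fund from different dates go in the same sheet |
| User-template Excel | Filled in following the user's template, preserving the original styles and formulas |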
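The sheet layout of the PDF-information Excel (one sheet per fund, one row per report date) amounts to a group-and-sort step. The sketch below shows only that step on made-up sample records; the record keys `fund`, `date`, and `duration` and the function name are hypothetical.

```python
from collections import defaultdict


def group_by_fund(records):
    """Map fund name -> that fund's records sorted by report date (one list per sheet)."""
    sheets = defaultdict(list)
    for record in records:
        sheets[record["fund"]].append(record)
    for rows in sheets.values():
        # ISO-formatted date strings sort chronologically as plain text.
        rows.sort(key=lambda row: row["date"])
    return dict(sheets)


# Made-up sample values, for illustration only.
records = [
    {"fund": "Fund A", "date": "2024-02-29", "duration": 3.1},
    {"fund": "Fund B", "date": "2024-01-31", "duration": 4.7},
    {"fund": "Fund A", "date": "2024-01-31", "duration": 3.0},
]
sheets = group_by_fund(records)
for fund, rows in sheets.items():
    print(fund, [row["date"] for row in rows])
```

Each key could then become a worksheet, for example with openpyxl's `Workbook.create_sheet(title=...)` and `Worksheet.append(...)`. Excel limits sheet titles to 31 characters, so long fund names would need shortening.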
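Features
✅ Smart extraction: combined text and OCR extraction, so that chart data is not missed
✅ Template learning: automatically learns the user's Excel template and fills in data precisely
✅ Batch processing: supports batch processing of a folder, handling multiple funds in one run
✅ Automatic classification: automatically identifies fund names and dates and classifies reports accordingly
✅ Format preservation: keeps the template's original data types, styles, and formulas
✅ Fuzzy matching: matches field names intelligently, adapting to different wordings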
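Fuzzy field-name matching can be approximated with the standard library's `difflib`. This is a minimal sketch, not the skill's matching logic: the alias table, the `match_field` name, and the 0.6 similarity cutoff are all assumptions chosen for illustration.

```python
import difflib

# Hypothetical alias table: canonical field -> labels a template might use.
ALIASES = {
    "duration": ["duration", "modified duration", "久期"],
    "ytm": ["ytm", "yield to maturity", "到期收益率"],
    "fund_size": ["fund size", "aum", "基金规模"],
}


def match_field(header, aliases=ALIASES, cutoff=0.6):
    """Return the canonical field whose alias is closest to `header`, or None."""
    lookup = {
        alias.lower(): field
        for field, names in aliases.items()
        for alias in names
    }
    cleaned = header.strip().lower()
    hits = difflib.get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


print(match_field("Yield to Maturity (%)"))
print(match_field("Modified Duration"))
print(match_field("zzz"))
```

A higher cutoff reduces wrong matches at the cost of more unmatched headers, so unmatched columns are worth reporting to the user rather than filling silently.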
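Extracted data:
- Core metrics: duration, YTM, fund size
- Distribution data: sector allocation, regional allocation, credit rating distribution
- Other: top ten holdings, dividend history, fund performance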
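For text-based PDFs, core metrics such as duration and YTM can often be pulled from the extracted text with label-anchored regular expressions. The patterns and the sample text below are hypothetical; real reports word and format these fields differently, and scanned pages would need OCR first.

```python
import re

# Hypothetical label patterns: a label, up to 10 non-digit characters, then a number.
PATTERNS = {
    "duration": re.compile(
        r"(?:Duration|久期)\D{0,10}?(\d+(?:\.\d+)?)", re.IGNORECASE
    ),
    "ytm": re.compile(
        r"(?:YTM|Yield to Maturity)\D{0,10}?(\d+(?:\.\d+)?)\s*%", re.IGNORECASE
    ),
}


def extract_metrics(text):
    """Return {metric_name: float} for every pattern that matches `text`."""
    metrics = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            metrics[name] = float(match.group(1))
    return metrics


# Made-up sample text, for illustration only.
sample = "Duration: 3.2 years\nYTM: 5.85%\n"
print(extract_metrics(sample))
```

Metrics that do not match are simply absent from the result, which lets a caller distinguish "not found" from a genuine zero.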
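Quick Start
Option 1: Upload files
User: [sends template.xlsx] + [sends PDF files]
User: 好了 (done)
AI: [generates two Excel files]
Option 2: Batch processing
User: Please process this folder: /path/to/folder/
AI: [scan → extract → generate Excel]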
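Details
- references/: technical documentation (template learning, field mapping, OCR rules, batch processing, interaction rules)
- SECURITY_REVIEW.md: security assessment report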
