EIA Knowledge Base Extraction (环评知识库提炼)

v2.3.0

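EIA report knowledge base extraction tool: extracts structured knowledge base files from environmental impact assessment (EIA) report forms, with support for PDF/DOCX parsing.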


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for iasgu/eia-knowledge-extractor.

Install the skill "环评知识库提炼" (iasgu/eia-knowledge-extractor) from ClawHub.
Skill page: https://clawhub.ai/iasgu/eia-knowledge-extractor
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI


openclaw skills install eia-knowledge-extractor

ClawHub CLI


npx clawhub@latest install eia-knowledge-extractor
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (environment impact report → structured CSV knowledge bases) matches the included Python scripts and declared dependencies (pymupdf, pandas, python-docx). The skill does not request unrelated credentials or system config paths. The presence of extract_* and main.py scripts is coherent with the stated purpose.
Instruction Scope
SKILL.md instructs running scripts/main.py on PDF/DOCX and installing the listed Python libs — that matches the code. However, multiple extraction functions are currently placeholders (they return empty lists or 'pass'), so in practice some outputs may be empty or require further implementation and manual review. Also SKILL.md lists '5' CSV knowledge files but the code supports writing an extra emission-standards CSV (i.e., 6 outputs) — a minor documentation mismatch to be aware of.
Install Mechanism
No install specification (instruction-only for environment setup) and the requirements.txt lists standard PyPI packages. No downloads from arbitrary URLs, no archive extraction, and no package managers invoked automatically. Risk is low; user should install dependencies in a virtualenv to limit scope.
Credentials
The skill requires no environment variables, no credentials, and reads only user-supplied report files and writes CSV/text output locally. There are no requests for unrelated secrets or system-level configs.
Persistence & Privilege
Registry flags (always: false, normal model invocation) are standard. The skill writes output files to disk (CSV, reports) under the specified output directory — expected for this tool. It does not modify other skills or system-wide agent settings.
Assessment
This skill appears to do what it says: parse EIA report files and produce CSV knowledge bases. Before installing or running it: 1) run it in an isolated environment (virtualenv/container) and inspect outputs; 2) test with non-sensitive sample PDFs to confirm the extractor and table parsing meet your needs — many extract_* functions are currently stubs and may produce empty CSVs; 3) note the small doc mismatch (SKILL.md says 5 CSVs, code writes an additional emission-standards CSV) and verify the exact files produced; 4) review produced CSVs for correctness before using them in downstream systems; and 5) install dependencies from PyPI only (pip install -r requirements.txt) and avoid running untrusted binaries.
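
Because several extract_* functions are reported to be stubs, a quick way to spot empty outputs after a test run is to count the data rows in each generated CSV. The following is a minimal sketch, assuming the tool writes plain CSV files with one header row into the output directory; the function names and the `utf-8-sig` encoding are illustrative assumptions, not part of the skill.

```python
import csv
from pathlib import Path


def count_data_rows(output_dir):
    """Return {csv_file_name: number_of_rows_after_the_header}."""
    counts = {}
    for csv_path in sorted(Path(output_dir).glob("*.csv")):
        # utf-8-sig tolerates an optional BOM; the real encoding is an assumption.
        with csv_path.open(newline="", encoding="utf-8-sig") as handle:
            rows = list(csv.reader(handle))
        # A file with only a header (or nothing at all) has zero data rows.
        counts[csv_path.name] = max(len(rows) - 1, 0)
    return counts


def empty_outputs(output_dir):
    """List the CSV files that contain no data rows."""
    return [name for name, n in count_data_rows(output_dir).items() if n == 0]
```

Calling `empty_outputs("output_dir")` after a run on a sample report lists the files that likely came from unimplemented extractors and need manual review.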

Like a lobster shell, security has layers — review code before you run it.

Latest version: v2.3.0
Downloads: 174
Stars: 0
Versions: 1
Updated 1mo ago
License: MIT-0

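EIA Knowledge Base Extraction

Automatically extracts environmental data from environmental impact assessment (EIA) report forms and generates structured knowledge base files.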

Input

An EIA report form file (PDF, DOCX, DOC, and TXT formats are supported)

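Output

Generates 5 knowledge base CSV files:

1. Pollution factor knowledge base (19 fields)

Pollutant ID (污染物ID), industry (行业), region (区域), pollution-generating process (产污工段), pollution-generating facility (产污设施), raw and auxiliary materials (原辅材料), pollutant conditions 1-3 (污染物条件1-3), pollutant name (污染物名称), pollution factor name (污染因子名称), pollutant category (污染物种类), discharge location (排放位置), standard conditions 1-3 (标准条件1-3), applicable standard (适用标准), standard limit: concentration/rate/height/other (标准限值-浓度/速率/高度/其他), remarks (备注), source (出处)

2. Waste gas source strength accounting knowledge base (15 fields)

Pollutant ID (污染物ID), pollutant category (污染物种类), pollution factor category (污染因子种类), industry (行业), region (区域), accounted pollution factor (核算污染因子), generation accounting method: type/basis/method (产生量核算方法类型/依据/方法), accounting formula (核算公式), required parameters (所需参数), pollution generation coefficient (产污系数), analogous project: scale information/pollutant quantity (类比项目规模信息/污染物量), source (出处)

3. Wastewater source strength accounting knowledge base (15 fields)

Same fields as the waste gas source strength accounting knowledge base.

4. Solid waste source strength accounting knowledge base (16 fields)

Pollutant ID (污染物ID), pollutant category (污染物种类), pollution factor category (污染因子种类), solid waste type (固废类型), hazardous waste code (危废代码), industry (行业), region (区域), accounted pollution factor (核算污染因子), generation accounting method: type/basis/method (产生量核算方法类型/依据/方法), accounting formula (核算公式), required parameters (所需参数), pollution generation coefficient (产污系数), analogous project: scale information/pollutant quantity (类比项目规模信息/污染物量), source (出处)

5. Noise source strength accounting knowledge base (13 fields)

Pollutant ID (污染物ID), pollutant category (污染物种类), pollution factor category (污染因子种类), industry (行业), region (区域), noise source (噪声源), specification/model (规格型号), sound source type (声源类型), unit/method of measurement (计量单位/方式), source strength value (声源源强值), noise reduction measures (降噪措施), source strength value after noise reduction (降噪后源强值), source (出处)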

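Data specification

  • Pollutant ID format: industry_material_process_other condition_pollutant (行业_原辅料_产污工段_其他条件_污染物)
  • Example: 通用设备制造业_铸件_抛丸_/_抛丸粉尘 (general equipment manufacturing, castings, shot blasting, no other condition, shot-blasting dust)
  • Pollutant categories: waste gas (废气), wastewater (废水), solid waste (固废), noise (噪声)
  • Empty values: represented by /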
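
The ID convention above can be sketched in a few lines of Python. This is an illustrative helper, not code taken from the skill: `build_pollutant_id`, `is_valid_category`, and the constant names are hypothetical, and the only rules encoded are the ones listed above (five parts joined by underscores, `/` for an empty part, four allowed pollutant categories).

```python
EMPTY = "/"  # the data specification uses "/" for empty values

# The four pollutant categories: waste gas, wastewater, solid waste, noise.
VALID_CATEGORIES = {"废气", "废水", "固废", "噪声"}


def build_pollutant_id(industry, material, process, other_condition, pollutant):
    """Join the five ID parts with "_", substituting "/" for missing parts."""
    parts = [industry, material, process, other_condition, pollutant]
    cleaned = [(part or "").strip() or EMPTY for part in parts]
    return "_".join(cleaned)


def is_valid_category(category):
    """Check a pollutant category against the four allowed values."""
    return category in VALID_CATEGORIES
```

For instance, `build_pollutant_id("通用设备制造业", "铸件", "抛丸", None, "抛丸粉尘")` reproduces the example ID from the list, with `/` filling the missing "other condition" slot.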

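Quality check

After generation, check: file completeness (5 CSV files), naming rules, field completeness (19/15/15/16/13), and consistency with the original report (no abbreviation, truncation, or omission).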
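
The file and field completeness checks can be partly automated. The sketch below only compares header column counts against the 19/15/15/16/13 figures; the file names in `EXPECTED_FIELDS` are hypothetical placeholders (the page does not state the real output names), and how grouped fields such as "conditions 1-3" expand into columns is an assumption to verify against actual output. Consistency with the original report still needs manual review.

```python
import csv
from pathlib import Path

# Expected column counts from the quality check list. The file names are
# hypothetical placeholders; replace them with the names the tool produces.
EXPECTED_FIELDS = {
    "pollution_factors.csv": 19,
    "waste_gas_source_strength.csv": 15,
    "wastewater_source_strength.csv": 15,
    "solid_waste_source_strength.csv": 16,
    "noise_source_strength.csv": 13,
}


def check_outputs(output_dir, expected=EXPECTED_FIELDS):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name, n_fields in expected.items():
        path = Path(output_dir) / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        # Read only the header row; utf-8-sig tolerates an optional BOM.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), [])
        if len(header) != n_fields:
            problems.append(f"{name}: {len(header)} fields, expected {n_fields}")
    return problems
```

Running `check_outputs("output_dir")` after extraction returns one message per missing file or mismatched header.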

Usage

python scripts/main.py report.pdf -o output_dir

Dependencies

pip install pymupdf pandas python-docx
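
Before running the extractor, it can help to confirm that the three packages are importable in the active environment. The snippet below is a small sketch; `missing_packages` is a hypothetical helper, and the mapping reflects the usual import names (PyMuPDF has traditionally been imported as `fitz`, python-docx as `docx`).

```python
import importlib.util

# pip package name -> module name used at import time
REQUIRED = {"pymupdf": "fitz", "pandas": "pandas", "python-docx": "docx"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]
```

An empty list from `missing_packages()` means all three dependencies resolve; otherwise the returned names can be passed to `pip install`, ideally inside a virtualenv.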
