HUDC Bidding Information Capture

v6.0.0

Intelligently analyzes SGCC bidding documents in Word/PDF/Excel formats, extracting 23 key fields with automated qualification backfilling and deadline highl...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for jktllsqaq/hudc-bidding-information-capture.

Install the skill "HUDC Bidding Information Capture" (jktllsqaq/hudc-bidding-information-capture) from ClawHub.
Skill page: https://clawhub.ai/jktllsqaq/hudc-bidding-information-capture
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install hudc-bidding-information-capture

ClawHub CLI


npx clawhub@latest install hudc-bidding-information-capture

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
Name/description promise (extract 23 fields from Word/PDF/Excel SGCC bidding docs) aligns with the provided script and config: the script reads .docx/.pdf/.xlsx, extracts tables and paragraph text, applies regex and keyword engines, and writes a structured Excel report.
Instruction Scope
SKILL.md directs the agent to run the shipped analyze.py and operate on local ~/Desktop/sgcc_files/ project folders; the script's behavior (parsing docs, extracting fields, prompting for manual gap-filling when needed, and returning results) stays within the stated purpose. Note: the skill will read file contents in the specified directories (including contact names/phones and any PII in documents).
Install Mechanism
There is no registry install spec, but analyze.py will attempt to auto-install python-docx, openpyxl and pdfplumber via os.system('pip3 install ...'). This is expected for the functionality but does modify the Python environment (uses --break-system-packages and runs pip silently). It's a moderate-risk behavior (network download of third‑party packages) but the packages are standard and coherent with purpose.
Credentials
The skill declares no environment variables, no credentials, and the code does not read env vars or request unrelated secrets. It only reads local files and writes an output Excel on the user's Desktop which is proportional to its task.
Persistence & Privilege
Skill is not marked always:true and does not request persistent or elevated system privileges. It writes its results to ~/Desktop/sgcc_result.xlsx and uses local config/keywords.json only; it does not modify other skills or system-wide agent settings.
Assessment
This skill appears to do what it says: parse local bidding documents and produce a structured Excel. Before installing/running, consider: (1) review scripts/analyze.py yourself (it will run pip to install python-docx, openpyxl, pdfplumber), (2) run it in a controlled environment or virtualenv to avoid altering your global Python environment, (3) be aware it will read files under the configured input directory (may contain PII or confidential vendor/customer data), (4) confirm you are comfortable with automatic pip installs (the packages are common but pip will download from PyPI), and (5) do not run it as root. If you want extra assurance, inspect the remaining truncated portions of analyze.py (the attachment-loading and Excel-writing logic) to confirm there are no unexpected network calls or shell executions before first run.
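
One way to see in advance whether the automatic pip install would be triggered is to check which of the three packages are already importable in the current environment. The sketch below is not part of the skill; the helper name `missing_packages` is hypothetical, and it only relies on the standard-library `importlib.util.find_spec`. Note that python-docx is imported under the module name `docx`.

```python
import importlib.util

# PyPI package name -> import name. python-docx is imported as "docx".
REQUIRED = {
    "python-docx": "docx",
    "openpyxl": "openpyxl",
    "pdfplumber": "pdfplumber",
}


def missing_packages(required=REQUIRED):
    """Return the PyPI names whose import module cannot be found (hypothetical helper)."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All three dependencies are already installed.")
```

Running this inside a fresh virtualenv before the first run shows exactly what the script would try to install there.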

Like a lobster shell, security has layers — review code before you run it.

Tags: Bidding and tendering, data analysis, efficiency, information retrieval, latest, tools
137 downloads
1 star
1 version
Updated 2w ago
v6.0.0
MIT-0

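hbdc · SGCC Bidding Document Analysis Skill (v6)

Directory Conventions

~/Desktop/sgcc_files/
├── 项目A/                          ← project folder ("Project A")
│   ├── xxx-招标公告.docx          ← main announcement (招标公告 = tender announcement; docx or pdf, detected automatically)
│   ├── 公告附件_资质要求.xlsx      ← optional qualification attachment, used to backfill qualification placeholders
│   └── 重要提醒.docx               ← "important reminder" file, ignored automatically
├── 项目B/                          ← project folder ("Project B")
│   └── xxx-采购公告.pdf           ← 采购公告 = procurement announcement
├── 散文件_招标公告.docx            ← loose file in the root directory, treated as its own project
└── ...

Output report: ~/Desktop/sgcc_result.xlsx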


How to Run

Run the fixed script directly. Do not write your own code as a substitute:

python3 ~/.openclaw/workspace/skills/hbdc/scripts/analyze.py

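v6 Core Strategy

1. Word/PDF primary extraction → Excel gap-filling → paragraph fallback

| Priority | Source | Description |
| --- | --- | --- |
| 1 | Word/PDF tables | Package-level detail tables in the main announcement |
| 2 | xlsx attachments | Supplementary data from requirement sheets and qualification attachments |
| 3 | Word/PDF body text | Paragraph scan when no table is found (flagged for manual gap-filling) |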
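
The priority order can be pictured as "take the first non-empty value and remember where it came from". The following is a minimal sketch under that assumption, not the logic in analyze.py; the function name `pick_value`, the dictionary keys, and the English source labels are all hypothetical.

```python
BODY_TEXT = "body text"  # hypothetical label for the paragraph-scan fallback


def pick_value(candidates):
    """Pick the first non-empty value from (source_label, value) pairs.

    `candidates` must already be in priority order:
    Word/PDF table, then xlsx attachment, then body text.
    """
    for label, value in candidates:
        if value and value.strip():
            return {
                "value": value.strip(),
                "source": label,
                # Values recovered from body text are flagged for manual review.
                "needs_manual_review": label == BODY_TEXT,
            }
    # Nothing found anywhere: leave the field empty and flag it.
    return {"value": "", "source": "", "needs_manual_review": True}


result = pick_value([
    ("Word table", ""),
    ("Excel attachment", " 武汉市 "),
    ("body text", "湖北省"),
])
print(result["value"], "from", result["source"])
```

Keeping the source label next to the value is what makes a "match source" column possible later on.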

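2. PDF-specific title extraction

pdfplumber's character-level font sizes are used to locate the largest text on the cover page, which is taken as the project name. This is more accurate than scanning paragraphs with regular expressions. If reading the PDF fails, the script retries once with different parameters; if it still fails, it prints an explicit notice in the terminal.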
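
The font-size idea can be sketched on plain character records. pdfplumber exposes each page's characters as `page.chars`, a list of dicts that include `text`, `size`, `top`, and `x0`; the helper below works on such a list so it can be tried without a PDF. The name `title_from_chars` and both tolerance values are assumptions for illustration, and the actual script may differ.

```python
def title_from_chars(chars, size_tol=0.1, line_tol=2.0):
    """Join the characters set in the largest font size, in reading order.

    `chars` is a list of dicts with "text", "size", "top" and "x0" keys,
    shaped like the entries of pdfplumber's `page.chars`.
    """
    if not chars:
        return ""
    max_size = max(c["size"] for c in chars)
    # Keep only glyphs whose size is (almost) the maximum on the page.
    big = [c for c in chars if max_size - c["size"] <= size_tol]
    big.sort(key=lambda c: (c["top"], c["x0"]))

    # Group glyphs into lines: a new line starts when "top" jumps.
    lines, current = [], [big[0]]
    for c in big[1:]:
        if abs(c["top"] - current[0]["top"]) <= line_tol:
            current.append(c)
        else:
            lines.append(current)
            current = [c]
    lines.append(current)

    # Within each line, read left to right.
    return "".join(
        c["text"] for line in lines for c in sorted(line, key=lambda c: c["x0"])
    ).strip()


cover = [
    {"text": "网", "size": 22.0, "top": 99.9, "x0": 72.0},
    {"text": "国", "size": 22.0, "top": 100.0, "x0": 50.0},
    {"text": "项", "size": 22.0, "top": 130.0, "x0": 50.0},
    {"text": "目", "size": 22.0, "top": 130.0, "x0": 72.0},
    {"text": "正", "size": 10.5, "top": 300.0, "x0": 50.0},
]
print(title_from_chars(cover))
```

With pdfplumber installed, the same function could be fed `pdf.pages[0].chars` from a document opened with `pdfplumber.open(path)`. Multi-line titles are concatenated without a separator here, which suits Chinese titles but would need adjusting for space-separated languages.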

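3. Recognizing and backfilling "see Attachment X" (详见附件X) placeholders

If the qualification column contains nothing but a placeholder such as 详见附件1 or 详见附件二 ("see Attachment 1", "see Attachment 2"):

  1. Automatically scan the xlsx files in the same directory for a qualification table
  2. Backfill using (lot number, package number) as the key
  3. If no match is found, fall back to the project-level general qualifications (prefixed with 【通用】, "general")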
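
The three steps above can be sketched as a placeholder test plus a keyed lookup. This is a minimal illustration, not the script's code: the regular expression, the row keys `lot_no`, `package_no`, and `qualification`, and the function names are all assumptions, and empty cells are treated the same as placeholders.

```python
import re

# Matches cells made up only of "详见附件" plus an optional Arabic or
# Chinese numeral, e.g. 详见附件1 or 详见附件二 (assumed pattern).
PLACEHOLDER_RE = re.compile(r"^\s*详见附件\s*[0-9一二三四五六七八九十]*\s*[。.]?\s*$")
GENERAL_PREFIX = "【通用】"


def is_placeholder(cell):
    """True when the cell holds nothing but a "see Attachment X" marker."""
    return bool(PLACEHOLDER_RE.match(cell or ""))


def backfill(row, attachment, general):
    """Resolve the qualification text for one package-level row.

    row: dict with hypothetical keys "lot_no", "package_no", "qualification".
    attachment: {(lot_no, package_no): text} built from the xlsx attachment.
    general: project-level qualification text used as the last resort.
    """
    cell = row.get("qualification") or ""
    if cell.strip() and not is_placeholder(cell):
        return cell  # the cell already holds real content
    hit = attachment.get((row["lot_no"], row["package_no"]))
    if hit:
        return hit
    return GENERAL_PREFIX + general if general else ""


table = {("A01", "1"): "电力工程施工总承包二级"}
row = {"lot_no": "A01", "package_no": "1", "qualification": "详见附件1"}
print(backfill(row, table, "具备ISO9001认证"))
```

Using a tuple as the dictionary key keeps the lookup exact: a row only picks up attachment text when both the lot number and the package number agree.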

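4. Keyword de-duplication

When a longer keyword matches, shorter keywords that are substrings of it are dropped automatically:

  • 咨询服务 (consulting services) matches → drop 咨询 (consulting)
  • 储能系统 (energy storage system) matches → drop 储能 (energy storage)
  • 宣传服务 (publicity services) matches → drop 宣传 (publicity)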
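
A minimal sketch of this rule, assuming the keyword engine first collects all hits and then prunes them; the function name `drop_substring_hits` is hypothetical.

```python
def drop_substring_hits(hits):
    """Remove every hit that is a proper substring of another hit."""
    unique = list(dict.fromkeys(hits))  # de-duplicate, keep first-seen order
    return [
        keyword
        for keyword in unique
        if not any(keyword != other and keyword in other for other in unique)
    ]


print(drop_substring_hits(["咨询", "咨询服务", "储能系统", "储能", "监理"]))
```

This version drops a short keyword whenever a longer hit contains it, even if the short keyword also appears on its own elsewhere in the text. Whether the script makes that distinction is not stated.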

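5. Qualification column merge strategy

Package-level qualification conditions + track-record requirements + key personnel
  ↓ (empty or placeholder)
Backfill from the xlsx attachment qualification table
  ↓ (still empty)
Project-level qualification requirements section (boilerplate filtered out, prefixed with 【通用】)

Boilerplate phrases that are filtered out: 依法注册 (legally registered), 失信被执行人 (dishonest judgment debtor), 信用中国 (Credit China), 联合体 (consortium), 破产 (bankruptcy), 黑名单 (blacklist), and the like.

Substantive terms that are kept: 甲/乙/丙级 (Grade A/B/C), 建造师 (registered constructor), ISO, 安全生产许可证 (work safety license), 总承包 (general contracting), 业绩 (track record), and the like.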
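
The last fallback, filtering boilerplate out of the project-level section, might look like the sketch below. It assumes the section is split into clauses on semicolons, full stops, and line breaks, and that a clause containing a substantive term is kept even if it also contains a boilerplate phrase. Both choices, the term lists taken from the examples above, and the function name are assumptions.

```python
import re

BOILERPLATE = ("依法注册", "失信被执行人", "信用中国", "联合体", "破产", "黑名单")
SUBSTANTIVE = ("甲级", "乙级", "丙级", "建造师", "ISO", "安全生产许可证", "总承包", "业绩")
GENERAL_PREFIX = "【通用】"


def general_qualification(section_text):
    """Keep substantive clauses, drop boilerplate ones, and add the prefix."""
    clauses = [c.strip() for c in re.split(r"[;；。\n]", section_text) if c.strip()]
    kept = []
    for clause in clauses:
        if any(term in clause for term in SUBSTANTIVE):
            kept.append(clause)  # a substantive term takes precedence
        elif not any(term in clause for term in BOILERPLATE):
            kept.append(clause)  # neutral clause, keep it
    return GENERAL_PREFIX + "；".join(kept) if kept else ""


text = (
    "投标人须为依法注册的独立法人；未被列入失信被执行人名单；"
    "具备电力工程施工总承包二级及以上资质；项目经理须具备一级建造师资格。"
)
print(general_qualification(text))
```

Returning an empty string when nothing survives the filter keeps the report from showing a bare 【通用】 prefix.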


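Output: 23 Columns

| # | Column | Source |
| --- | --- | --- |
| 1 | No. | Automatic |
| 2 | Project name | PDF font-size method → paragraph regex |
| 3 | Tender number | Regex |
| 4 | Lot number | Word table → xlsx |
| 5 | Lot name | Word table → xlsx |
| 6 | Package number | Word table → xlsx |
| 7 | Requesting department / contracting entity | Word table → xlsx |
| 8 | Sub-project name | Word table → xlsx |
| 9 | Project overview and tender scope | Word table → xlsx |
| 10 | Qualification / eligibility requirements | Package-level aggregation → attachment backfill → project-level fallback |
| 11 | Contract template number | Word table → xlsx |
| 12 | Implementation location | Word table → xlsx |
| 13 | Construction period / service period | Word table → xlsx |
| 14 | Pricing method | Word table → xlsx |
| 15 | Budget amount (万元, 10,000 yuan) | Word table → xlsx (amounts in yuan are converted automatically) |
| 16 | Maximum price limit | Word table → xlsx |
| 17 | Tender start and end time | Regex (deadline color-coded) |
| 18 | Bid opening time and place | Regex |
| 19 | Bid bond | Regex (including detection of "not collected") |
| 20 | Evaluation method | Regex (comprehensive evaluation / reasonable low price / comprehensive scoring, etc.) |
| 21 | Contact person and phone | Regex + xlsx contact field |
| 22 | Matched keywords | Keyword engine (de-duplicated) |
| 23 | Match source | Word table / PDF table / Excel attachment / body text |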
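
The unit conversion for column 15 is plain arithmetic: 1 万元 equals 10,000 元. A small sketch follows, assuming the unit has already been detected and is passed in explicitly; how the script detects the unit is not described, and the function name `to_wan_yuan` is hypothetical.

```python
from decimal import Decimal


def to_wan_yuan(amount, unit):
    """Normalize a budget figure to 万元 (units of 10,000 yuan).

    amount: number or string, thousands separators allowed.
    unit: "元" (yuan) or "万元" (10,000 yuan).
    """
    value = Decimal(str(amount).replace(",", "").strip())
    if unit == "元":
        return value / Decimal(10000)
    if unit == "万元":
        return value
    raise ValueError(f"unsupported unit: {unit!r}")


print(to_wan_yuan("1,250,000", "元"))
print(to_wan_yuan("86.5", "万元"))
```

Decimal is used instead of float so that amounts such as 86.5 are not distorted by binary rounding.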

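Deadline Color-Coding Rules

| Color | Condition |
| --- | --- |
| 🔴 Red | 3 days or fewer until the deadline |
| 🟡 Yellow | 7 days or fewer until the deadline |
| ⬜ Gray | Deadline has already passed |
| No color | More than 7 days remaining (normal) |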
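
These rules overlap (a passed deadline is also "≤ 3 days"), so the order of the checks matters. Below is a minimal sketch that tests the passed-deadline case first. It assumes "days" means elapsed 24-hour periods rather than calendar days, which is not specified, and the function name and color strings are hypothetical.

```python
from datetime import datetime


def deadline_color(deadline, now):
    """Map a deadline to "gray", "red", "yellow", or None (no color)."""
    if deadline < now:
        return "gray"  # check this first, otherwise it would count as red
    days_left = (deadline - now).total_seconds() / 86400
    if days_left <= 3:
        return "red"
    if days_left <= 7:
        return "yellow"
    return None


now = datetime(2025, 1, 10, 9, 0)
for deadline in (
    datetime(2025, 1, 9, 17, 0),
    datetime(2025, 1, 12, 17, 0),
    datetime(2025, 1, 15, 9, 0),
    datetime(2025, 1, 20, 9, 0),
):
    print(deadline.date(), deadline_color(deadline, now))
```

Passing `now` in as a parameter rather than calling `datetime.now()` inside the function keeps the rule easy to test.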

Execution Flow

Step 1: Run the script

python3 ~/.openclaw/workspace/skills/hbdc/scripts/analyze.py

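Step 2: Report the results

Report to the user:

  1. How many projects were scanned, plus the metadata identified for each one (name / number / dates / bid opening / bid bond / evaluation method)
  2. For each project, how many package-level records were extracted → how many of them matched keywords
  3. The source labels of the matched records
  4. The report path: ~/Desktop/sgcc_result.xlsx
  5. A specific reminder about projects whose source is "Word/PDF body text", since these need manual gap-filling

Step 3: AI-assisted gap-filling

Take over proactively in the following cases:

  • PDF table extraction failed → open the PDF directly, read the page content, and fill in the fields manually
  • Source label is "Word/PDF body text" → locate the key sections manually and fill in the fields
  • Project name identified incorrectly → find the correct title in the original text
  • Qualification / eligibility requirements still contain 详见附件 ("see attachment") → read the corresponding attachment and complete the field

Step 4: Answer follow-up questions

When the user asks about the details of a project, read the word/pdf/xlsx files in the corresponding folder directly and present them.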


Custom Keywords

Edit config/keywords.json. Changes take effect without touching the script:

{
  "keywords": {
    "My custom category": ["Keyword A", "Keyword B"]
  },
  "short_keywords": {
    "My custom category": ["Short keyword"]
  }
}
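
Before running the script after an edit, it can help to confirm that the file is still valid JSON with the shape shown above (each section maps a category name to a list of strings). The checker below is a sketch, not part of the skill: the function name `load_keywords` is hypothetical, and the source does not state what distinguishes `short_keywords` from `keywords`, so the two sections are only validated, not interpreted.

```python
import json
from pathlib import Path


def load_keywords(path):
    """Load keywords.json and verify that each section maps names to string lists."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    result = {}
    for section in ("keywords", "short_keywords"):
        groups = data.get(section, {})
        if not isinstance(groups, dict):
            raise ValueError(f"{section!r} must be an object")
        for category, words in groups.items():
            if not isinstance(words, list) or not all(isinstance(w, str) for w in words):
                raise ValueError(f"{section}.{category} must be a list of strings")
        result[section] = groups
    return result


if __name__ == "__main__":
    config_path = Path("config/keywords.json")
    if config_path.exists():
        loaded = load_keywords(config_path)
        for section, groups in loaded.items():
            print(section, "->", len(groups), "categories")
```

A JSON syntax error surfaces as `json.JSONDecodeError`, and a wrong shape as `ValueError`, so both kinds of mistake are caught before the analysis run.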

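Prohibited Actions

  1. Do not write your own analysis script; analyze.py must be run
  2. Do not match keywords against file names; the contents of the files must be read
  3. Do not open a browser; this skill operates on local files only
  4. Do not edit keywords directly in SKILL.md; use config/keywords.json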
