文档识别-表格识别-Pro(翔云开放平台): Document Recognition and Table Recognition Pro (Xiangyun Open Platform)

v1.0.1

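Xiangyun general document/table recognition agent. Triggered when the user requests any of the following:

  • General document recognition, document OCR, OCR of a document, recognizing text in an image
  • Table recognition, table OCR, recognizing a table
  • Extracting table contents, reading table data
  • Recognizing text and tables in images, scans, or PDFs
  • Converting a table to Excel, Word, or Markdown
  • 导...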

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for liudengkui/xiangyun-table-ocr.

Install the skill "文档识别-表格识别-Pro(翔云开放平台)" (liudengkui/xiangyun-table-ocr) from ClawHub.
Skill page: https://clawhub.ai/liudengkui/xiangyun-table-ocr
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install xiangyun-table-ocr

ClawHub CLI

npx clawhub@latest install xiangyun-table-ocr
Security Scan
Capability signals
Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (table/OCR recognition) match the included Python script and SKILL.md. The script calls netocr.com endpoints for recognition and download and only requires the OCR key/secret; this is proportionate for the declared functionality.
Instruction Scope
Instructions and the script read user-provided file paths or directories, convert some image formats, send image data to netocr.com, save recognition JSON and exported files locally, and can prompt the user for credentials. Those behaviors are necessary for OCR, but they do mean: (1) images (and any content in files you point at) are uploaded to netocr.com; (2) credentials may be saved in a plaintext ./config.json in the skill directory.
Install Mechanism
No install spec (instruction-only + included Python script). The script depends on common libraries (requests, optional Pillow). Nothing is downloaded from unfamiliar URLs or written system-wide during install.
Credentials
No required environment variables; the script optionally reads NETOCR_KEY / NETOCR_SECRET and loads ./config.json or asks the user. Requiring the OCR key/secret is expected. Be aware the skill encourages saving credentials in plaintext config.json in the skill directory (persistent storage of secrets).
Persistence & Privilege
always is false and the skill does not request elevated system-wide privileges. It writes its own config.json and output files in the skill or user-specified directories only, and does not modify other skills or global agent settings.
Assessment
This skill appears to do what it says (upload images to netocr.com and return OCR/table results). Before installing: (1) confirm you trust netocr.com as images are uploaded to that third-party service; (2) prefer setting NETOCR_KEY/NETOCR_SECRET as environment variables rather than saving them in the skill's plaintext ./config.json, or keep the config file on a protected filesystem; (3) the script will read any file paths or folders you give it — avoid pointing it at directories containing sensitive files you do not want uploaded; (4) the code rewrites the returned OSS URL to use oss-cn-beijing.aliyuncs.com with a Host header to retrieve exported files (this is explained in the doc and is a workaround for the provider's returned http URL, but you may wish to verify the returned host and the downloaded content); (5) if you ever suspect the key was exposed, rotate it at the provider. Overall the package is internally consistent with its stated purpose.

License: MIT-0

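Xiangyun General Document/Table Recognition

Overview

Calls the general document recognition API of the Xiangyun OCR platform (typeId: 3050) to recognize text, tables, and page layout in images, PDFs, and scans in a single pass, and supports export to Excel, Word, Markdown, PDF, TXT, OFD, and other formats.

Use Cases

| Category | Examples |
| --- | --- |
| Pure tables | Financial statements, data tables, account statements, bills of quantities |
| Documents containing tables | Contracts, reports, manuals, papers, exam papers |
| Text-only documents | ID documents, invoices, handwritten drafts, scans |
| Multilingual documents | English contracts, Japanese materials, Traditional Chinese documents, mixed-language text |

💡 For table recognition, layout: 1 (layout analysis enabled) is recommended because it handles table structure better.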


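⚠️ Security Notes

What data is sent

  • Recognition stage: the user's images and API credentials are sent to netocr.com for cloud OCR processing; images are not persistently stored on the server
  • Download stage: the credentials for the OSS export session come from the presigned URL returned by the API, and the file is downloaded directly to the local machine

SSL Handling

  • Main API (netocr.com): full SSL certificate verification (the default behavior of requests)
  • OSS download (product.netocr.com): Xiangyun returns a presigned URL of the form http://product.netocr.com/...; this domain is a CNAME for Alibaba Cloud OSS cn-beijing. The script replaces the request target with the official Alibaba Cloud OSS domain (oss-cn-beijing.aliyuncs.com) and carries the original domain in the Host header so that the presigned-URL check passes. Standard HTTPS is used throughout, with no changes to SSL configuration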
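
The host rewrite described above can be sketched with the standard library alone. This is a minimal illustration, not the Skill's actual code: the function name rewrite_presigned_url and the sample URL are hypothetical, and only the two host names come from the text.

```python
from typing import Dict, Tuple
from urllib.parse import urlsplit, urlunsplit

# Endpoint named in the text above; everything else here is illustrative.
OSS_ENDPOINT = "oss-cn-beijing.aliyuncs.com"


def rewrite_presigned_url(presigned_url: str) -> Tuple[str, Dict[str, str]]:
    """Return (https URL on the OSS endpoint, headers) for a presigned URL."""
    parts = urlsplit(presigned_url)
    # Carry the original host in the Host header so the signature still matches.
    headers = {"Host": parts.netloc}
    # Path and query are passed through unchanged (the signature is in the query).
    target = urlunsplit(("https", OSS_ENDPOINT, parts.path, parts.query, ""))
    return target, headers


if __name__ == "__main__":
    # Hypothetical URL, only to show the shape of the result.
    url, headers = rewrite_presigned_url(
        "http://product.netocr.com/export/demo.xls?Expires=1&Signature=abc"
    )
    print(url)
    print(headers)
```

The returned pair could then be handed to an HTTP client, for example requests.get(url, headers=headers, timeout=30), which keeps certificate verification on by default.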

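⚠️ Credential Configuration

Config file

Save the credentials to config.json in the Skill directory (the same directory as the Skill):

{
  "key": "your OCR key",
  "secret": "your OCR secret"
}

💡 First use: create config.json and fill in the credentials. Configure once and later runs reuse it.

Credential loading priority

| Priority | Source | Description |
| --- | --- | --- |
| 1 | ./config.json | Config file in the Skill's own directory |
| 2 | Environment variables | NETOCR_KEY / NETOCR_SECRET |
| 3 | User input | Ask the user when neither of the above is available |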
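
The priority order above could look roughly like this in Python. It is a sketch under assumptions: load_credentials is a hypothetical name, and returning None to signal "ask the user" is an illustrative choice rather than the Skill's documented behavior.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def load_credentials(
    config_path: Path = Path("config.json"),
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, str]]:
    """Return (key, secret) from config.json, else the environment, else None."""
    # Priority 1: config.json in the Skill directory.
    if config_path.is_file():
        try:
            data = json.loads(config_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            data = {}  # unreadable or malformed file: fall through
        if isinstance(data, dict) and data.get("key") and data.get("secret"):
            return str(data["key"]), str(data["secret"])
    # Priority 2: NETOCR_KEY / NETOCR_SECRET environment variables.
    key, secret = env.get("NETOCR_KEY"), env.get("NETOCR_SECRET")
    if key and secret:
        return key, secret
    # Priority 3: nothing found; the caller should ask the user.
    return None
```

Passing the environment in as a mapping keeps the function easy to exercise without touching the real process environment.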

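First-time setup flow

  1. Check whether ./config.json exists and contains both key and secret
  2. If it is missing or incomplete, prompt the user:
To use Xiangyun document recognition for the first time, please configure your API credentials:

1. Go to https://netocr.com, register, and log in
2. Open [Personal Center] to get your API Key and Secret
3. Please provide:
   - key: ______
   - secret: ______
  3. Once received, write them to ./config.json and tell the user "Credentials saved; you will not need to enter them again"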
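
The final step of this flow, writing the received values to ./config.json, might be sketched as follows. The function name save_credentials is hypothetical, and tightening the file mode is an added assumption (the secret is stored in plaintext), not something the Skill is documented to do.

```python
import json
import os
from pathlib import Path


def save_credentials(
    key: str, secret: str, config_path: Path = Path("config.json")
) -> None:
    """Write key/secret to config.json using the two-field layout shown earlier."""
    payload = {"key": key.strip(), "secret": secret.strip()}
    if not payload["key"] or not payload["secret"]:
        raise ValueError("both key and secret are required")
    config_path.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    # Assumption: limit the file to the current user because the secret is
    # plaintext. On Windows this call has only a limited effect.
    os.chmod(config_path, 0o600)
```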

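Trigger phrase reference

| Trigger phrase | Intent |
| --- | --- |
| "Recognize this document", "OCR this image", "Read the text in the image" | General document recognition |
| "Recognize this table", "Extract the table data" | Table recognition (layout=1 set automatically) |
| "OCR an English contract", "Recognize Japanese materials" | Multilingual recognition |
| "Recognize this PDF for me", "Extract text from a scan" | PDF/scan recognition |
| "Recognize this invoice" | ID document/receipt recognition |
| "Table to Excel", "Export as Markdown" | Recognition + export |
| "Recognize a skewed document", "The image is a bit crooked" | Recognition with correction |
| "Batch recognize all images in the folder" | Batch recognition |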

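Execution Flow

Phase 1: Recognize the document

Step 1: Load credentials

Load key / secret in the order given in the [Credential Configuration] section:

  • config.json → environment variables → user input

Step 2: Get the image input

The following input methods are supported:

  • Local file path: the user provides an absolute path; the script reads the file and converts it to Base64
  • Dragged-in file: the file path is taken directly from the dropped file
  • Batch directory: the user provides a folder path and every image in it is processed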
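
Reading a local file and converting it to Base64, as in the first input method above, needs only the standard library. A minimal sketch; file_to_base64 is a hypothetical helper name, not a function from the Skill's script.

```python
import base64
from pathlib import Path


def file_to_base64(path: str) -> str:
    """Read a local file and return its contents as an ASCII Base64 string."""
    file_path = Path(path).expanduser()
    if not file_path.is_file():
        raise FileNotFoundError(f"not a file: {file_path}")
    return base64.b64encode(file_path.read_bytes()).decode("ascii")
```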

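Step 3: Configure recognition parameters

Fixed parameters:

{ "typeId": 3050, "format": "json" }

Language parameter nLanguage (default 0 = Simplified Chinese):

| Value | Language | Value | Language |
| --- | --- | --- | --- |
| 0 | Simplified Chinese (printed) | 9 | French |
| 1 | Traditional Chinese (printed) | 10 | Spanish |
| 2 | English | 11 | Japanese |
| 3 | Simplified Chinese (printed + handwritten) | 12 | Korean |
| 4 | Traditional Chinese (printed + handwritten) | 13 | Portuguese |
| 5 | Arabic | 14 | Vietnamese |
| 6 | Urdu | 15 | Bengali |
| 8 | Cyrillic (Russian, etc.) | | |

💡 Language inference: a mention of "English" → 2; "Japanese" → 11; "Traditional Chinese" → 1; not specified → 0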
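
The inference rule in the tip above amounts to a keyword lookup with a default. The sketch below is one way to express it; infer_language and LANGUAGE_HINTS are hypothetical names, and matching English keywords in addition to the Chinese ones (英文, 日文, 繁体) is an added assumption.

```python
# Keyword groups mapped to nLanguage codes, following the tip above.
# The English keywords are an assumption added for illustration.
LANGUAGE_HINTS = (
    (("繁体", "traditional chinese"), 1),
    (("英文", "english"), 2),
    (("日文", "japanese"), 11),
)


def infer_language(request: str) -> int:
    """Return an nLanguage code for a user request, defaulting to 0."""
    text = request.lower()
    for keywords, code in LANGUAGE_HINTS:
        if any(keyword in text for keyword in keywords):
            return code
    return 0  # not specified: Simplified Chinese (printed)
```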

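Layout parameter layout:

| Value | Meaning | Applies to |
| --- | --- | --- |
| 0 | Layout analysis off | Plain text, ID documents |
| 1 | Layout analysis on | First choice for tables; multi-column documents |

Image correction parameters:

| Parameter | Value | Trigger condition |
| --- | --- | --- |
| autoRotation | 1 | Automatically detect whether the image is rotated |
| inclineCorrect | 0/1/2 | No correction / perspective distortion correction / curvature distortion correction |

Preprocessing parameters:

| Parameter | Value | Trigger condition |
| --- | --- | --- |
| removeWaterMark | 1 | Remove watermarks |
| filterColor | 1~4 | Filter red/blue (when the background interferes) |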
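
Putting the fixed, language, layout, correction, and preprocessing parameters together might look like the sketch below. The parameter names (typeId, format, nLanguage, layout, autoRotation, inclineCorrect, removeWaterMark, filterColor) come from the tables above; the function build_params, its argument names, and the choice to omit optional parameters when unused are assumptions.

```python
from typing import Dict, Optional


def build_params(
    n_language: int = 0,
    is_table: bool = False,
    auto_rotation: bool = False,
    incline_correct: int = 0,
    remove_watermark: bool = False,
    filter_color: Optional[int] = None,
) -> Dict[str, object]:
    """Assemble recognition parameters from the options described above."""
    if incline_correct not in (0, 1, 2):
        raise ValueError("inclineCorrect must be 0, 1, or 2")
    if filter_color is not None and filter_color not in (1, 2, 3, 4):
        raise ValueError("filterColor must be between 1 and 4")
    # Fixed parameters plus language and layout.
    params: Dict[str, object] = {
        "typeId": 3050,
        "format": "json",
        "nLanguage": n_language,
        "layout": 1 if is_table else 0,
    }
    # Assumption: optional parameters are simply left out when not requested.
    if auto_rotation:
        params["autoRotation"] = 1
    if incline_correct:
        params["inclineCorrect"] = incline_correct
    if remove_watermark:
        params["removeWaterMark"] = 1
    if filter_color is not None:
        params["filterColor"] = filter_color
    return params
```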

Step 4: Call the recognition API

Run the script scripts/recognize_table.py:

python scripts/recognize_table.py --image <path> --export xls

API endpoints:

  • Base64: POST https://netocr.com/api/recog_table_base64
  • File upload: POST https://netocr.com/api/recog_table_file

Response format: {"message": {"status": 0, "value": {...}}}

  • status == 0 indicates success
  • consumeId: message.value.consumeId
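
Checking the response envelope shown above and pulling out consumeId can be sketched as follows. Only the message.status, message.value, and consumeId fields come from the text; the names parse_response and RecognitionError are hypothetical, and treating status as an integer is an assumption.

```python
from typing import Any, Dict


class RecognitionError(Exception):
    """Raised when the response status is not 0 (hypothetical helper type)."""

    def __init__(self, status: Any) -> None:
        super().__init__(f"recognition failed with status {status!r}")
        self.status = status


def parse_response(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return message.value when message.status == 0, else raise."""
    message = payload.get("message") or {}
    status = message.get("status")
    if status != 0:  # assumption: status is returned as an integer
        raise RecognitionError(status)
    return message.get("value") or {}


if __name__ == "__main__":
    # Hand-written sample in the documented shape, not real API output.
    value = parse_response({"message": {"status": 0, "value": {"consumeId": "demo"}}})
    print(value.get("consumeId"))
```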

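Step 5: Present results

  • Preview the recognized content as a Markdown table
  • Report the consumeId and tell the user the result can be exported at any time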
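
A Markdown preview could be rendered along these lines. The sketch assumes the recognized cells have already been extracted into a list of rows (the real layout of message.value is not specified here), and to_markdown_table is a hypothetical name.

```python
from typing import Sequence


def to_markdown_table(rows: Sequence[Sequence[object]]) -> str:
    """Render rows as a Markdown table, treating the first row as the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def format_row(row: Sequence[object]) -> str:
        # Escape pipes and flatten newlines so each cell stays on one line.
        cells = [str(c).replace("|", "\\|").replace("\n", " ") for c in row]
        cells += [""] * (width - len(cells))  # pad short rows
        return "| " + " | ".join(cells) + " |"

    lines = [format_row(rows[0]), "| " + " | ".join(["---"] * width) + " |"]
    lines += [format_row(row) for row in rows[1:]]
    return "\n".join(lines)
```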

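Phase 2: Export files (on demand)

Run this phase only when the user explicitly asks to "export / download / convert / save as".

Export formats

| Format | Description | Recommended for |
| --- | --- | --- |
| xls | Excel | Data processing |
| flowWord | Word (flowing text) | Editing body text |
| boxWord | Word (text boxes) | Preserving layout |
| md | Markdown | Document conversion |
| pdf | Dual-layer PDF | Archiving and printing |
| txt | Plain text | Simple extraction |
| ofd | OFD | Archiving in China's domestic OFD format |

Download endpoint: POST https://netocr.com/api/download_file

  • Does not require key/secret
  • Returns an OSS presigned URL → then send a GET request to download the actual file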

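Batch recognition

  1. Iterate over all jpg/png/jpeg/webp/tif/pdf files in the directory
  2. Call the recognition API on each file in turn (0.5-second interval)
  3. Summarize the success/failure counts
  4. Export in batch on demand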
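
The batch steps above can be sketched as a loop with a pause between calls. The recognition call itself is passed in as a callable because its request details are not covered here; recognize_directory, the non-recursive directory scan, and the shape of the summary are all assumptions.

```python
import time
from pathlib import Path
from typing import Callable, Dict, List

# Extensions listed in step 1 above.
BATCH_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".tif", ".pdf"}


def recognize_directory(
    directory: str,
    recognize: Callable[[Path], object],
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, List[str]]:
    """Run `recognize` on each supported file and collect success/failure names."""
    # Assumption: only the top level of the directory is scanned.
    files = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in BATCH_EXTENSIONS
    )
    summary: Dict[str, List[str]] = {"succeeded": [], "failed": []}
    for index, path in enumerate(files):
        if index:
            sleep(delay_seconds)  # pause between consecutive API calls
        try:
            recognize(path)
            summary["succeeded"].append(path.name)
        except Exception:
            # One bad file should not stop the rest of the batch.
            summary["failed"].append(path.name)
    return summary
```

Injecting sleep makes the pacing easy to verify without actually waiting.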

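Image/file requirements

| Type | Requirement |
| --- | --- |
| Supported formats | PNG, JPG, JPEG, WEBP, TIF, OFD, PDF |
| Ordinary images | About 200 KB, bit depth 24 or higher |
| Scans | Resolution 300 DPI, smaller than 3 MB |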

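Error handling

| Error code | Meaning | Handling |
| --- | --- | --- |
| 20001 | Key/Secret error | Check the credentials in the config file |
| 10001 | Missing required parameter | Check typeId/format |
| 10002 | Recognition failed | Improve image quality; enable autoRotation |
| 10003 | Insufficient quota | Top up or switch accounts |
| 10004 | Unsupported image format | Convert to JPG/PNG and retry |
| Download failed | consumeId expired | Recognize again, then download |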
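
The numeric codes in the table above map naturally onto a lookup. In this sketch the meanings and suggested actions come from the table, while describe_error, the fallback text for unknown codes, and the idea that the code arrives as an integer are assumptions.

```python
from typing import Dict, Tuple

# Code -> (meaning, suggested handling), copied from the table above.
ERROR_HINTS: Dict[int, Tuple[str, str]] = {
    20001: ("Key/Secret error", "Check the credentials in the config file"),
    10001: ("Missing required parameter", "Check typeId/format"),
    10002: ("Recognition failed", "Improve image quality; enable autoRotation"),
    10003: ("Insufficient quota", "Top up or switch accounts"),
    10004: ("Unsupported image format", "Convert to JPG/PNG and retry"),
}


def describe_error(code: int) -> str:
    """Return a one-line hint for an error code."""
    # The fallback wording is illustrative, not taken from the provider.
    meaning, action = ERROR_HINTS.get(
        code, ("Unknown error", "Consult the API reference")
    )
    return f"[{code}] {meaning}: {action}"
```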

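Getting the key and secret

  1. Log in to Xiangyun
  2. Get them in the Personal Center

Reference documentation

For detailed API fields, parameter enumerations, and response structures, see: Xiangyun OCR API Reference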
