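General Text Recognition (OCR)
Uses the Jisu Data (极速数据) General Text Recognition API to convert text in images into plain text. Supports Chinese, English, and several other languages.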
MIT-0 · Free to use, modify, and redistribute. No attribution required.
by 极速数据 (Jisu Data) @jisuapi
Security Scan
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description state this is an OCR client for the Jisu General Recognition API; the code posts images to https://api.jisuapi.com/generalrecognition/recognize and requires JISU_API_KEY. The requested credential is proportional and directly used.
Instruction Scope
SKILL.md and the script instruct the agent to read local image files (or accept base64) and upload the image data to the third-party Jisu API. This is expected for an OCR skill, but it is a privacy concern: any image you point at (including sensitive images) will be sent off‑site. The instructions also assume the agent will write/hold image files locally (save or convert to base64).
Install Mechanism
There is no install specification (instruction-only plus a script). The script imports the Python requests library but the skill metadata only requires python3 — the skill does not declare or install the 'requests' dependency, which may cause runtime failures unless the environment already has it.
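One way to catch the undeclared dependency before the first run is a small pre-flight check. This is a minimal sketch using only the standard library; the helper name `missing_modules` is illustrative and not part of the skill.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["requests"])
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All dependencies available.")
```

Running the check once in the target environment shows whether `pip install requests` is needed before the skill is invoked.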
Credentials
Only one environment variable is required (JISU_API_KEY) and it is used as the appkey for API requests. No unrelated credentials or config paths are requested.
Persistence & Privilege
The skill is not always-enabled and does not request elevated platform privileges or modify other skills. It behaves like a normal, on-demand skill.
Assessment
This skill is coherent for its stated OCR purpose, but consider the following before installing:
- Privacy: the skill reads image files you pass and uploads their base64 contents to a third-party (Jisu). Do not use it with images containing sensitive personal, financial, or secret data unless you trust the provider and its data retention policy.
- Credential handling: you must provide JISU_API_KEY. Store that key securely and avoid exposing it in logs or shared prompts.
- Runtime dependencies: the script uses the Python 'requests' library but the skill metadata only lists python3 — ensure the runtime has 'requests' installed (pip install requests) or the script will fail.
- File access: the agent will read any local path you supply; be careful not to pass paths that expose unrelated local files.
- If you need offline processing or stronger data controls, consider an alternative local OCR solution instead of sending images to an external API.
Current version: v1.0.0
Runtime requirements
🔎 Clawdis
Bins: python3
Env: JISU_API_KEY
Primary env: JISU_API_KEY
SKILL.md
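Jisu Data General Text Recognition (Jisu General Recognition / OCR)
An OpenClaw skill built on the General Text Recognition API. It recognizes text in ordinary web images and supports Chinese, English, and several other languages, selected with the type parameter:
- cnen: Chinese and English (default)
- en: English
- fr: French
- pt: Portuguese
- de: German
- it: Italian
- es: Spanish
- ru: Russian
- jp: Japanese
Before use, apply for the General Text Recognition service on the Jisu Data website. Documentation: https://www.jisuapi.com/api/generalrecognition/
Environment variable configuration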
# Linux / macOS
export JISU_API_KEY="your_appkey_here"
# Windows PowerShell
$env:JISU_API_KEY="your_appkey_here"
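On the Python side, the key can be read with a clear failure when it is missing. The sketch below is an assumption about how such a check might look, not a copy of the skill's script; `load_appkey` is a hypothetical helper name.

```python
import os


def load_appkey(env=None):
    """Return the appkey from JISU_API_KEY, or raise if it is unset or blank."""
    # Accepting a mapping makes the function easy to exercise without touching os.environ.
    env = os.environ if env is None else env
    key = env.get("JISU_API_KEY", "").strip()
    if not key:
        raise RuntimeError("JISU_API_KEY is not set; export it before running the script.")
    return key
```

Failing early this way avoids sending a request that the API would reject anyway, and keeps the key out of command-line arguments and logs.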
Script path
Script file: skills/generalrecognition/generalrecognition.py
Usage and request parameters
The script takes a single JSON argument whose fields map onto the /generalrecognition/recognize endpoint.
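For orientation, the underlying HTTP call might be assembled roughly as follows. This sketch uses the standard library instead of requests so that it stays self-contained, and it only builds the request without sending it. Placing appkey in the query string and pic/type in the form body is an assumption; confirm the exact layout against the official documentation. `build_request` is a hypothetical name.

```python
import urllib.parse
import urllib.request

API_URL = "https://api.jisuapi.com/generalrecognition/recognize"


def build_request(appkey, pic_b64, text_type="cnen"):
    """Build (but do not send) a POST request for the recognize endpoint."""
    # Assumption: appkey travels in the query string, pic and type in the form body.
    url = API_URL + "?" + urllib.parse.urlencode({"appkey": appkey})
    # urlencode percent-escapes the '+', '/' and '=' characters found in base64.
    body = urllib.parse.urlencode({"pic": pic_b64, "type": text_type}).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")
```

The returned object can be passed to `urllib.request.urlopen(req, timeout=30)` to perform the call; the timeout value here is only an example.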
1. Recognize from a local image (recommended)
python3 skills/generalrecognition/generalrecognition.py '{"path":"sfz1.jpg","type":"cnen"}'
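- path: local image path (the script reads the file and converts it to base64). JPG, PNG, and similar formats are supported; a single image may be at most about 500K.
- type: text type, default cnen; other options are en/fr/pt/de/it/es/ru/jp.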
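The read-and-encode step can be pictured with a short sketch like the one below. It is not taken from the skill's script. Reading "about 500K" as 500 KiB of raw file bytes is an assumption, and `encode_bytes` and `encode_image` are hypothetical names.

```python
import base64

MAX_BYTES = 500 * 1024  # assumption: "about 500K" interpreted as 500 KiB of raw bytes


def encode_bytes(data, max_bytes=MAX_BYTES):
    """Base64-encode image bytes, rejecting payloads over the size limit."""
    if len(data) > max_bytes:
        raise ValueError(f"image is {len(data)} bytes; the limit is about {max_bytes}")
    return base64.b64encode(data).decode("ascii")


def encode_image(path, max_bytes=MAX_BYTES):
    """Read a local image file and return its contents as a base64 string."""
    with open(path, "rb") as fh:
        return encode_bytes(fh.read(), max_bytes)
```

Checking the size locally gives a clearer message than waiting for the API to report that the image exceeds the limit.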
2. Pass base64 image content directly
If an earlier step has already converted the image to base64, pass it through pic. Do not include the data:image/...;base64, prefix; send only the raw base64 string:
python3 skills/generalrecognition/generalrecognition.py '{
"pic": "<base64_string>",
"type": "cnen"
}'
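Upstream tools often produce data URIs, so a small normalization step before filling pic can help. The following is a sketch under that assumption; `clean_base64` is a hypothetical helper, and the skill itself may or may not perform this cleanup.

```python
import base64
import binascii
import re

# Matches a leading "data:<mime>;base64," prefix, case-insensitively.
_DATA_URI_PREFIX = re.compile(r"^data:[^;,]+;base64,", re.IGNORECASE)


def clean_base64(value):
    """Strip any data-URI prefix and whitespace, then verify the rest decodes."""
    stripped = _DATA_URI_PREFIX.sub("", value.strip())
    stripped = "".join(stripped.split())  # drop line breaks and spaces
    try:
        base64.b64decode(stripped, validate=True)
    except binascii.Error as exc:
        raise ValueError("pic is not valid base64") from exc
    return stripped
```

For example, `clean_base64("data:image/png;base64,QUJD")` yields the bare string that the pic field expects.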
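3. Request parameters
| Field | Type | Required | Description |
|---|---|---|---|
| path | string | one of | Local image path; the script reads it and converts it to base64 |
| image | string | one of | Alias for path |
| file | string | one of | Alias for path |
| pic | string | one of | Image content already encoded as base64 (no prefix) |
| type | string | no | Text type: cnen/en/fr/pt/de/it/es/ru/jp; default cnen |
Provide at least one of path/image/file or pic; if both are present, pic takes precedence.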
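The precedence rule can be expressed compactly. This is a sketch of one way to apply it, not the script's actual code. The order in which the aliases path, image, and file are checked, and rejecting unknown type values locally, are both assumptions; the function names are hypothetical.

```python
VALID_TYPES = {"cnen", "en", "fr", "pt", "de", "it", "es", "ru", "jp"}


def resolve_source(params):
    """Return ("pic", value) or ("path", value); pic wins when both are given."""
    pic = params.get("pic")
    if pic:
        return ("pic", pic)
    # Assumption: aliases are tried in this order.
    for key in ("path", "image", "file"):
        value = params.get(key)
        if value:
            return ("path", value)
    raise ValueError("provide one of: path, image, file, pic")


def resolve_type(params):
    """Return the requested text type, defaulting to cnen."""
    text_type = params.get("type") or "cnen"
    if text_type not in VALID_TYPES:
        raise ValueError(f"unsupported type: {text_type}")
    return text_type
```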
Return values
Raw API response example (see the official documentation):
{
"status": 0,
"msg": "ok",
"result": [
"此时此刻我好焦灼!",
"你别再解释了"
]
}
The skill wraps the response lightly and always outputs:
{
"result": [
"此时此刻我好焦灼!",
"你别再解释了"
]
}
On a business error (for example an empty image, a bad format, or an image over the size limit), the output is wrapped as:
{
"error": "api_error",
"code": 201,
"message": "图片为空"
}
Network or parsing errors return:
{
"error": "request_failed" | "http_error" | "invalid_json",
"message": "...",
"status_code": 500
}
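The three output shapes above could be produced by a wrapper along these lines. It is a sketch reconstructed from the documented outputs, not the skill's source. Treating any non-200 HTTP status as http_error and comparing status as a string (in case the API returns it quoted) are assumptions; `wrap_response` is a hypothetical name.

```python
import json


def wrap_response(status_code, body_text):
    """Map an HTTP status and raw body onto the skill's documented output shapes."""
    if status_code != 200:
        return {"error": "http_error", "message": f"HTTP {status_code}", "status_code": status_code}
    try:
        data = json.loads(body_text)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return {"error": "invalid_json", "message": str(exc), "status_code": status_code}
    if not isinstance(data, dict):
        return {"error": "invalid_json", "message": "expected a JSON object", "status_code": status_code}
    # Assumption: status may arrive as 0 or "0", so compare as strings.
    if str(data.get("status")) != "0":
        return {"error": "api_error", "code": data.get("status"), "message": data.get("msg", "")}
    return {"result": data.get("result", [])}
```

A transport failure before any response arrives (the request_failed case) would be handled by the caller around the network call itself, so it is left out of this function.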
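Common error codes
From the General Text Recognition documentation:
| Code | Description |
|---|---|
| 201 | Image is empty |
| 202 | Image format error |
| 204 | Image size exceeds the limit |
| 208 | Recognition failed |
| 210 | No information |
System error codes 101–108 are the same as for other Jisu Data APIs.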
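An agent that needs to explain a failure to the user could keep the codes in a small lookup table. This is an illustrative sketch: the English descriptions are translations of the documented ones, and `describe_error` is a hypothetical helper, not part of the skill.

```python
BUSINESS_ERRORS = {
    201: "Image is empty",
    202: "Image format error",
    204: "Image size exceeds the limit",
    208: "Recognition failed",
    210: "No information",
}


def describe_error(code):
    """Return a short English description for a Jisu error code."""
    code = int(code)  # tolerate codes delivered as strings
    if code in BUSINESS_ERRORS:
        return BUSINESS_ERRORS[code]
    if 101 <= code <= 108:
        return "System error (shared across Jisu Data APIs)"
    return "Unknown error code"
```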
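Recommended usage in OpenClaw
- The user uploads a screenshot or photo containing text and asks to "extract all the text from this image".
- The agent saves the image as a local file or converts it to base64, then calls:
python3 skills/generalrecognition/generalrecognition.py '{"path":"image.jpg","type":"cnen"}'
or passes pic instead.
- Join the entries of the returned result array into the full text (merge line by line or format as needed), reply to the user in natural language, and analyze or translate the content further as the situation requires.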
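The call-and-join steps of this workflow can be sketched as two small helpers. The assumption here is that the script prints its JSON output to stdout; the names `build_argv` and `join_result` are hypothetical.

```python
import json

SCRIPT = "skills/generalrecognition/generalrecognition.py"


def build_argv(path, text_type="cnen"):
    """Build the argument list for invoking the skill script on a local image."""
    # Passing a list to subprocess avoids shell quoting problems with the JSON payload.
    return ["python3", SCRIPT, json.dumps({"path": path, "type": text_type})]


def join_result(output):
    """Merge the result array into one text block, one recognized line per row."""
    if "error" in output:
        raise RuntimeError(f"OCR failed: {output.get('message', output['error'])}")
    lines = (line.strip() for line in output.get("result", []))
    return "\n".join(line for line in lines if line)
```

A caller might run `subprocess.run(build_argv("image.jpg"), capture_output=True, text=True)`, parse the captured stdout with `json.loads`, and hand the resulting dictionary to `join_result`.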
