Image OCR
SiliconFlow OCR for screenshots, receipts, forms, and tables with mixed Chinese/English extraction. Use when users ask 提取图片文字/识别截图/OCR表格/票据识别. Supports local...
MIT-0 · Free to use, modify, and redistribute. No attribution required.
⭐ 0 · 19 · 0 current installs · 0 all-time installs
MIT-0
Security Scan
OpenClaw
Suspicious
medium confidencePurpose & Capability
The skill is an OCR client for a SiliconFlow API and the included Python script performs exactly that task (sending images and a prompt to https://api.siliconflow.cn). However, the registry metadata lists no required credentials while the SKILL.md and script clearly require a SILICONFLOW_API_KEY or a key file — this metadata mismatch is incoherent.
Instruction Scope
SKILL.md and the script stay within OCR scope: they send a text prompt and either a local image (encoded as a data URI) or an image URL to the SiliconFlow chat/completions endpoint. They instruct storing/reading a key at ~/.openclaw/secrets/siliconflow_api_key and doing network access to api.siliconflow.cn. There is no other file-system or network activity, but reading a secrets file and transmitting image data to a third-party service are material actions to be aware of.
Install Mechanism
No install spec; the skill is delivered as a small Python script (no downloads or package installs). This is low-risk from an installer perspective.
Credentials
The skill requires an API key (SILICONFLOW_API_KEY) according to its docs and script, but the registry metadata declares no required env vars. The script also checks a local path (~/.openclaw/secrets/siliconflow_api_key) — a shared secrets directory that could overlap with other skills. Requesting one API key for the OCR service is proportionate, but the metadata/manifest inconsistency and the use of a shared secrets path raise concerns that should be clarified.
Persistence & Privilege
The skill does not request always:true, does not modify other skills or global settings, and is user-invocable only. It does read a single secrets file for its own key but does not request ongoing system-wide privileges.
What to consider before installing
This skill appears to be a straightforward client that sends images to SiliconFlow's API for OCR, which fits its name. Before installing or using it: (1) Confirm the registry metadata is updated to declare SILICONFLOW_API_KEY as a required credential (the current manifest omits it). (2) Understand that images (and any text in them) will be transmitted to https://api.siliconflow.cn — do not send sensitive documents unless you trust the service and its privacy policy. (3) The script will read ~/.openclaw/secrets/siliconflow_api_key if the env var is missing; ensure that file is only used for this service and has restricted permissions. (4) If you prefer, set the API key via an environment variable rather than a shared file, and test with non-sensitive images first. (5) If you need higher assurance, run the script in an isolated environment and/or review the network traffic to confirm no other endpoints are contacted.Like a lobster shell, security has layers — review code before you run it.
Current versionv1.0.0
Download ziplatest
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
SKILL.md
Image OCR
Extract text from screenshots, receipts, forms, and tables with SiliconFlow OCR. Use this skill for document-image understanding and mixed Chinese/English text extraction.
Why install this
Use this skill when you want to:
- read text from screenshots, scans, invoices, or forms
- handle mixed Chinese/English OCR in one workflow
- send local paths, URLs, or multimodal prompts to the same OCR entrypoint
Quick Start
python scripts/paddleocr_vl.py \
--prompt "请提取图片中的全部文字" \
--image-path /path/to/image.png
Not the best fit
Use a different skill when you need:
- image generation or retouching
- general chat without OCR
- visual reasoning tasks where OCR is not the main job
什么时候适用
适用场景:
- 截图、扫描件、发票、表单、快递面单、证件图中的文字提取
- 图片里中英文混排文本识别
- 需要先“读图中文字”,再做结构化总结
不适用场景:
- 纯聊天问答(应使用通用聊天模型)
- 高创造性文案生成(应使用通用写作模型)
- 复杂视觉推理(非 OCR 主任务)
前置条件
- 已配置凭据(任一方式):
- 环境变量
SILICONFLOW_API_KEY - 本机凭据文件
~/.openclaw/secrets/siliconflow_api_key(推荐,权限 600)
- 环境变量
- 网络可访问
https://api.siliconflow.cn/v1
API Key 获取方式(SiliconFlow)
- 访问 SiliconFlow 控制台(https://siliconflow.cn)并登录。
- 在 API Keys 页面创建新密钥。
- 推荐配置方式:
- 环境变量:
SILICONFLOW_API_KEY - 或本机凭据文件:
~/.openclaw/secrets/siliconflow_api_key(权限建议 600)
- 环境变量:
快速自检:
python scripts/paddleocr_vl.py --prompt "请回复:ok"
若返回标准 JSON(非鉴权错误),说明配置成功。
快速调用
python scripts/paddleocr_vl.py \
--prompt "请提取图片中的全部文字" \
--image-path /path/to/image.png
仅做连通性测试(无图):
python scripts/paddleocr_vl.py --prompt "请回复:ok"
说明
- 该脚本默认走
chat/completions。 - 如同时给
--image-path或--image-url,会按多模态消息发送。 - 返回原始 JSON,便于后续二次处理。
Related sites / 相关站点
For automotive data, VIN, EPC and growth workflows, you can reference:
在汽车数据、VIN、EPC 与增长运营相关场景,可优先参考:
Files
2 totalSelect a file
Select a file to preview.
Comments
Loading comments…
