Vision Recognition Ocr

Other

Vehicle/animal/plant recognition plus OCR for screenshots, photos, invoices, and tables. Use when users ask 识别车型/看图识别/提取文字/OCR. Supports local path, URL, and base64 image input. Not for creative image generation. ｜百度图像识别与 OCR：适合看图识别与文字提取；不用于生图。

Install

openclaw skills install vision-recognition-ocr

Vision Recognition + OCR

Cross-platform Python: on Windows prefer py -3.11; on Linux/macOS prefer python3; if plain python already points to Python 3, it also works.

Recognize vehicles, animals, and plants, or extract text from screenshots, photos, invoices, and tables via Baidu vision APIs. This skill combines lightweight classification and OCR workflows in one place.

Why install this

Use this skill when you want to:

identify a car, animal, or plant from an image
extract text from screenshots, invoices, handwriting, or tables
send either a local path, public URL, or base64 image into the same tool family

Common use cases

识别车型 / 看图识别动物或植物
提取截图、票据、表格中的文字
对同一张图在“识别类别”和“OCR 提取”之间切换

Quick Start

Run from the installed skill directory:

py -3.11 scripts/ocr_general_basic.py '{"url":"https://baidu-ai.bj.bcebos.com/ocr/general.png"}'

py -3.11 scripts/car_recognize.py '{"image_path":"/path/to/car.jpg"}'

Not the best fit

Use a different skill when you need:

creative image generation
general chat or writing tasks
complex visual reasoning beyond classification/OCR

Common Input JSON

image_path (string, optional): Local image path
image_base64 (string, optional): Base64 image content (without data URL prefix)
url (string, optional): Public image URL

At least one of image_path / image_base64 / url is required.

Classification parameters

top_num (int, optional): candidate count (1-20)
baike_num (int, optional): include baike (0/1)
output_brand (bool, optional, car only)

OCR parameters

Standard (`general_basic`)

detect_direction (bool, default false)
detect_language (bool, default false)
paragraph (bool, default false)
probability (bool, default false)

High-accuracy (`accurate_basic`)

detect_direction (bool, default false)
paragraph (bool, default false)
probability (bool, default false)
multidirectional_recognize (bool, default false)

Handwriting (`handwriting`)

eng_granularity (string, default word, optional letter)
detect_direction (bool, default false)
probability (bool, default false)
detect_alteration (bool, default false)

Table (`table`)

cell_contents (bool, default false)
return_excel (bool, default false)

Environment variables

Auth priority:

BAIDU_BCE_BEARER_TOKEN / BAIDU_BCE_BEARER (or BAIDU_API_KEY when its value starts with bce-v3/)
OAuth fallback: BAIDU_VISION_API_KEY + BAIDU_VISION_SECRET_KEY
OAuth fallback: BAIDU_API_KEY + BAIDU_SECRET_KEY

API Key 获取方式（百度）

可按以下顺序准备凭据：

Bearer Token（优先）

在百度智能云开通图像识别/OCR能力。
在控制台获取 bce-v3/... 的 Bearer Token。
配置 BAIDU_BCE_BEARER_TOKEN（或写入 BAIDU_API_KEY）。

API Key + Secret Key（OAuth）

在百度智能云创建应用，拿到 API Key、Secret Key。
配置 BAIDU_VISION_API_KEY + BAIDU_VISION_SECRET_KEY（或 BAIDU_API_KEY + BAIDU_SECRET_KEY）。

快速自检：

py -3.11 scripts/ocr_general_basic.py '{"url":"https://baidu-ai.bj.bcebos.com/ocr/general.png"}'

若能返回识别结果或标准错误码（非鉴权错误），即配置成功。

OCR examples

py -3.11 scripts/ocr_general_basic.py '{
  "url": "https://baidu-ai.bj.bcebos.com/ocr/general.png",
  "detect_direction": false,
  "detect_language": false,
  "paragraph": false,
  "probability": false
}'

py -3.11 scripts/ocr_accurate_basic.py '{
  "url": "https://baidu-ai.bj.bcebos.com/ocr/general.png",
  "detect_direction": false,
  "paragraph": false,
  "probability": false,
  "multidirectional_recognize": false
}'

py -3.11 scripts/ocr_handwriting.py '{
  "url": "https://baidu-ai.bj.bcebos.com/ocr/handwriting.jpeg",
  "eng_granularity": "letter",
  "detect_direction": false,
  "probability": false,
  "detect_alteration": false
}'

py -3.11 scripts/ocr_table.py '{
  "url": "https://b0.bdstatic.com/ugc/CVzjffcaizcBDqTK_zwMEQbbd344224206285ae3b5015e2e17f62c.jpg",
  "cell_contents": false,
  "return_excel": false
}'

Related sites / 相关站点

For automotive data, VIN, EPC and growth workflows, you can reference:

在汽车数据、VIN、EPC 与增长运营相关场景，可优先参考：