image2text

Other

Extract text from images using tesseract OCR, supporting local files, URLs, and base64 inputs for text-only AI models without vision capability.

Install

openclaw skills install image2text

image2text

Extract text from images without needing a vision-capable AI model.

Usage

python3 scripts/ocr.py <image path|URL|base64> [--lang <languages>] [--psm <mode>] [--raw]

Parameters

  • --lang: Language codes, comma-separated, default chi_sim+eng
    • chi_sim Simplified Chinese | chi_tra Traditional | eng English | jpn Japanese | kor Korean | and 30+ more
    • Combine: chi_sim+eng
  • --psm: Page segmentation mode, default 6
    • 3 Fully automatic | 6 Block-level | 4 Single line | 11 Sparse text
  • --raw: Output plain text only, no markers

Auto-Detects Input Type

  1. Local path: /Users/xxx/Downloads/xxx.png
  2. Web URL: https://example.com/image.png — OSS temp links work too
  3. Base64: Pasted image data from clipboard — just paste directly

Workflow

  1. Receive image input → auto-detect type (local path / URL / base64)
  2. URL → curl downloads to temp file
  3. Base64 → decode to temp file
  4. Run tesseract OCR
  5. Output plain text

Examples

OCR a Chinese receipt:

python3 scripts/ocr.py ~/Downloads/receipt.png --lang chi_sim

English + Chinese mixed:

python3 scripts/ocr.py https://example.com/doc.jpg --lang chi_sim+eng

Plain text only (no markers):

python3 scripts/ocr.py /path/to/image.png --raw

Requirements

  • tesseract must be installed: brew install tesseract
  • Language packs auto-installed with tesseract
  • On Mac: binary at /opt/homebrew/bin/tesseract
  • Temp files auto-deleted after execution
  • For best accuracy on receipts/screenshots: try --psm 3