Install
openclaw skills install super-ocr

Production-grade OCR with intelligent engine selection between Tesseract (lightweight, fast) and PaddleOCR (high accuracy, Chinese-optimized). Use it when extracting text from images, processing Chinese documents, needing confidence scores, or working with mixed Chinese/English content.

Super OCR is a production-grade optical character recognition tool that picks the best engine for each job. In auto mode, the skill selects the engine as follows:
| Scenario | Selected Engine | Why |
|---|---|---|
| Simple text, English only | Tesseract | Faster, lighter dependency |
| Chinese content, high accuracy needed | PaddleOCR | Better Chinese support, 98%+ accuracy |
| Low confidence from Tesseract | PaddleOCR (fallback) | Quality assurance |
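The rules in the table can be sketched as a few small functions. This is only an illustration of the idea, not the skill's actual selection code: the names `choose_engine`, `needs_fallback`, and `contains_cjk` are hypothetical, and the 0.8 threshold is an assumed value borrowed from the `confidence_threshold` setting in the config.yaml example.

```python
def contains_cjk(text: str) -> bool:
    """Return True if the text has at least one CJK Unified Ideograph."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def choose_engine(has_chinese: bool, high_accuracy: bool = False) -> str:
    """Pick the first engine to try, following the table above (hypothetical helper)."""
    if has_chinese or high_accuracy:
        return "paddle"
    return "tesseract"


def needs_fallback(engine: str, confidence: float, threshold: float = 0.8) -> bool:
    """Return True when a Tesseract result is too uncertain and PaddleOCR should rerun it."""
    return engine == "tesseract" and confidence < threshold


if __name__ == "__main__":
    engine = choose_engine(has_chinese=False)
    print(engine, needs_fallback(engine, confidence=0.55))
```

Keeping the fallback check separate from the initial choice means the cheaper engine always runs first for English-only input, and the heavier one is loaded only when the result looks unreliable.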
Users can explicitly choose an engine:
- --engine tesseract - Use Tesseract only
- --engine paddle - Use PaddleOCR only
- --engine auto - Auto-select (default)

This skill requires the following dependencies:
pip install paddleocr paddlepaddle pytesseract pillow opencv-python numpy
macOS:
# Tesseract
brew install tesseract
# PaddleOCR
pip install paddleocr paddlepaddle
Ubuntu/Debian:
# Tesseract
sudo apt update && sudo apt install tesseract-ocr
# PaddleOCR
pip install paddleocr paddlepaddle
Windows:
# Download Tesseract from: https://github.com/UB-Mannheim/tesseract/wiki
pip install paddleocr paddlepaddle pytesseract pillow opencv-python numpy
# Auto mode (recommended, default) - selects the engine automatically
cd path/to/super-ocr
python scripts/main.py --image path/to/image.png
# Force Tesseract only
python scripts/main.py --image document.jpg --engine tesseract
# Force PaddleOCR (high accuracy Chinese)
python scripts/main.py --image chinese_menu.png --engine paddle
# Run all engines (macOS only: Tesseract + PaddleOCR + MacVision)
python scripts/main.py --image complex_doc.png --engine all
# Batch processing with output directory
python scripts/main.py --images ./images/*.png --output ./results --verbose
# Check dependencies and auto-install
python scripts/dependencies.py --check --install
This skill uses a capabilities-based structure with multiple execution modes:
The skill includes a decision tree that analyzes:
See scripts/engine_selector.py for implementation details.
Tesseract Engine (scripts/tesseract_ocr.py):
PaddleOCR Engine (scripts/paddle_ocr.py):
Supports multiple output formats:
| Format | Content | Use Case |
|---|---|---|
| Text only | Clean extracted text | Simple search/grep |
| Structured | Text + positions | Data extraction |
| JSON | Full metadata + confidence | API integration |
| Verbose | Debug info | Quality assurance |
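As a rough sketch of how one OCR result could be rendered in several of these formats, the snippet below covers text-only, structured, and JSON output. The `Line` dataclass, its fields, and `format_result` are hypothetical names chosen for illustration; the actual output_formatter.py may be organized quite differently, and the verbose format is left out here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Line:
    """One recognized text line (hypothetical shape, not the skill's real type)."""
    text: str
    box: tuple          # (x, y, width, height) in pixels
    confidence: float   # 0.0 to 1.0


def format_result(lines, fmt="text"):
    """Render recognized lines in one of three illustrative formats."""
    if fmt == "text":
        # Clean text only, one recognized line per output line.
        return "\n".join(line.text for line in lines)
    if fmt == "structured":
        # Text plus position, without confidence metadata.
        return [{"text": line.text, "box": line.box} for line in lines]
    if fmt == "json":
        # Full metadata, serialized for API consumers.
        return json.dumps([asdict(line) for line in lines], ensure_ascii=False)
    raise ValueError(f"unknown format: {fmt}")


if __name__ == "__main__":
    sample = [Line("Hello", (0, 0, 50, 10), 0.97)]
    print(format_result(sample, "json"))
```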
- main.py - Main entry point, CLI interface (supports multi-engine)
- dependencies.py - Auto-install and validation
- output_formatter.py - Multiple output format support
- engine/ - OCR engine implementations
  - selector.py - Intelligent engine selection logic
  - tesseract.py - Tesseract engine wrapper
  - paddle.py - PaddleOCR engine wrapper
  - macvision.py - macOS Vision OCR (macOS only)
- preprocessing/ - Image preprocessing utilities
  - preprocessor.py - Denoising, enhancement, binarization

The dependencies.py module handles validation and auto-installation of the required Python packages (paddleocr, paddlepaddle, pytesseract, cv2).

Use it when setting up a new environment:
python scripts/dependencies.py --check --install
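A dependency check of this kind can be approximated with the standard library alone. The sketch below is not the code in dependencies.py; `check_modules` and `REQUIRED` are hypothetical names, and the list uses import names, which differ from some pip package names (paddlepaddle is imported as `paddle`, opencv-python as `cv2`).

```python
import importlib.util

# Import names for the packages listed above (assumed mapping:
# paddlepaddle -> paddle, opencv-python -> cv2).
REQUIRED = ["paddleocr", "paddle", "pytesseract", "cv2"]


def check_modules(names):
    """Map each module name to True if it can be found, False otherwise."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    status = check_modules(REQUIRED)
    missing = [name for name, found in status.items() if not found]
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as PaddleOCR.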
Create config.yaml for persistent settings:
default_engine: auto
confidence_threshold: 0.8
output_format: json
preprocess:
  denoise: true
  enhance_contrast: true
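One plausible way to apply such a file is to overlay it on built-in defaults, so a partial config.yaml only changes the keys it names. The sketch below assumes the file has already been parsed into a dict (for example with PyYAML's `yaml.safe_load`); `DEFAULTS` and `merge_config` are hypothetical, and the default values simply mirror the example above rather than the skill's real defaults.

```python
import copy

# Assumed defaults, copied from the example config above.
DEFAULTS = {
    "default_engine": "auto",
    "confidence_threshold": 0.8,
    "output_format": "json",
    "preprocess": {"denoise": True, "enhance_contrast": True},
}


def merge_config(defaults, overrides):
    """Return a new dict with overrides applied on top of defaults, merging nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections such as "preprocess" key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    user_settings = {"default_engine": "paddle", "preprocess": {"denoise": False}}
    print(merge_config(DEFAULTS, user_settings))
```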
Process multiple images:
python scripts/main.py --images ./images/*.png --output ./results
Use as a Python library:
from super_ocr import OCRProcessor
processor = OCRProcessor(engine='auto')
result = processor.extract('image.png')
print(result.text)
print(result.confidence)
| Engine | Init Time | Per-Image | Memory | Best For |
|---|---|---|---|---|
| Tesseract | ~200ms | ~50ms | ~100MB | Quick extraction |
| PaddleOCR | ~3s | ~500ms | ~500MB | High accuracy |
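For batch processing, initialize the processor once and reuse it, so the initialization cost (about 3 s for PaddleOCR) is paid only once rather than per image.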
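To see why reuse matters, the approximate figures in the table can be plugged into a simple cost model. `batch_time_ms` is a hypothetical helper, and the numbers are the rough values from the table, not measurements.

```python
def batch_time_ms(n_images, init_ms, per_image_ms, reuse=True):
    """Estimate total time: one init when the processor is reused, else one init per image."""
    inits = 1 if reuse else n_images
    return inits * init_ms + n_images * per_image_ms


if __name__ == "__main__":
    # PaddleOCR, 100 images, using the approximate table values (3 s init, 500 ms per image).
    print(batch_time_ms(100, 3000, 500, reuse=True))   # 53000 ms
    print(batch_time_ms(100, 3000, 500, reuse=False))  # 350000 ms
```

Under these assumptions, reinitializing PaddleOCR for every image makes a 100-image batch take roughly 350 s instead of 53 s.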