Install
openclaw skills install ollama-ocr
Use this skill to recognize text from images with Ollama's local vision/OCR models. It supports the glm-ocr, llava, moondream, and llama3.2-vision models. It is ideal when you need local OCR without relying on cloud APIs: no internet is required, and recognition runs fully offline.
| Model | Best For | Size |
|---|---|---|
| glm-ocr:latest | Chinese text OCR | ~2.2GB |
| llava:7b | General image understanding | ~4.7GB |
| moondream | Lightweight vision model | ~1.5GB |
| llama3.2-vision:latest | Large vision model | ~7GB+ |
Default: http://172.17.0.2:11434 (Docker container to host gateway)
Note: Endpoint is pre-configured for OpenClaw running in Docker accessing host Ollama. Adjust OLLAMA_HOST in ollama_ocr.py if your setup differs.
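If editing the script for every setup is inconvenient, the endpoint could be resolved at runtime instead. The sketch below is an assumption, not the shipped behavior of ollama_ocr.py: `resolve_ollama_host` and `DEFAULT_OLLAMA_HOST` are hypothetical names, and reading an `OLLAMA_HOST` environment variable is one possible convention.

```python
import os

# Default from this document: OpenClaw in Docker reaching Ollama on the host.
DEFAULT_OLLAMA_HOST = "http://172.17.0.2:11434"


def resolve_ollama_host(env=None):
    """Return the Ollama base URL, preferring an OLLAMA_HOST override if set.

    Hypothetical helper: `env` defaults to os.environ and is a parameter only
    so the function can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    host = env.get("OLLAMA_HOST", "").strip() or DEFAULT_OLLAMA_HOST
    # Accept bare "host:port" values by assuming plain HTTP.
    if "://" not in host:
        host = "http://" + host
    # Drop a trailing slash so API paths can be appended safely.
    return host.rstrip("/")
```

With this approach, a setup where Ollama runs on the same machine could export `OLLAMA_HOST=localhost:11434` rather than modifying the file.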
python3 ollama_ocr.py /path/to/image.jpg [model_name]
Examples:
python3 ollama_ocr.py receipt.png glm-ocr:latest
python3 ollama_ocr.py screenshot.jpg llava:7b
from ollama_ocr import ollama_ocr
# Basic OCR with default model (glm-ocr)
result = ollama_ocr('/path/to/image.jpg')
# Specify model
result = ollama_ocr('/path/to/image.jpg', 'glm-ocr:latest')
print(result)
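For readers curious about what such a call involves, here is a minimal sketch of an OCR request against Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` list. It is not the actual source of ollama_ocr.py: the function names, the prompt text, and the timeout are assumptions chosen for illustration.

```python
import base64
import json
import urllib.request

# Default endpoint from this document; adjust for your setup.
OLLAMA_HOST = "http://172.17.0.2:11434"


def build_ocr_payload(image_bytes, model="glm-ocr:latest",
                      prompt="Extract all text from this image."):
    """Build the JSON body for POST /api/generate (prompt text is an assumption)."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for one complete JSON response instead of a stream of chunks.
        "stream": False,
    }


def ocr_image_sketch(image_path, model="glm-ocr:latest",
                     host=OLLAMA_HOST, timeout=120):
    """Send an image file to Ollama and return the generated text."""
    with open(image_path, "rb") as f:
        payload = build_ocr_payload(f.read(), model)
    request = urllib.request.Request(
        host + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With streaming disabled, the text is in the "response" field.
    return body["response"]
```

Keeping payload construction separate from the network call makes the request body easy to inspect or test without a running Ollama instance.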
glm-ocr works best for Chinese text. Make sure the model is pulled before first use (ollama pull glm-ocr:latest).
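Whether a model has already been pulled can be checked before running OCR by querying Ollama's `/api/tags` endpoint, which lists locally available models. The sketch below is illustrative only: `fetch_tags` and `model_is_available` are hypothetical helpers, not part of ollama_ocr.py.

```python
import json
import urllib.request


def fetch_tags(host="http://172.17.0.2:11434", timeout=10):
    """Return the parsed JSON from GET /api/tags (the local model list)."""
    with urllib.request.urlopen(host + "/api/tags", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def model_is_available(tags_response, model):
    """Check whether `model` appears in a parsed /api/tags response."""
    names = {entry.get("name", "") for entry in tags_response.get("models", [])}
    # A name without a tag is treated as ":latest", matching Ollama's default.
    wanted = model if ":" in model else model + ":latest"
    return wanted in names
```

A caller could run `model_is_available(fetch_tags(), "glm-ocr:latest")` and, if it returns False, prompt the user to run `ollama pull glm-ocr:latest`.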