When dealing with text within an image, the system automatically recognizes it as an OCR (Optical Character Recognition) task and applies the corresponding capabilities.
v1.0.0
OCR (Optical Character Recognition) tool using Tesseract for extracting text from images. Use when: (1) processing screenshots, charts, or documents in image...
MIT-0
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description, required binaries (tesseract), install entries (apt/brew/choco/winget), and included scripts all match an OCR tool focused on Chinese/English financial images.
Instruction Scope
SKILL.md and example code only instruct running tesseract, local preprocessing, and local parsing of OCR text (grep/regex). The skill references a local media directory for integration (HOME/.openclaw/media/inbound/) which is consistent with OpenClaw integration and the stated use case.
Install Mechanism
Install metadata points to standard package managers (brew/apt/choco/winget) for installing Tesseract; no arbitrary downloads or extracted archives are used.
Credentials
The skill requires no environment variables, no credentials, and no config paths beyond expecting a local tesseract binary and optional tessdata language files—proportionate for OCR functionality.
Persistence & Privilege
always is false and the skill doesn't request permanent or cross-skill configuration changes. It runs local commands and scripts without elevating privileges or modifying other skills.
Assessment
This skill appears to do exactly what it claims: run Tesseract locally and parse the resulting text. Before installing, ensure you have (or want) a local Tesseract installation and any required language packs (chi_sim) for Chinese OCR. Review and, if desired, run the included test_ocr.py to verify behavior on sample images. Be mindful of privacy: OCR will read any image you feed it (including potentially sensitive content) and produce plain-text output; do not point it at images you don't want converted or stored. If you deploy this in an automated agent, note the agent can invoke local tesseract commands. There are no network exfiltration steps in the provided code, but still avoid giving it images that contain secrets you wouldn't want processed.
Like a lobster shell, security has layers: review code before you run it.
latest
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
Runtime requirements
🔍 Clawdis
Bins: tesseract
Install
Install Tesseract OCR (brew)
Bins: tesseract
SKILL.md
OCR Tool Skill
Use Tesseract OCR to extract text from images, particularly useful for financial charts, announcements, reports, and screenshots containing Chinese and English text.
When to Use
✅ USE this skill when:
- Processing screenshots of financial charts or announcements
- Extracting text from images containing Chinese/English text
- Analyzing "公告全知道" or similar financial announcement images
- Processing images with tabular data or structured information
- Extracting text from charts, reports, or documents in image format
When NOT to Use
❌ DON'T use this skill when:
- Text files are already available (use the read tool)
- PDF files (use other PDF extraction tools)
- Images without text content
- When Tesseract is not installed
Setup
# Verify Tesseract installation
tesseract --version
# Install language packs if needed (for Chinese)
# Windows: Download chi_sim.traineddata from https://github.com/tesseract-ocr/tessdata
# Place in: C:\Program Files\Tesseract-OCR\tessdata\
Basic Usage
Extract Text from Image
# Basic OCR (English); Tesseract appends .txt to the output base name, so this writes output.txt
tesseract image.png output
# Chinese OCR
tesseract image.png stdout -l chi_sim
# Chinese + English OCR
tesseract image.png stdout -l chi_sim+eng
# Specify output formats (writes output.pdf and output.txt)
tesseract image.png output -l chi_sim+eng pdf txt
Common Patterns for Financial Analysis
# Extract text from financial announcement images
tesseract announcement.png stdout -l chi_sim+eng | grep -E "公司|股份|增长|利润"
# Process multiple images
for img in *.png; do
    echo "=== $img ==="
    tesseract "$img" stdout -l chi_sim+eng
done
# Save OCR results (Tesseract adds the .txt extension itself, so this writes financial_analysis.txt)
tesseract financial_chart.png financial_analysis -l chi_sim+eng
Integration with OpenClaw
Example: Process Telegram Image Messages
# When receiving image messages via Telegram
# 1. Image is automatically downloaded to media directory
# 2. Use OCR to extract text
# 3. Analyze extracted content
# Find latest image
latest_img=$(ls -t "$HOME/.openclaw/media/inbound/"*.png | head -1)
# Extract text
tesseract "$latest_img" stdout -l chi_sim+eng
# Analyze for specific patterns (company names, financial data)
tesseract "$latest_img" stdout -l chi_sim+eng | grep -oE "#[^ ]+|【[^】]+】"
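Parsing `ls -t` output can misbehave on unusual file names or on an empty directory. The "find latest image" step could be sketched in Python roughly as follows, assuming the inbound directory shown above. The function name `latest_image` and the extension list are illustrative and not part of OpenClaw.

```python
from pathlib import Path
from typing import Iterable, Optional


def latest_image(directory: Path,
                 extensions: Iterable[str] = (".png", ".jpg", ".jpeg")) -> Optional[Path]:
    """Return the most recently modified image in `directory`, or None if there is none."""
    allowed = {ext.lower() for ext in extensions}
    if not directory.is_dir():
        return None
    # Keep regular files whose suffix matches, case-insensitively.
    candidates = [
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in allowed
    ]
    if not candidates:
        return None
    # Newest by modification time, mirroring `ls -t | head -1`.
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Example (path taken from the shell snippet above):
# latest_image(Path.home() / ".openclaw" / "media" / "inbound")
```

Returning `None` instead of raising lets a caller skip the OCR step cleanly when no image has arrived yet.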
Example: Financial Announcement Analysis
#!/bin/bash
# analyze_financial_image.sh
IMAGE="$1"
OUTPUT="analysis_$(date +%Y%m%d_%H%M%S).txt"
echo "=== OCR Analysis Report ===" > "$OUTPUT"
echo "Image: $IMAGE" >> "$OUTPUT"
echo "Time: $(date)" >> "$OUTPUT"
echo "" >> "$OUTPUT"
# Extract text
echo "=== Extracted Text ===" >> "$OUTPUT"
tesseract "$IMAGE" stdout -l chi_sim+eng >> "$OUTPUT"
echo "" >> "$OUTPUT"
echo "=== Key Information ===" >> "$OUTPUT"
# Extract company names (-P with \p{Han} assumes GNU grep built with PCRE and a UTF-8 locale)
echo "Company Names:" >> "$OUTPUT"
tesseract "$IMAGE" stdout -l chi_sim+eng | grep -oP "[\p{Han}A-Za-z0-9]+(股份|科技|集团)" | sort -u >> "$OUTPUT"
# Extract stock codes
echo "" >> "$OUTPUT"
echo "Stock Codes:" >> "$OUTPUT"
tesseract "$IMAGE" stdout -l chi_sim+eng | grep -oE "[0-9]{6}\.[A-Z]{2,4}" | sort -u >> "$OUTPUT"
# Extract financial metrics
echo "" >> "$OUTPUT"
echo "Financial Metrics:" >> "$OUTPUT"
tesseract "$IMAGE" stdout -l chi_sim+eng | grep -oE "同比增长[0-9.]+%|利润[0-9.]+亿元|增长[0-9.]+%" | sort -u >> "$OUTPUT"
echo "Analysis saved to: $OUTPUT"
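The script above runs Tesseract once per section, which repeats the slowest step. One possible alternative is to run OCR once and apply every pattern to the same text. The sketch below covers only the parsing half and reuses the stock-code and metric patterns from the script. The function name `extract_key_info` and the dictionary keys are hypothetical.

```python
import re
from typing import Dict, List

# Same patterns as the grep calls in the shell script.
STOCK_CODE = re.compile(r"[0-9]{6}\.[A-Z]{2,4}")
METRIC = re.compile(r"同比增长[0-9.]+%|利润[0-9.]+亿元|增长[0-9.]+%")


def extract_key_info(text: str) -> Dict[str, List[str]]:
    """Pull stock codes and financial metrics out of OCR text in a single pass."""
    return {
        # sorted(set(...)) plays the role of `sort -u`.
        "stock_codes": sorted(set(STOCK_CODE.findall(text))),
        "metrics": sorted(set(METRIC.findall(text))),
    }
```

The input would be the text captured from a single `tesseract ... stdout` call, so each image is recognized once regardless of how many patterns are added later.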
Advanced Usage
Multiple Language Support
# Chinese Simplified
tesseract image.png stdout -l chi_sim
# Chinese Traditional
tesseract image.png stdout -l chi_tra
# Japanese
tesseract image.png stdout -l jpn
# Korean
tesseract image.png stdout -l kor
# Multiple languages
tesseract image.png stdout -l chi_sim+eng+jpn
Image Preprocessing (Improve Accuracy)
# Convert to grayscale (using ImageMagick)
convert image.png -grayscale Rec709Luma grayscale.png
tesseract grayscale.png stdout -l chi_sim+eng
# Increase contrast
convert image.png -contrast -contrast enhanced.png
tesseract enhanced.png stdout -l chi_sim+eng
# Remove noise
convert image.png -despeckle denoised.png
tesseract denoised.png stdout -l chi_sim+eng
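The three preprocessing steps above can also be chained in one ImageMagick call rather than three intermediate files. The following is a sketch only: it assumes the `convert` binary from the examples is on PATH (ImageMagick 7 installs may expose it as `magick` instead), and the step names and helper functions are made up for illustration.

```python
import subprocess
from typing import List, Sequence

# Step name -> ImageMagick options, copied from the shell examples above.
STEPS = {
    "grayscale": ["-grayscale", "Rec709Luma"],
    "contrast": ["-contrast", "-contrast"],
    "denoise": ["-despeckle"],
}


def build_preprocess_command(src: str, dst: str,
                             steps: Sequence[str] = ("grayscale", "contrast", "denoise")) -> List[str]:
    """Build a single `convert` command that applies the chosen steps in order."""
    cmd = ["convert", src]
    for step in steps:
        if step not in STEPS:
            raise ValueError(f"unknown preprocessing step: {step}")
        cmd.extend(STEPS[step])
    cmd.append(dst)
    return cmd


def preprocess(src: str, dst: str,
               steps: Sequence[str] = ("grayscale", "contrast", "denoise")) -> None:
    """Run the command; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(build_preprocess_command(src, dst, steps), check=True)
```

Whether all three steps help depends on the image, so it is worth comparing OCR output with and without each one.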
Batch Processing
# Process all PNG images in directory
for img in *.png; do
    base=$(basename "$img" .png)
    # Pass the output base without an extension; Tesseract appends .txt itself
    tesseract "$img" "output_${base}" -l chi_sim+eng
    echo "Processed: $img -> output_${base}.txt"
done
# Process with GNU parallel (if available); {.} is the input path without its extension
find . -name "*.png" -print0 | parallel -0 tesseract {} {.} -l chi_sim+eng
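Where GNU parallel is not installed, a thread pool gives a similar effect from Python. This is a minimal sketch, not the skill's own implementation: `run_tesseract` and `ocr_batch` are hypothetical names, and the OCR function is passed in so that it can be swapped out or stubbed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable


def run_tesseract(image: Path, lang: str = "chi_sim+eng") -> str:
    """Run the tesseract CLI on one image and return the recognized text."""
    result = subprocess.run(
        ["tesseract", str(image), "stdout", "-l", lang],
        capture_output=True, text=True, encoding="utf-8", check=True,
    )
    return result.stdout


def ocr_batch(images: Iterable[Path],
              ocr: Callable[[Path], str] = run_tesseract,
              max_workers: int = 4) -> Dict[Path, str]:
    """OCR several images concurrently and map each path to its text."""
    images = list(images)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs paths with their text.
        texts = list(pool.map(ocr, images))
    return dict(zip(images, texts))
```

Threads are enough here because each worker spends its time waiting on a separate `tesseract` process; a call such as `ocr_batch(Path(".").glob("*.png"))` would cover the same files as the loop above.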
Common Use Cases
1. Financial Announcements ("公告全知道")
# Extract key information from financial announcements
tesseract announcement.png stdout -l chi_sim+eng | \
grep -A2 -B2 -E "公司|股份|增长|利润|合同|中标|收购"
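# Find company mentions (grep -E does not understand \uXXXX escapes; -P with \p{Han} assumes GNU grep with PCRE and a UTF-8 locale)
tesseract announcement.png stdout -l chi_sim+eng | \
grep -oP "#[^ ]+|【[^】]+】|[A-Za-z0-9\p{Han}]+股份"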
2. Stock Charts and Tables
# Extract stock data from charts
tesseract stock_chart.png stdout -l eng | \
grep -E "[0-9]+\.[0-9]+|[0-9]+%"
# Process tabular data
tesseract table.png stdout -l chi_sim+eng | \
awk 'BEGIN {FS="[[:space:]]{2,}"} {for(i=1;i<=NF;i++) printf "|%s", $i; print "|"}'
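The awk one-liner treats runs of two or more whitespace characters as column separators. A rough Python equivalent is sketched below; unlike the awk version it strips leading and trailing whitespace and skips blank lines, and the function names are illustrative.

```python
import re
from typing import List


def split_columns(line: str) -> List[str]:
    """Split one OCR line into cells on runs of two or more whitespace characters."""
    return re.split(r"\s{2,}", line.strip())


def to_pipe_table(text: str) -> str:
    """Render OCR text as pipe-delimited rows, one per non-blank line."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no cells
        rows.append("|" + "|".join(split_columns(line)) + "|")
    return "\n".join(rows)
```

Column detection by whitespace is fragile when OCR collapses spacing, so the result is best treated as a first draft of the table.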
3. Document Screenshots
# Extract structured document content
tesseract document.png stdout -l chi_sim+eng | \
sed -n '/^[0-9]\+\./p' # Extract numbered items
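# Extract headings (a bare "(" is an unmatched group in ERE, so it is escaped; the full-width "（" is matched as well)
tesseract document.png stdout -l chi_sim+eng | \
grep -E "^#|^【|^\(|^（"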
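The same two filters can be applied to OCR text that is already in memory, which avoids running Tesseract twice on one document. A small sketch, with hypothetical function names, that treats lines starting with `#`, `【`, or an opening parenthesis (ASCII or full-width) as headings:

```python
import re
from typing import List

NUMBERED = re.compile(r"^[0-9]+\.")
HEADING = re.compile(r"^(?:#|【|\(|（)")


def numbered_items(text: str) -> List[str]:
    """Lines that begin with a number followed by a dot, e.g. '1. ...'."""
    return [line for line in text.splitlines() if NUMBERED.match(line)]


def headings(text: str) -> List[str]:
    """Lines that begin with '#', '【', '(' or '（'."""
    return [line for line in text.splitlines() if HEADING.match(line)]
```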
Troubleshooting
Common Issues
- Poor OCR accuracy
  - Preprocess images (grayscale, contrast enhancement)
  - Use appropriate language packs
  - Ensure image resolution is sufficient (300 DPI recommended)
- Missing Chinese characters
  - Verify chi_sim.traineddata is installed
  - Use -l chi_sim+eng for mixed content
  - Check image quality and font clarity
- Tesseract not found
  - Install Tesseract via package manager
  - Add Tesseract to PATH environment variable
  - Verify installation with tesseract --version
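The "not found" and "missing Chinese characters" checks above can be automated. The sketch below relies on `tesseract --list-langs`, which prints a header line followed by one language code per line; reading stderr as a fallback is a guess at how older builds behave, and all function names are illustrative.

```python
import shutil
import subprocess
from typing import List


def parse_list_langs(output: str) -> List[str]:
    """Parse `tesseract --list-langs` output into a list of language codes."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of available languages"):
            continue
        langs.append(line)
    return langs


def installed_languages() -> List[str]:
    """Return installed language codes, or an empty list if tesseract is not on PATH."""
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(["tesseract", "--list-langs"],
                            capture_output=True, text=True)
    # Assumption: some older builds may print the list on stderr instead.
    return parse_list_langs(result.stdout or result.stderr)


def missing_languages(required: str, available: List[str]) -> List[str]:
    """Which parts of a '-l' value such as 'chi_sim+eng' are not installed."""
    return [lang for lang in required.split("+") if lang not in available]
```

Calling `missing_languages("chi_sim+eng", installed_languages())` before a batch run gives an early, readable failure instead of garbled output.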
Improving Accuracy
# Use custom configuration
tesseract image.png stdout -l chi_sim+eng --psm 6 # Assume uniform block of text
tesseract image.png stdout -l chi_sim+eng --psm 11 # Sparse text
# PSM modes:
# 3 = Fully automatic page segmentation, but no OSD (default)
# 6 = Assume a single uniform block of text
# 11 = Sparse text. Find as much text as possible in no particular order
# 12 = Sparse text with OSD
# Use OEM (OCR Engine Mode)
tesseract image.png stdout -l chi_sim+eng --oem 1 # LSTM only
tesseract image.png stdout -l chi_sim+eng --oem 2 # Legacy + LSTM (needs traineddata that includes the legacy model)
tesseract image.png stdout -l chi_sim+eng --oem 3 # Default, based on what is available
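When these flags are set from a script, a small builder can reject out-of-range values before Tesseract is invoked. This sketch assumes the documented ranges of 0 to 13 for `--psm` and 0 to 3 for `--oem`; `build_command` is a hypothetical helper, not part of the skill.

```python
from typing import List, Optional


def build_command(image: str, lang: str = "chi_sim+eng",
                  psm: Optional[int] = None, oem: Optional[int] = None,
                  output: str = "stdout") -> List[str]:
    """Assemble a tesseract argument list, validating --psm and --oem."""
    cmd = ["tesseract", image, output, "-l", lang]
    if psm is not None:
        if not 0 <= psm <= 13:
            raise ValueError(f"psm must be between 0 and 13, got {psm}")
        cmd += ["--psm", str(psm)]
    if oem is not None:
        if not 0 <= oem <= 3:
            raise ValueError(f"oem must be between 0 and 3, got {oem}")
        cmd += ["--oem", str(oem)]
    return cmd
```

The returned list can be handed directly to `subprocess.run`, which avoids shell quoting issues with file names.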
Performance Tips
- For batch processing, consider parallel execution
- Cache OCR results for repeated analysis
- Preprocess images to improve speed and accuracy
- Use appropriate PSM mode for image type
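The caching tip can be sketched as a content-addressed text cache: the key is a hash of the image bytes plus the language string, so a renamed but identical image still hits the cache. The cache layout and the names `file_digest` and `cached_ocr` are assumptions made for illustration, and the OCR function is injected by the caller.

```python
import hashlib
from pathlib import Path
from typing import Callable


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_ocr(image: Path, ocr: Callable[[Path, str], str],
               cache_dir: Path, lang: str = "chi_sim+eng") -> str:
    """Return cached text for this image and language, running `ocr` only on a miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{file_digest(image)}_{lang.replace('+', '-')}.txt"
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = ocr(image, lang)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

Including the language in the key matters because the same image recognized with `eng` and with `chi_sim+eng` can produce different text.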
Integration Examples
With Python Scripts
import subprocess
import re

def ocr_image(image_path, lang='chi_sim+eng'):
    """Extract text from image using Tesseract"""
    result = subprocess.run(
        ['tesseract', image_path, 'stdout', '-l', lang],
        capture_output=True,
        text=True,
        encoding='utf-8'
    )
    return result.stdout

# Example usage
text = ocr_image('announcement.png')
companies = re.findall(r'#(\S+)', text)
print(f"Found companies: {companies}")
With Shell Scripts
#!/bin/bash
# analyze_financial_images.sh
analyze_image() {
    local img="$1"
    local text
    echo "Analyzing: $img"
    # Extract text (run OCR once and reuse the result)
    text=$(tesseract "$img" stdout -l chi_sim+eng)
    # Extract key information
    echo "=== Summary ==="
    echo "Companies: $(echo "$text" | grep -oE '#[^ ]+' | tr '\n' ' ')"
    echo "Stock Codes: $(echo "$text" | grep -oE '[0-9]{6}\.[A-Z]{2,4}' | tr '\n' ' ')"
    echo "Financial Terms: $(echo "$text" | grep -oE '同比增长|利润|增长|合同' | sort -u | tr '\n' ' ')"
}
# Process all images
for img in "$@"; do
    analyze_image "$img"
    echo ""
done
Notes
- Tesseract works best with clean, high-contrast images
- Chinese OCR requires chi_sim/chi_tra language data files
- For financial charts with small text, ensure image resolution is sufficient
- Consider image preprocessing for better results with screenshots
- Always verify OCR results, especially for critical financial data
