⚠️ Privacy Warning
IMPORTANT - READ BEFORE INSTALLING:
This skill uploads your files to Mistral's cloud servers for OCR processing.
Do NOT use with sensitive or confidential documents unless:
- You trust Mistral's data handling policies
- You have reviewed Mistral's privacy policy
- You accept that file contents will be transmitted and processed remotely
For sensitive documents, use offline/local OCR tools instead.
Mistral OCR Skill
An OCR tool that converts PDF files and images into Markdown, JSON, or HTML using Mistral's OCR API.
Installation
# Clone or download this repository
git clone https://github.com/YZDame/Mistral-OCR-SKILL.git
cd Mistral-OCR-SKILL
# Install dependencies
pip install -r requirements.txt
🔑 API Key Setup (Required)
Get your API key:
👉 https://console.mistral.ai/home
Set the environment variable:
export MISTRAL_API_KEY=your_api_key
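To confirm the variable is visible to Python before running the tool, a small check like the following can help. This is a minimal sketch, not code from this repository: the helper name `get_api_key` is hypothetical, and the only thing taken from the setup step above is the variable name `MISTRAL_API_KEY`.

```python
import os


def get_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    This helper is illustrative and is not part of mistral_ocr.py.
    """
    env = os.environ if env is None else env
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MISTRAL_API_KEY is not set; export it before running the script."
        )
    return key


if __name__ == "__main__":
    try:
        get_api_key()
        print("MISTRAL_API_KEY is set.")
    except RuntimeError as err:
        print(err)
```

Raising early with a specific message tends to be easier to diagnose than an authentication failure returned later by the API.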
CLI Usage
cd scripts
# Process PDF to Markdown
python3 mistral_ocr.py -i input.pdf
# Process PDF to JSON
python3 mistral_ocr.py -i input.pdf -f json
# Specify output directory
python3 mistral_ocr.py -i input.pdf -o ~/my_ocr_results
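To process several PDFs in one go, the commands above can be generated from Python. The sketch below only assembles argument lists from the documented `-i`, `-f`, and `-o` flags; the helper names `build_command` and `build_batch` are hypothetical and not part of this repository.

```python
from pathlib import Path


def build_command(input_path, fmt="markdown", output_dir=None):
    """Assemble one CLI invocation using only the documented flags."""
    cmd = ["python3", "mistral_ocr.py", "-i", str(input_path), "-f", fmt]
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]
    return cmd


def build_batch(folder, fmt="markdown", output_dir=None):
    """Return one command per *.pdf file in `folder`, in sorted order."""
    pdfs = sorted(Path(folder).glob("*.pdf"))
    return [build_command(pdf, fmt, output_dir) for pdf in pdfs]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is uploaded.
    for cmd in build_batch(".", fmt="json"):
        print(" ".join(cmd))
```

Each list can then be passed to `subprocess.run(cmd, check=True)` from inside the `scripts` directory. Keep in mind that every file in the batch is uploaded to Mistral's servers.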
Arguments
| Flag | Description |
|---|---|
| -i, --input | Input file path (required) |
| -f, --format | Output format: markdown/json/html (default: markdown) |
| -o, --output | Output directory |
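For readers who want to see how flags like these are typically declared, here is a minimal `argparse` sketch that mirrors the table. It is an assumption about how such a parser could look, not the actual contents of `mistral_ocr.py`; the function name `build_parser` and the `None` default for `--output` are hypothetical.

```python
import argparse


def build_parser():
    """Declare the three documented flags (illustrative, not the real script)."""
    parser = argparse.ArgumentParser(prog="mistral_ocr.py")
    parser.add_argument("-i", "--input", required=True,
                        help="Input file path (required)")
    parser.add_argument("-f", "--format",
                        choices=["markdown", "json", "html"],
                        default="markdown",
                        help="Output format (default: markdown)")
    # The real default output directory is not documented; None is assumed here.
    parser.add_argument("-o", "--output", default=None,
                        help="Output directory")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-i", "input.pdf", "-f", "json"])
    print(args.input, args.format, args.output)  # input.pdf json None
```

Using `choices` makes the parser reject an unsupported format before any file is uploaded.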
Data Privacy
What happens to your files:
- Files are uploaded to Mistral's OCR API
- Files are processed on Mistral servers
- Processing results are returned to you
- Files are not stored on Mistral servers (per Mistral policy)
For more details, see: https://mistral.ai/privacy-policy
License
MIT