Mistral PDF OCR
Pass
Audited by VirusTotal on May 12, 2026.
Overview
Type: OpenClaw Skill
Name: extracting-mistral-ocr
Version: 1.0.0
The OpenClaw skill 'extracting-mistral-ocr' is benign. Its `SKILL.md` clearly defines its purpose (OCR via Mistral API) and requests `Read,Write,Bash(python:*)` permissions, which are necessary for its function. The Python script `scripts/mistral_ocr_extract.py` correctly handles input files/URLs by uploading them to or referencing them with the Mistral API, and writes outputs to a specified directory. There is no evidence of prompt injection, data exfiltration, unauthorized network calls, persistence mechanisms, or other malicious intent. Input sanitization for file paths and API parameters appears robust, relying on the Mistral API for processing and returning safe identifiers.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Private or sensitive PDFs/images may be uploaded to Mistral for processing.
Local documents are intentionally sent to the external Mistral API for OCR. This is central to the skill, but users should understand that document contents leave the local environment.
If the PDF is local and not publicly accessible, upload it (the script does this automatically).
Use this only for documents you are allowed to send to Mistral, consider page selection for large or sensitive files, and review Mistral retention/privacy terms.
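Page selection can be decided before anything is uploaded. The following is a minimal sketch, not part of the audited script: `parse_page_selection` is a hypothetical helper that turns a 1-based page spec into zero-based indices. Whether the skill's script or the Mistral OCR endpoint expects zero-based indices is an assumption here and should be confirmed against the API reference.

```python
def parse_page_selection(spec: str) -> list[int]:
    """Convert a 1-based spec such as "1-3,5" into sorted zero-based indices.

    Hypothetical helper for illustration; zero-based output is an assumption
    about what the OCR request expects.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Shift from 1-based (human) to 0-based (index) numbering.
        pages.update(range(start - 1, end))
    return sorted(pages)
```

Sending only the pages that need OCR reduces both the amount of document content that leaves the local environment and the billed volume.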
Anyone running the skill needs a valid Mistral API key, and OCR requests may be billed to that account.
The script uses a Mistral API key from the environment. This is expected for the Mistral OCR service, but it grants access to the user's Mistral account and may incur usage costs.
api_key = os.getenv("MISTRAL_API_KEY")
Store the API key securely, use the least-privileged key available, and avoid exposing environment variables in logs or shared shells.
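The advice about logs can be illustrated with a short sketch. The names `load_api_key` and `redact` are hypothetical and do not come from the skill's script. The idea is to fail early when the key is missing and to log only a masked form of it.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Read MISTRAL_API_KEY from the given mapping and fail fast if absent."""
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("MISTRAL_API_KEY is not set")
    return key


def redact(secret: str, visible: int = 4) -> str:
    """Mask a secret for logging, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

A log line would then use `redact(load_api_key())` rather than the key itself.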
The agent can run the OCR Python script and create output files in the chosen directory.
The skill permits running Python commands and writing output files. This is proportionate for a bundled OCR script, but it is still local command execution.
allowed-tools: "Read,Write,Bash(python:*)"
Run it only on intended input files and direct outputs to a safe, expected folder.
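One way to keep outputs in an expected folder is to resolve the requested path and reject anything outside an allowed root. This is a sketch under stated assumptions: `resolve_output_dir` and the notion of an allowed root are hypothetical and are not features of the skill.

```python
from pathlib import Path


def resolve_output_dir(requested: str, allowed_root: str) -> Path:
    """Return the resolved output path, or raise if it escapes allowed_root."""
    root = Path(allowed_root).resolve()
    # An absolute `requested` replaces `root` here, so it is checked as-is.
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        raise ValueError(f"output path {target} is outside {root}")
    return target
```

Resolving before comparing means that `..` segments and symlinks are taken into account, which a plain string prefix check would miss.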
A future SDK version could behave differently from the version the skill author tested.
The skill depends on the external mistralai package using a lower-bound version rather than an exact pinned version. This is common for SDK integrations but can change behavior as new package versions are installed.
mistralai>=1.0.0
Install dependencies from trusted package sources and consider pinning a known-good mistralai version in controlled environments.
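In a controlled environment, a small check can flag requirement lines that are not exact pins. The sketch below is illustrative only: `unpinned_requirements` is a hypothetical helper, it handles simple `name==version` lines, and it does not cover the full requirements file syntax.

```python
import re

# Matches "name==version" with optional extras and environment marker.
# Wildcards such as "==1.*" are deliberately not treated as exact pins.
PIN_PATTERN = re.compile(
    r"^[A-Za-z0-9_.\-]+(\[[^\]]*\])?\s*==\s*[^\s;#*]+\s*(;.*)?$"
)


def unpinned_requirements(lines):
    """Return requirement lines that are not pinned to one exact version."""
    loose = []
    for line in lines:
        stripped = line.split("#", 1)[0].strip()  # drop comments
        if not stripped:
            continue
        if not PIN_PATTERN.match(stripped):
            loose.append(stripped)
    return loose
```

Applied to the skill's declared dependency, `unpinned_requirements(["mistralai>=1.0.0"])` reports that line as loose.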
OCR outputs may create persistent local copies of sensitive document content.
The skill stores the full OCR response locally, which may include extracted text, tables, annotations, and image data from the source document.
raw_response.json (full OCR response)
Store the output directory securely and delete OCR artifacts when they are no longer needed.
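Cleanup can be scripted so that it only removes directories that look like OCR output. The sketch below assumes the output directory contains `raw_response.json`, as listed above; `remove_ocr_output` is a hypothetical helper and not part of the skill.

```python
import shutil
from pathlib import Path


def remove_ocr_output(output_dir: str, marker: str = "raw_response.json") -> bool:
    """Delete output_dir only if it contains the expected marker file.

    Returns True when the directory was removed, False when it was left alone.
    """
    path = Path(output_dir)
    if not (path / marker).is_file():
        return False  # not recognizably an OCR output directory
    shutil.rmtree(path)
    return True
```

Note that `shutil.rmtree` unlinks files but does not overwrite them, so this is ordinary deletion rather than secure erasure.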
