PaddleOCR Text Recognition
Pass. Audited by ClawScan on May 1, 2026.
Overview
This OCR skill appears coherent and purpose-built, but users should know it sends OCR inputs to a configured remote API and saves raw OCR results locally by default.
This skill is reasonable to install if you trust the configured PaddleOCR API endpoint and are comfortable sending selected images or PDFs there for OCR. Keep the access token private, avoid submitting highly sensitive documents unless appropriate, and use --stdout or remove the temp result file if you do not want OCR output left on disk.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Running the skill may cause uv to fetch the httpx dependency needed for API requests.
The script declares an inline dependency that uv will resolve when run. This is normal for the documented workflow, but users are relying on the package source and version resolution.
# dependencies = [
#   "httpx>=0.24.0",
# ]
Use uv from a trusted installation and run the skill in an environment where package installation from the configured Python package index is acceptable.
The skill can make OCR API calls using the configured PaddleOCR token.
The skill uses the configured PaddleOCR access token to authenticate API requests, which is expected for the OCR service integration.
"Authorization": f"token {token}"Store the token securely, use the least-privileged or service-specific token available, and rotate it if it is exposed.
Images or PDFs submitted for OCR may leave the local machine and be processed by the configured remote OCR service.
For local files, the script base64-encodes the user-provided file and posts it to the configured OCR API endpoint. This is core to the OCR purpose and is disclosed by the skill's internet/API requirements.
params = {"file": _load_file_as_base64(fp)} ... resp = client.post(api_url, json=params, headers=headers)
Only OCR files you are comfortable sending to the configured PaddleOCR endpoint, especially if they contain private, financial, medical, or credential information.
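To make the data flow concrete, the following sketch shows how a local file might be turned into the request body. It is not the skill's actual code: load_file_as_base64 is an assumed stand-in for the script's _load_file_as_base64, build_payload is a hypothetical helper, and the HTTP call itself is left out.

```python
import base64
from pathlib import Path


def load_file_as_base64(path: Path) -> str:
    """Read the whole file and return its contents as base64 text."""
    return base64.b64encode(path.read_bytes()).decode("ascii")


def build_payload(path: Path) -> dict[str, str]:
    """Build a JSON body shaped like the one in the excerpt above.

    The complete file content is embedded, so everything in the file
    reaches the remote endpoint once the payload is posted.
    """
    return {"file": load_file_as_base64(path)}
```

Because base64 is an encoding and not encryption, the receiving service can recover the original bytes exactly. Confidentiality in transit depends on the endpoint's transport security.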
Extracted text and raw OCR results may remain on disk after the task completes.
The skill intentionally persists raw OCR output by default. This is disclosed and useful for downstream parsing, but the saved JSON may contain sensitive recognized text and provider response data.
Default behavior: save raw JSON to a temp file ... <system-temp>/paddleocr/text-recognition/results/result_<timestamp>_<id>.json
Use --stdout for sensitive one-off OCR jobs or delete the saved temp JSON file when it is no longer needed.
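The cleanup step could be scripted along these lines. This is a sketch under stated assumptions: it treats <system-temp> as the directory returned by Python's tempfile.gettempdir(), which may not match how the skill resolves it, and remove_saved_results is a hypothetical helper name.

```python
import tempfile
from pathlib import Path
from typing import Optional


def default_results_dir() -> Path:
    """Results directory from the excerpt, assuming the system temp dir."""
    return Path(tempfile.gettempdir()) / "paddleocr" / "text-recognition" / "results"


def remove_saved_results(results_dir: Optional[Path] = None) -> list[Path]:
    """Delete saved result_*.json files and return the paths removed."""
    target = results_dir if results_dir is not None else default_results_dir()
    removed: list[Path] = []
    if not target.is_dir():
        return removed
    for path in sorted(target.glob("result_*.json")):
        path.unlink()
        removed.append(path)
    return removed
```

Only files matching the result_*.json pattern are deleted; anything else in the directory is left alone.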
