PDF OCR Using Gemini LLM
Security checks across static analysis, malware telemetry, and agentic risk
Overview
This skill appears to do what it says (OCR PDFs with Gemini), but to do so it uploads PDF page files to Google and makes the requests with your Google API key.
Install only if you are comfortable sending the target PDFs to Google Gemini and using your Google API key for the requests. Avoid highly sensitive documents unless this matches your privacy requirements, and consider pinning dependencies in the virtual environment.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Any text or images in the PDF may be processed by Google's service, which may matter for private, confidential, or regulated documents.
The skill clearly discloses that PDF page contents leave the local environment and are sent to Google for OCR.
**Full page images/files are sent to Google's API.** PDFs are split into single-page files and each page is uploaded to Google Gemini for OCR.
Use this only for documents you are comfortable sending to Google, and review Google/API data handling terms for sensitive material.
OCR requests made by the skill count against the quota or billing of your Google API account.
The code uses the provided Google API key to create a Gemini client, which is expected for this OCR integration.
self._client = genai.Client(api_key=self._api_key)
Use a restricted API key where possible, keep it out of logs and shared shells, and rotate it if exposed.
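As a minimal sketch of the advice above, the key can be read from the environment rather than hard-coded, and masked before it appears in any log line. The variable name `GOOGLE_API_KEY` and the helpers `load_api_key` and `mask_key` are illustrative assumptions, not part of the skill's code.

```python
import os


def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Read the key from the environment and fail loudly if it is missing.

    The default variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Return a log-safe form that shows only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging `mask_key(key)` instead of the key itself keeps the secret out of terminal scrollback and shared log files while still letting you tell two keys apart.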
Installing later dependency versions could introduce compatibility or supply-chain risk.
Dependencies are installed by package name without version pins; this is common but means future dependency changes could affect behavior.
install:
  - kind: uv
    package: google-genai
  - kind: uv
    package: pymupdf
  - kind: uv
    package: pydantic
Install in an isolated virtual environment and consider pinning dependency versions before production or sensitive use.
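One possible way to pin is to install the three packages once in a clean virtual environment, then record the exact versions that were resolved. The sketch below uses the standard library's `importlib.metadata` for that; the helper name `pinned_requirements` is hypothetical, and the version numbers it prints depend entirely on what is installed in your environment.

```python
from importlib import metadata
from typing import Callable, Iterable, List


def pinned_requirements(
    packages: Iterable[str],
    lookup: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Return 'name==version' lines for packages installed in this environment."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={lookup(name)}")
        except metadata.PackageNotFoundError:
            # Leave a visible marker instead of guessing a version.
            lines.append(f"# {name}: not installed in this environment")
    return lines


if __name__ == "__main__":
    # Package names are taken from the skill's install spec.
    print("\n".join(pinned_requirements(["google-genai", "pymupdf", "pydantic"])))
```

The printed lines can be saved to a requirements file and reused for later installs, so a future upstream release does not silently change what runs.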
