Pdf2word Skills

Security checks across malware telemetry and agentic risk

Overview

This PDF-to-Word skill is generally coherent and purpose-aligned, but users should note that it downloads and runs an external OCR binary and, if configured, can send documents to Gemini.

This skill looks purpose-aligned for converting scanned PDFs to Word files. Before installing, be comfortable running a downloaded OCR binary from GitHub, and use the local OCR engine for private documents unless you intentionally want Gemini to process the file.

VirusTotal

VirusTotal findings are pending for this skill version.


Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

ASI04: Agentic Supply Chain Vulnerabilities
Low
What this means

Installing the skill requires trusting the external OCR binary downloaded from GitHub.

Why it was flagged

The installer downloads a platform-specific OCR executable from a GitHub release and marks it executable, but the artifact does not show checksum or signature verification.

Skill content
DOWNLOAD_URL="https://github.com/scottkiss/doc-ocr/releases/download/$VERSION/$FILENAME" ... curl -L -o "$TARGET_FILE" "$DOWNLOAD_URL" ... chmod +x "$TARGET_FILE"
Recommendation

Only run the installer if you trust the doc-ocr release source; consider verifying the release checksum or building the OCR tool from a trusted source.
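The checksum step in the recommendation above could look roughly like the following. This is a minimal sketch, not part of the skill: the helper names `sha256_of_file` and `verify_download` are hypothetical, and it assumes you have obtained an expected SHA-256 value through a channel you trust (the scanned artifact does not show that the doc-ocr release publishes one).

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> bool:
    """Compare the file's digest with a checksum obtained out of band.

    expected_sha256 is an assumption: a hex string you trust,
    not something the installer is known to provide.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A cautious workflow would run `verify_download` on the downloaded file before marking it executable, and delete the file if the check returns `False`.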

ASI03: Identity and Privilege Abuse
Low
What this means

If Gemini mode is used, the OCR engine can use the configured API key to access the user’s Gemini account for OCR requests.

Why it was flagged

The optional Gemini workflow asks the user to store an API key in a local OCR config file, while registry metadata declares no primary credential or required environment variables.

Skill content
echo "gemini_api_key=your_gemini_key" > ~/.ocr/config
Recommendation

Use a minimally scoped API key if possible, keep the config file private, and avoid enabling Gemini mode unless remote OCR is intended.
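One way to keep the config file private is to create it with owner-only permissions instead of relying on the shell's default umask. The sketch below is illustrative only: `write_private_config` is a hypothetical helper, and the only detail taken from the skill is the `gemini_api_key=...` line format. The permission bits apply on POSIX systems; Windows handles file modes differently.

```python
import os
from pathlib import Path


def write_private_config(config_path: Path, api_key: str) -> None:
    """Write the key to a file that only the current user can read or write."""
    config_path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # Creating the file with mode 0o600 avoids a window in which it is
    # readable by other users.
    fd = os.open(config_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(f"gemini_api_key={api_key}\n")
    # Tighten permissions in case the file already existed with a looser mode.
    os.chmod(config_path, 0o600)
```

For example, `write_private_config(Path.home() / ".ocr" / "config", key)` would target the same path as the skill's `echo` command.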

ASI07: Insecure Inter-Agent Communication
Low
What this means

Sensitive PDF contents could be sent to a remote provider if the user chooses Gemini OCR.

Why it was flagged

The skill discloses an optional remote OCR provider path. When selected, document content may be processed by Gemini rather than only locally.

Skill content
The underlying `docr` tool also supports other engines like the Google Gemini API ... python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini
Recommendation

Use the default local RapidOCR engine for confidential documents, or review Gemini’s data handling terms before using remote OCR.
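To make the local-by-default recommendation harder to bypass by accident, a caller could wrap the conversion command so that Gemini is only selected after an explicit opt-in. This is a sketch under stated assumptions: `build_command` and its parameters are hypothetical, and the only details taken from the skill are the `scripts/pdf2word.py` path and the `-engine gemini` arguments. Omitting the flag is assumed to select the default local engine.

```python
from typing import List


def build_command(
    pdf: str,
    docx: str,
    use_gemini: bool = False,
    confirmed_remote: bool = False,
) -> List[str]:
    """Build the pdf2word invocation, refusing remote OCR without consent."""
    cmd = ["python", "scripts/pdf2word.py", pdf, docx]
    if use_gemini:
        if not confirmed_remote:
            raise PermissionError(
                "Gemini OCR sends document content to a remote provider; "
                "pass confirmed_remote=True to opt in."
            )
        cmd += ["-engine", "gemini"]
    # Without -engine, the tool is assumed to fall back to its local default.
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; keeping command construction separate from execution makes the opt-in rule easy to test.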