PDF Text Extractor
Analysis
The skill does not show exfiltration, persistence, or destructive behavior; it mainly reads user-selected PDFs, but its dependency and OCR claims are inconsistent and extracted document text should be treated as sensitive.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.
const fileData = fs.readFileSync(pdfPath);
The skill reads the file path supplied by the caller. This is expected for a PDF extractor, but it means any selected PDF's contents can be brought into the agent output.
"dependencies": { "pdfjs-dist": "^3.11.174" }The package declares an npm dependency even though the skill description says zero dependencies and the registry lists no install spec. The dependency is aligned with PDF parsing, but the dependency footprint is not consistently disclosed.
Checks for exposed credentials, poisoned memory or context, unclear communication boundaries, or sensitive data that could leave the user's control.
- Prepare content for LLM processing
The documented workflow explicitly sends extracted document text toward LLM analysis. PDF text is untrusted user/document content and may contain sensitive information or prompt-like instructions.
