Skill v0.1.0
ClawScan security
PR's PDF Agent · ClawHub's context-aware review of the artifact, metadata, and declared behavior.
Scanner verdict
Suspicious · Mar 5, 2026, 4:54 AM
- Verdict
- suspicious
- Confidence
- medium
- Model
- gpt-5-mini
- Summary
- The package generally matches a self-hosted PDF tool, but there are several coherence issues and surprising capabilities (remote HTML fetch, arbitrary external LLM/command invocation, undeclared env/binary expectations) that you should review before installing or running it on sensitive data.
- Guidance
- This package implements a comprehensive self-hosted PDF CLI (merging, splitting, OCR, conversions, redaction, an 'agent' mode) and the code mostly matches that purpose, but pay attention to these issues before running it:
  - Dependency mismatch: The registry lists no required binaries/env, but SKILL.md and the code expect many external tools (gs, qpdf, pdftoppm, soffice, ocrmypdf, wkhtmltopdf/Chrome, and optionally ollama). Ensure those are installed intentionally.
  - Network and external execution: html_to_pdf can fetch remote URLs; core.llm can run arbitrary commands or call 'ollama' (a local LLM runner). Running the CLI with remote sources or LLM provider=command may cause the tool to access the network or execute untrusted commands. Treat any use that passes URLs or enables an external LLM/command as potentially exfiltrative.
  - Undeclared env usage: The code reads PDFAGENT_SOFFICE_TIMEOUT (and the subprocess code supports a custom env). Review environment variables and avoid exposing secrets to the runtime environment you use for this tool.
  - Run in isolation first: Test the tool in a sandbox / disposable VM, with non-sensitive PDFs, and confirm behavior (the doctor command reports available binaries). Inspect CLI flags (especially anything enabling LLM/agent mode or remote fetching) before using it on private data.
  - Origin and trust: The source 'homepage' and origin are unknown. If you need to run this in production or on sensitive documents, consider auditing the remaining omitted files, or prefer a vetted implementation from a known source.
  If you want, I can: (1) list every place the code can perform network I/O or spawn external processes, (2) locate where the CLI accepts LLM provider/command options, or (3) highlight any remaining omitted files for further review.
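The dependency and environment points in the guidance can be checked before the tool is ever run. The following is a minimal sketch, not part of the package: the binary names come from the scanner's list (wkhtmltopdf stands in for the "wkhtmltopdf/Chrome" alternative), while the helper names `missing_binaries` and `minimal_env` and the allow-list of variables are assumptions made for illustration.

```python
import os
import shutil

# Binary names taken from the scanner report; "ollama" is optional per the report.
EXPECTED_BINARIES = ["uv", "gs", "qpdf", "pdftoppm", "soffice", "ocrmypdf", "wkhtmltopdf"]
OPTIONAL_BINARIES = ["ollama"]


def missing_binaries(names, which=shutil.which):
    """Return the names that cannot be resolved on PATH."""
    return [name for name in names if which(name) is None]


def minimal_env(source=None, allowed=("PATH", "HOME", "LANG", "PDFAGENT_SOFFICE_TIMEOUT")):
    """Copy only allow-listed variables so unrelated secrets are not inherited.

    The allow-list is an assumption; extend it only with variables you have reviewed.
    """
    source = os.environ if source is None else source
    return {key: source[key] for key in allowed if key in source}


if __name__ == "__main__":
    print("missing required:", missing_binaries(EXPECTED_BINARIES))
    print("missing optional:", missing_binaries(OPTIONAL_BINARIES))
    print("variables that would be passed on:", sorted(minimal_env()))
```

The dictionary returned by `minimal_env()` can be handed to `subprocess.run(..., env=...)` when launching the tool from a wrapper script, so that API keys and tokens in the parent shell never reach it. This complements, rather than replaces, running in a sandbox or disposable VM.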
Review Dimensions
- Purpose & Capability
- concern: The name and description promise self-hosted PDF operations, and the repo code implements them. However, the skill metadata declares no required binaries or env vars, while SKILL.md and the code require or expect uv, Ghostscript (gs), qpdf, poppler (pdftoppm), LibreOffice (soffice), ocrmypdf, wkhtmltopdf/Chrome, and optionally ollama and other Python libs. The registry declarations (no requirements) are inconsistent with the actual capabilities and dependencies.
- Instruction Scope
- concern: SKILL.md focuses on local disk-based PDF processing, but the code can fetch remote HTML (urllib.request.urlopen in html_to_pdf) and can invoke external commands/LLM providers (core.llm uses arbitrary commands or 'ollama' via subprocess). Those behaviors allow network I/O and arbitrary process execution that go beyond simple file manipulation. The documentation does mention some of these tools, but SKILL.md does not make the risks and implications explicit.
- Install Mechanism
- ok: No install spec is provided (instruction-only, for running via 'uv run'), so nothing is downloaded or installed automatically by the registry. The presence of source files means code will execute locally when run, but there is no remote installer or archive URL to review.
- Credentials
- concern: The registry declares no required env vars, but the code reads at least one (PDFAGENT_SOFFICE_TIMEOUT), and the subprocess execution paths allow passing a custom env to commands. The tool also exposes options to call external LLMs or arbitrary commands; those uses can require secrets or expose sensitive data if misconfigured. Overall, the environment access the code can use is under-declared.
- Persistence & Privilege
- ok: The skill is not always-enabled, does not request to modify other skills, and has no install hook. It optionally writes usage logs to a --usage-file and creates per-command output files and local LibreOffice profile directories, which is normal for a CLI tool.
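The Instruction Scope concern (network fetches via urllib.request.urlopen, process execution via subprocess) can be verified independently by listing call sites in the source tree. The sketch below uses only the standard-library `ast` module; the function names and the list of watched prefixes are assumptions for illustration, and the scan is deliberately simple: it matches fully dotted calls such as `subprocess.run(...)` and will miss aliased imports like `from urllib.request import urlopen`.

```python
import ast
from pathlib import Path

# Dotted-name prefixes treated as interesting; an assumed, non-exhaustive list.
WATCHED_PREFIXES = (
    "urllib.request.", "subprocess.", "os.system", "os.popen", "socket.", "requests.",
)


def dotted_name(node):
    """Rebuild 'a.b.c' from nested Attribute/Name nodes, or return None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_calls(source, filename="<string>"):
    """Return sorted (line, dotted name) pairs for calls matching a watched prefix."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name and name.startswith(WATCHED_PREFIXES):
                hits.append((node.lineno, name))
    return sorted(hits)


def scan_tree(root):
    """Scan every .py file under root; files that cannot be parsed are skipped."""
    results = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            hits = find_calls(path.read_text(encoding="utf-8"), str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue
        if hits:
            results[str(path)] = hits
    return results
```

Calling `scan_tree(".")` from the unpacked skill directory returns a mapping from file path to the matching call sites, which gives a starting list for manual review. A clean result does not prove the absence of network or process access, since dynamic imports and aliased names are not resolved.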
