v1.0.0

Pdf Tool

Suspicious
ClawScan verdict for this skill. Analyzed Apr 30, 2026, 6:22 PM.

Analysis

This is mostly a simple local PDF utility, but its default text-extraction behavior can overwrite the original file, and it relies on an undeclared, unpinned Python dependency (pypdf).

Guidance: Before using this skill, work on copies of important PDFs and always provide an explicit --output path in a dedicated folder. Be aware that pypdf must be installed separately, preferably with a pinned trusted version, and that some advertised features such as compression, conversion, and image extraction are not actually implemented.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Abnormal behavior control

Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.

Tool Misuse and Exploitation
Severity: Medium | Confidence: High | Status: Concern
scripts/pdf.py
output = args.output or args.input.replace('.pdf', '.txt')
return extract_text(args.input, output)
...
with open(output, 'w', encoding='utf-8') as f:
    f.write(full_text)

The default output path is derived with a case-sensitive string replace. If the input path does not contain lowercase '.pdf', the replace returns the path unchanged, so the output path is identical to the input path. The script then opens that path for writing, which truncates the original PDF and replaces its contents with the extracted text.

User impact: A user or agent could accidentally destroy or replace the original PDF when extracting text, especially for files with a .PDF extension, no extension, or an unusual filename.
Recommendation: Pass an explicit --output path in a separate folder. The skill itself should refuse identical input and output paths and prompt before overwriting existing files.
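The refusal logic recommended here can be sketched as follows. This is a minimal illustration, not the skill's actual code: `resolve_text_output` is a hypothetical helper, and it assumes the script is free to use `pathlib` for path handling.

```python
from pathlib import Path
from typing import Optional


def resolve_text_output(input_path: str, output: Optional[str] = None) -> Path:
    """Pick an output path that cannot clobber the input or an existing file.

    Hypothetical helper for illustration; not part of scripts/pdf.py.
    """
    src = Path(input_path)
    # with_suffix swaps the extension regardless of case and appends
    # ".txt" when the name has no extension at all.
    dst = Path(output) if output else src.with_suffix(".txt")
    # Compare resolved paths so "./a.pdf" and "a.pdf" count as the same file.
    if dst.resolve() == src.resolve():
        raise ValueError(f"refusing to write over the input file: {src}")
    if dst.exists():
        raise FileExistsError(f"output already exists: {dst}")
    return dst
```

Opening the returned path with mode `'x'` instead of `'w'` would add a second guard, since exclusive creation fails if the file appears between the check and the write.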

Agentic Supply Chain Vulnerabilities
Severity: Low | Confidence: High | Status: Note
scripts/pdf.py
Note: Requires pypdf (pip install pypdf).
...
print("Error: pypdf not installed. Run: pip install pypdf")

The tool depends on an external Python package, but the supplied artifacts include no install spec, requirements file, or pinned version for pypdf.

User impact: The tool may fail until the user installs pypdf, and installing an unpinned latest version can produce different behavior over time.
Recommendation: Declare pypdf in an install spec or requirements file with an appropriate pinned or constrained version, and install it from a trusted package source.
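Alongside a requirements file with an exact pin, a script can verify at startup that the installed version matches. The sketch below shows only that check. `check_pinned` is a hypothetical helper, and the version string passed to it is whatever the maintainer has tested against, not a value taken from the skill.

```python
from importlib import metadata


def check_pinned(package: str, expected: str) -> bool:
    """Return True only if `package` is installed at exactly `expected`.

    Hypothetical helper for illustration; not part of scripts/pdf.py.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed at all.
        return False
    return installed == expected
```

A caller could print an install hint and exit when `check_pinned("pypdf", tested_version)` returns `False`, where `tested_version` is the pin chosen by the maintainer.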

Cascading Failures
Severity: Low | Confidence: High | Status: Note
scripts/pdf.py
output_path.mkdir(parents=True, exist_ok=True)
...
for i in range(0, total_pages, pages_per_file):
...
output_file = output_path / f"page_{file_num}.pdf"

The split operation creates output directories and writes one file per page chunk without a preflight limit or overwrite check.

User impact: Splitting a large PDF, or using a very small split size, can create many files and clutter or fill the chosen output folder.
Recommendation: Run split operations in a dedicated directory, check page count and available disk space first, and add safeguards for maximum output count and existing files.
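A preflight step along these lines could compute the output files before anything is written. This is a sketch under stated assumptions: `plan_split` and its `max_files` default are hypothetical, and numbering the outputs from 1 is a guess, since the excerpt does not show how `file_num` is computed. It covers the output-count and existing-file checks only, not disk space.

```python
import math
from pathlib import Path
from typing import List


def plan_split(total_pages: int, pages_per_file: int,
               output_dir: str, max_files: int = 500) -> List[Path]:
    """List the files a split would create, or raise before writing anything.

    Hypothetical helper for illustration; max_files is an arbitrary default.
    """
    if pages_per_file < 1:
        raise ValueError("pages_per_file must be at least 1")
    # One output file per chunk of pages_per_file pages, rounding up.
    count = math.ceil(total_pages / pages_per_file)
    if count > max_files:
        raise ValueError(f"split would create {count} files (limit {max_files})")
    out = Path(output_dir)
    # Assumed naming: page_1.pdf, page_2.pdf, ...
    targets = [out / f"page_{n}.pdf" for n in range(1, count + 1)]
    clashes = [t for t in targets if t.exists()]
    if clashes:
        raise FileExistsError(f"{len(clashes)} output file(s) already exist")
    return targets
```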

Human-Agent Trust Exploitation
Severity: Low | Confidence: High | Status: Note
SKILL.md
convert PDF to images, or compress PDF files
...
- Extract images from PDF
- Basic compression

The documentation advertises conversion, image extraction, and compression, but the provided script only implements merge, split, text extraction, page extraction, and info display; the --extract-images argument is parsed but not handled.

User impact: Users may expect capabilities that the tool does not actually provide, leading to failed or confusing workflows.
Recommendation: Update the documentation to match the implemented behavior, or add tested implementations for image extraction, conversion, and compression.
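Until a feature is implemented, a parsed but unhandled flag can at least fail loudly instead of being silently ignored. The sketch below assumes `--extract-images` is a boolean switch; the real argument definition is not shown in the excerpt, and `build_parser` and `main` here are hypothetical stand-ins.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pdf.py")
    # Assumed shape of the flag; the actual definition may differ.
    parser.add_argument("--extract-images", action="store_true")
    return parser


def main(argv: List[str]) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.extract_images:
        # parser.error prints usage to stderr and exits with status 2.
        parser.error("--extract-images is not implemented")
    return 0
```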

Sensitive data protection

Checks for exposed credentials, poisoned memory or context, unclear communication boundaries, or sensitive data that could leave the user's control.

Memory and Context Poisoning
Severity: Low | Confidence: High | Status: Note
scripts/pdf.py
output = args.output or args.input.replace('.pdf', '.txt')
...
f.write(full_text)

Text extracted from a PDF is saved to a persistent local text file by default, which can duplicate sensitive document contents or later be reused as context.

User impact: Private PDF contents may remain in an extracted text file after the task is complete.
Recommendation: Use a secure output folder, delete extracted text files when no longer needed, and do not treat extracted document text as instructions for the agent unless the user explicitly asks.
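When the extracted text is only needed for the duration of one task, writing it to a temporary directory that is removed afterwards avoids leaving a copy behind. This is a minimal sketch, assuming the caller can process the text inside a callback; `with_temporary_text` and the `extracted.txt` filename are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def with_temporary_text(full_text: str, consume: Callable[[Path], T]) -> T:
    """Write text to a throwaway file, hand it to `consume`, then delete it.

    Hypothetical helper for illustration; not part of scripts/pdf.py.
    """
    # The directory and everything in it are removed when the block exits,
    # including when consume raises.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "extracted.txt"
        out.write_text(full_text, encoding="utf-8")
        return consume(out)
```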