Skill v0.1.0

ClawScan security

Newton Quotation Pdf Extraction · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Benign · Apr 23, 2026, 7:10 AM

Verdict: benign
Confidence: high
Model: gpt-5-mini
Summary: The skill's code and instructions match the stated purpose (extracting structured product data from PDF catalogs); it performs local PDF parsing, image extraction/splitting, and exports results to files, with no network calls or secret requests.
Guidance: This skill appears to do what it claims, but review these points before installing/using it:
- Dependencies: The scripts require Python libraries (pdfplumber, PyMuPDF/fitz, Pillow, pandas, openpyxl). Install and run in a controlled virtualenv/container; the registry metadata does not declare these packages.
- File IO & privacy: The skill extracts text/images from whatever PDFs you provide and writes image files and spreadsheet outputs to disk. Do not run it on PDFs containing sensitive or confidential data unless you are comfortable with local file creation and storage.
- Manual configuration: extract_products.py expects you to edit PRODUCTS_IN_ORDER to match your PDF (model, qty, price, num_images). Verify and adjust that mapping before bulk runs to avoid incorrect outputs.
- Interactive prompts: The scripts use input() to ask for currency; in non-interactive automated runs this can block or fail. If you intend to run autonomously, modify the scripts to accept parameters instead of interactive prompts.
- No network/secret access detected: The code contains no network requests, remote endpoints, or reads of environment variables/credentials. Still, inspect any edited scripts you run, and run in an isolated environment if you need extra assurance.
If you want higher assurance, run the scripts locally on a benign sample PDF first, inspect the produced files, and confirm dependencies and behavior meet your operational/security policies.
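The interactive-prompt point can be addressed with a small wrapper along the following lines. This is a minimal sketch, not the skill's own code: the function name `resolve_currency`, the `--currency` flag, and the prompt text are hypothetical, and the skill's existing ask_currency helper may be structured differently.

```python
import argparse
import sys


def resolve_currency(argv=None, stdin_is_tty=None):
    """Return a currency code from --currency, prompting only when interactive.

    Hypothetical helper: the flag name and prompt wording are assumptions,
    not taken from the skill's scripts.
    """
    parser = argparse.ArgumentParser(description="PDF quotation extraction")
    parser.add_argument("--currency", help="currency code, e.g. USD")
    args = parser.parse_args(argv)

    # An explicit flag always wins, so automated runs never reach input().
    if args.currency:
        return args.currency.strip().upper()

    if stdin_is_tty is None:
        stdin_is_tty = sys.stdin.isatty()

    # Fall back to the prompt only when a human is attached to the terminal.
    if stdin_is_tty:
        return input("Currency (e.g. USD): ").strip().upper()

    # Non-interactive run without the flag: fail fast instead of blocking.
    parser.error("--currency is required when stdin is not interactive")
```

With this shape, an agent-driven run would pass the currency explicitly (for example `--currency EUR`), while a person running the script by hand would still get the prompt.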

Review Dimensions

Purpose & Capability (ok): The name/description (PDF quotation/catalog extraction) align with the included scripts: text/table extraction, image extraction/splitting, and product-matching/export. The scripts operate on local PDF files and produce local outputs (images, Excel/CSV/JSON), which is consistent with the stated purpose.
Instruction Scope (note): SKILL.md and the scripts stay within the extraction task. They explicitly require analyzing PDF structure, dynamically extracting text/images, and asking the user for currency. Two practical caveats: the workflow expects manual edits to a PRODUCTS_IN_ORDER list in extract_products.py (the skill instructs users to edit this to match their PDF), and some scripts use interactive input() prompts (ask_currency), which may block in non-interactive agent runs. The skill reads and writes local files (images, Excel, CSV, JSON), which is expected for this task but relevant for privacy.
Install Mechanism (note): There is no install spec (instruction-only), which is low-risk. However, the code depends on Python packages (pdfplumber, PyMuPDF/fitz, Pillow, pandas, openpyxl) that are not declared in metadata or an install step. That is an operational mismatch: users must ensure a Python environment with these libraries is available before running.
Credentials (ok): The skill requests no environment variables, no credentials, and no special config paths. The scripts operate on files provided by the user and do not reference secrets or external services, so requested privileges are proportionate.
Persistence & Privilege (ok): The skill is not always-enabled (always: false) and does not request persistent system configuration or modify other skills. It writes output files to local directories (product images, spreadsheets), which is normal for this use case. Autonomous invocation (model invocation enabled) is the platform default; combined with only local filesystem operations, this is not a strong risk signal here.
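Because the packages listed under Install Mechanism are not declared anywhere, a pre-flight check before running the scripts may be useful. The sketch below is an illustration only: the helper name `missing_packages` and the `REQUIRED` mapping are hypothetical and not part of the skill. It maps each import name to the package name used when installing, since PyMuPDF is imported as `fitz` and Pillow as `PIL`.

```python
import importlib.util

# Import name -> installable package name, based on the libraries named in
# the review. The mapping itself is an assumption, not skill metadata.
REQUIRED = {
    "pdfplumber": "pdfplumber",
    "fitz": "PyMuPDF",
    "PIL": "Pillow",
    "pandas": "pandas",
    "openpyxl": "openpyxl",
}


def missing_packages(required=REQUIRED):
    """Return the install names of required modules that cannot be found."""
    return [
        package
        for module, package in required.items()
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None
    ]
```

Calling `missing_packages()` inside the virtualenv or container returns an empty list when everything is available; otherwise the returned names can be installed before the extraction scripts are run.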