Newton Quotation Pdf Extraction
Pass
Audited by ClawScan on Apr 23, 2026.
Overview
The skill's code and instructions match the stated purpose (extracting structured product data from PDF catalogs); it performs local PDF parsing, image extraction/splitting, and exports results to files, with no network calls or secret requests.
This skill appears to do what it claims, but review these points before installing or using it:
- Dependencies: The scripts require Python libraries (pdfplumber, PyMuPDF/fitz, Pillow, pandas, openpyxl). Install and run in a controlled virtualenv/container; the registry metadata does not declare these packages.
- File IO & privacy: The skill extracts text/images from whatever PDFs you provide and writes image files and spreadsheet outputs to disk. Do not run it on PDFs containing sensitive or confidential data unless you are comfortable with local file creation and storage.
- Manual configuration: extract_products.py expects you to edit PRODUCTS_IN_ORDER to match your PDF (model, qty, price, num_images). Verify and adjust that mapping before bulk runs to avoid incorrect outputs.
- Interactive prompts: The scripts use input() to ask for currency; in non-interactive automated runs this can block or fail. If you intend to run autonomously, modify the scripts to accept parameters instead of interactive prompts.
- No network/secret access detected: The code contains no network requests, remote endpoints, or reads of environment variables/credentials. Still, inspect any edited scripts you run, and run in an isolated environment if you need extra assurance.
If you want higher assurance, run the scripts locally on a benign sample PDF first, inspect the produced files, and confirm dependencies and behavior meet your operational/security policies.
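Two of the points above (undeclared dependencies and the blocking input() prompt) can be checked or worked around with a small wrapper. The following is a minimal sketch, not the skill's own code: the function names `missing_modules` and `resolve_currency`, the `--currency` flag, and the `USD` default are all hypothetical, and the import names listed are the usual ones for the packages the audit mentions (PyMuPDF imports as `fitz`, Pillow as `PIL`).

```python
import argparse
import importlib.util
import sys

# Import names for the packages named in the audit (assumed, not read from the skill).
REQUIRED_MODULES = ["pdfplumber", "fitz", "PIL", "pandas", "openpyxl"]

# Hypothetical fallback used when no currency is supplied and no terminal is attached.
DEFAULT_CURRENCY = "USD"


def missing_modules(names):
    """Return the subset of top-level module names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def resolve_currency(argv=None, stdin_is_tty=None):
    """Pick a currency without blocking in non-interactive runs.

    Order of precedence: --currency flag, then an interactive prompt
    (only when stdin is a terminal), then DEFAULT_CURRENCY.
    """
    parser = argparse.ArgumentParser(description="Currency selection sketch")
    parser.add_argument("--currency", default=None,
                        help="currency code, e.g. EUR (hypothetical flag)")
    args = parser.parse_args(argv)

    value = (args.currency or "").strip()
    if value:
        return value.upper()

    if stdin_is_tty is None:
        stdin_is_tty = sys.stdin.isatty()
    if stdin_is_tty:
        # Safe to prompt: a human is at the terminal.
        answer = input(f"Currency [{DEFAULT_CURRENCY}]: ").strip()
        return answer.upper() or DEFAULT_CURRENCY

    # Automated run: never call input(), fall back to the default.
    return DEFAULT_CURRENCY


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        sys.exit("Missing modules: " + ", ".join(missing))
    print("Currency:", resolve_currency())
```

Run inside the virtualenv you intend to use, this reports missing packages before any PDF is touched; passing `--currency EUR` (or running with stdin redirected) avoids the prompt entirely.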
