Private Document AI with OpenVINO
Pass. Audited by ClawScan on May 11, 2026.
Overview
This appears to be a local document-processing skill; the main cautions are reviewing dependencies, protecting extracted artifacts, and checking generated code before running it.
Install and run the skill in a virtual environment, and review any optional PaddleOCR-VL/OpenVINO wheel or model download before enabling it. Process only documents you intend to parse, store artifacts in a protected local folder, and review generated code or notebooks before running them. Part of the source supplied for this review was truncated, so inspect the full package if you require high assurance.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If pointed at the wrong file or output folder, private document contents could be parsed and stored somewhere unintended.
The main workflow intentionally accepts local input and output paths. This is expected for document OCR, but it gives the agent broad file/path handling when the user supplies paths.
python "{baseDir}/scripts/run_skill.py" --mode to-data --file "/absolute/path/to/invoice.pdf" --out "/absolute/path/to/artifacts/invoice_data"
Use explicit file paths, choose a dedicated local output folder, and avoid cloud-synced or shared folders for confidential documents.
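One way to apply this advice is a small pre-flight check on the paths before they are passed to the skill. The sketch below is not part of the skill. The function name `check_paths` and the list of cloud-sync folder markers are assumptions made for illustration, and the markers should be adapted to the machine in use.

```python
from pathlib import Path

# Hypothetical markers for cloud-synced folders; adjust for your own machine.
SYNCED_MARKERS = ("dropbox", "onedrive", "google drive", "icloud")


def check_paths(input_file: str, out_dir: str) -> list[str]:
    """Return warnings for the given input file and output folder (hypothetical helper)."""
    warnings = []
    src = Path(input_file)
    out = Path(out_dir)

    # Relative paths depend on the current working directory, so flag them.
    if not src.is_absolute() or not out.is_absolute():
        warnings.append("use absolute paths")

    # Flag output folders whose path mentions a known sync service.
    lowered = [part.lower() for part in out.parts]
    if any(marker in part for part in lowered for marker in SYNCED_MARKERS):
        warnings.append("output folder looks cloud-synced")

    # A dedicated output folder should not be an ancestor of the input file.
    if out in src.parents:
        warnings.append("output folder contains the input file")

    return warnings
```

An empty list means none of these checks fired. It does not mean the location is safe, since shared network drives and similar cases are not detected.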
Installing unreviewed packages or model tooling could run third-party code in the same environment used for private documents.
The skill relies on external Python packages and an optional third-party OCR package. The artifacts disclose this and recommend review, but dependency provenance still matters.
PyMuPDF>=1.24.0 ... openvino>=2026.0.0 ... The third-party paddleocr_vl_openvino package is intentionally NOT installed ... Review the source or wheel first
Install in a virtual environment, prefer reviewed or pinned packages, and only enable model downloads or install OCR wheels from trusted sources.
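As a rough illustration of "pinned packages", the sketch below flags requirement lines that use a range specifier such as `>=` instead of an exact `==` pin. The helper name `unpinned` is hypothetical, and the check is deliberately simple: it does not verify hashes or package provenance.

```python
def unpinned(requirements: list[str]) -> list[str]:
    """Return requirement lines that lack an exact '==' version pin (hypothetical helper)."""
    flagged = []
    for line in requirements:
        # Drop trailing comments and surrounding whitespace.
        spec = line.split("#", 1)[0].strip()
        if not spec:
            continue
        if "==" not in spec:
            flagged.append(spec)
    return flagged
```

Applied to the two specifiers quoted in this finding, `unpinned(["PyMuPDF>=1.24.0", "openvino>=2026.0.0"])` returns both lines, because each one allows any later release to be installed.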
Generated scaffolds may be incomplete or unsafe if run, deployed, or connected to real systems without review.
The skill can generate executable code or notebooks, but the artifacts explicitly frame them as drafts and do not show automatic execution of generated code.
Typical outputs ... `task_output/notebook.ipynb` ... `app.jsx`, `index.html`, `styles.css` ... Treat all generated code and notebooks as drafts. Review them before running
Inspect generated code and notebooks before execution, publishing, or connecting them to real data or services.
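A generated notebook can be read as plain JSON, which allows its code to be inspected without starting a kernel. The sketch below assumes the standard notebook layout (a top-level `cells` list whose entries carry `cell_type` and `source`). The function name `code_cells` is hypothetical and not something the skill provides.

```python
import json
from pathlib import Path


def code_cells(notebook_path: str) -> list[str]:
    """Return the source of each code cell without executing anything (hypothetical helper)."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    cells = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format allows "source" to be a string or a list of strings.
        cells.append(source if isinstance(source, str) else "".join(source))
    return cells
```

Printing the result of `code_cells("task_output/notebook.ipynb")` gives a quick view of the code that would run, for example to spot shell commands, network calls, or file deletions before opening the notebook in Jupyter.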
Sharing artifact folders may reveal private document contents, filenames, local paths, and document fingerprints.
The generated parse output records source metadata, including the resolved local input path and file hash, alongside parsed document content.
"source": { "input_path": str(config.file), "input_type": input_type, "filename": config.file.name, "sha256": file_hash }
Treat output folders as sensitive, redact artifacts before sharing, and delete outputs when they are no longer needed.
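A minimal redaction sketch, assuming the parse output has been loaded as a Python dict with a top-level `source` object shaped like the excerpt above. The function name `redact_source` and the `[redacted]` placeholder are illustrative choices, and the actual artifact layout may differ.

```python
import copy

# Fields from the "source" excerpt that identify the local file.
SENSITIVE_KEYS = ("input_path", "filename", "sha256")


def redact_source(parsed: dict) -> dict:
    """Return a copy of the parse output with identifying source fields masked."""
    redacted = copy.deepcopy(parsed)  # leave the caller's data untouched
    source = redacted.get("source")
    if isinstance(source, dict):
        for key in SENSITIVE_KEYS:
            if key in source:
                source[key] = "[redacted]"
    return redacted
```

This masks only the source metadata. The parsed document content stored alongside it remains private data and needs its own review before an artifact is shared.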
