Skill v1.0.0

ClawScan security

LocalDataAI · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Benign · Mar 16, 2026, 5:23 AM

Verdict: Benign
Confidence: high
Model: gpt-5-mini
Summary: The skill is internally consistent with its stated purpose (local/offline document parsing and local-model inference); I found no evidence of hidden exfiltration or unrelated credential requests, though there are a few operational/implementation notes you should review before installing in production or air-gapped environments.
Guidance: This project appears to be what it says: an on-prem/local document parsing and local-model AI toolkit. Before installing or deploying, consider the following:
1) Initial model files are large and require network access or manual transfer. If you need strict air-gap guarantees, download models on an allowed machine and copy them into ./models rather than running automated downloads in the target network.
2) The requirements pull heavy native/GPU packages (torch, paddlepaddle, faiss, etc.). Install in a controlled virtualenv/conda environment and verify binary builds for your OS.
3) Review scripts/local_ai_engine.py, scripts/sandbox.py and any other code not fully reviewed here to confirm they do not perform network calls in your deployment. The README claims the runtime is offline, but only a final code review can rule out accidental outbound requests.
4) The audit logger (ComplianceLogger) encrypts logs but currently derives its key from a fixed password and a random salt generated at initialization, so logs written by one process cannot be decrypted after a restart. If you rely on persistent audit logs, supply a stable encryption_key when constructing the logger or patch the key management logic.
5) Check the licenses and hashes of the downloaded models; verify integrity and licensing before using them in regulated environments.
6) Run the project in an isolated/test environment first to validate its behavior and to confirm that the sandbox/network restrictions behave as you expect.
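
The integrity check suggested in the guidance could be sketched roughly as follows. This is a minimal illustration, not part of the reviewed skill: the helper names `sha256_of` and `verify_models` and the shape of the `expected` mapping are assumptions, and the digests themselves would have to come from the model publisher.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_models(model_dir, expected):
    """Compare files under model_dir against expected digests.

    `expected` is a hypothetical mapping {relative file name: SHA-256 hex digest}.
    Returns a list of (name, problem) tuples; an empty list means every file matched.
    """
    problems = []
    for name, want in expected.items():
        path = Path(model_dir) / name
        if not path.is_file():
            problems.append((name, "missing"))
        elif sha256_of(path) != want.lower():
            problems.append((name, "hash mismatch"))
    return problems
```

Run on the machine where models are copied into ./models, a call such as `verify_models("./models", expected)` would return an empty list only if every listed file is present and matches its digest.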

Review Dimensions

Purpose & Capability: ok. The name and description (local private/offline file AI) match the code and docs: parsers, local vector store, local LLM model configuration, OCR, large-file handler, sandbox and compliance logger are all relevant to the stated purpose. There are no unrelated required environment variables, binaries, or config paths requested in metadata.
Instruction Scope: note. SKILL.md and README focus on local/offline usage and document parsing; the runtime instructions call for installing dependencies with pip and running scripts/download_models.py to fetch models. The project does require an initial model download (or manual placement of model files) but otherwise runs locally. One implementation detail to review: the compliance_logger generates its encryption key internally, deriving it from a constant password and a fresh random salt on each process start. Encrypted logs written by one process are therefore unreadable by later processes, an operational bug that undermines audit portability and continuity.
Install Mechanism: note. No platform installer is provided (no install spec), so installation relies on pip installing requirements.txt and optionally running download_models.py. The download script currently prints manual-download hints rather than automatically pulling large model files, and its URLs point to recognized project hosts (huggingface.co for Hugging Face assets, paddleocr.bj.bcebos.com for PaddleOCR). The requirements list heavy native/GPU packages and will pull large dependencies; review them and install in a controlled environment.
Credentials: ok. The skill requests no environment variables or external credentials in metadata. The only 'secret' area is the compliance logger's optional encryption_key parameter: if it is provided, the key stays the same across runs; by default the code generates a key locally. No unexpected credentials or unrelated service tokens are required.
Persistence & Privilege: ok. Flags show always:false and user-invocable:true (normal). The skill does not request persistent platform-wide privileges or modify other skills. It writes local models/config/logs under the project directory (normal for this type of tool).
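
The key-handling issue noted under Instruction Scope and Credentials can be illustrated with a small sketch. It does not reproduce the skill's ComplianceLogger: the PBKDF2 key-derivation function, the constant `FIXED_PASSWORD`, and both helper functions are assumptions chosen only to show why a fresh random salt per start yields a different key each time, and how reusing a stored salt (one hypothetical way to patch the key management) keeps the key stable.

```python
import hashlib
import os

# Hypothetical stand-in for the constant password described in the review.
FIXED_PASSWORD = b"example-constant-password"


def derive_key(password, salt, iterations=100_000):
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 (assumed KDF, for illustration)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


def key_with_fresh_salt():
    """Mimic the reported behavior: a new random salt on every process start."""
    return derive_key(FIXED_PASSWORD, os.urandom(16))


def key_with_persisted_salt(salt_path):
    """Reuse a salt stored on disk so that later processes derive the same key."""
    if os.path.exists(salt_path):
        with open(salt_path, "rb") as fh:
            salt = fh.read()
    else:
        salt = os.urandom(16)
        with open(salt_path, "wb") as fh:
            fh.write(salt)
    return derive_key(FIXED_PASSWORD, salt)
```

Two calls to `key_with_fresh_salt()` stand in for two process starts and return different keys, which is why earlier logs become unreadable. Persisting the salt addresses continuity only; it does not turn a constant password into a secret, so supplying a stable encryption_key, as the guidance suggests, remains the more direct option.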