Skill v2.0.16

ClawScan security

PaddleOCR Document Parsing · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Benign · Apr 3, 2026, 4:26 AM

Verdict: benign
Confidence: high
Model: gpt-5-mini
Summary: The skill's files, instructions, and required credentials are coherent with a PaddleOCR document-parsing wrapper: it needs an API URL and access token, uses the uv runner and standard Python libraries, and its behavior matches the description.
Guidance: This skill appears to do what it claims, but consider these practical precautions before installing or running it:
(1) You must provide PADDLEOCR_DOC_PARSING_API_URL and PADDLEOCR_ACCESS_TOKEN. The skill uploads files to that endpoint (base64-encoded for local file inputs), so only use tokens and endpoints you trust.
(2) By default, results are auto-saved under the system temp directory, and they contain the full extracted text, which may include sensitive data. Use --stdout or a controlled --output path if you want to avoid leftover temp files.
(3) The runner 'uv' fetches dependencies (httpx, Pillow, pypdfium2) at runtime, so standard PyPI supply-chain risks apply.
(4) There is a minor metadata mismatch: the optional env var PADDLEOCR_DOC_PARSING_TIMEOUT is used by the code but is not declared in the skill's metadata. Set it if you need a different network timeout.
(5) If you will parse private or local files, choose deliberately between file-path (the skill uploads the file content) and file-url (the service fetches the URL itself), depending on your privacy requirements.
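The environment requirements in points (1) and (4) can be checked before the skill is invoked. The following is a minimal sketch, not the skill's own code: the helper name `load_paddleocr_config`, the 60-second fallback, and the reading of the timeout as seconds are all assumptions made for illustration. Only the three variable names come from the scan report.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the scan report.
REQUIRED_VARS = ("PADDLEOCR_DOC_PARSING_API_URL", "PADDLEOCR_ACCESS_TOKEN")
TIMEOUT_VAR = "PADDLEOCR_DOC_PARSING_TIMEOUT"

# Assumption: the report does not state the skill's default timeout or its unit.
FALLBACK_TIMEOUT = 60.0


def load_paddleocr_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical preflight check: fail early if required variables are unset."""
    env = os.environ if env is None else env

    # Treat empty or whitespace-only values as missing.
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )

    # The timeout variable is optional; fall back when it is not set.
    raw_timeout = env.get(TIMEOUT_VAR, "").strip()
    timeout = float(raw_timeout) if raw_timeout else FALLBACK_TIMEOUT
    if timeout <= 0:
        raise ValueError(f"{TIMEOUT_VAR} must be positive, got {raw_timeout!r}")

    return {
        "api_url": env["PADDLEOCR_DOC_PARSING_API_URL"].strip(),
        "access_token": env["PADDLEOCR_ACCESS_TOKEN"].strip(),
        "timeout": timeout,
    }
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to exercise without touching the real environment or exposing a real token.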

Review Dimensions

Purpose & Capability: ok. The name/description, scripts, and declared env vars (PADDLEOCR_DOC_PARSING_API_URL and PADDLEOCR_ACCESS_TOKEN) align: the code posts documents to a PaddleOCR layout-parsing endpoint and returns structured JSON/Markdown. The required binary 'uv' is used to run the scripts and is appropriate for this packaging model.
Instruction Scope: note. Runtime instructions focus on invoking the included CLI scripts and only reference the declared env vars. The skill saves full raw JSON results to a temp directory by default and instructs the agent to read and return the complete output. This is expected for a parser, but it means sensitive document contents will be written to disk and may be returned to users. The scripts do not attempt to read unrelated system files or credentials.
Install Mechanism: note. There is no explicit install spec; the skill expects 'uv' to run the scripts, and 'uv' automatically resolves the dependencies (httpx, Pillow, pypdfium2) from package sources. This is consistent with the skill's design but implies standard supply-chain risk, because dependencies are fetched at run time from registries.
Credentials: note. Only the API URL and access token are required, and the primaryEnv is appropriately the access token. One optional env var (PADDLEOCR_DOC_PARSING_TIMEOUT) is referenced in code and documentation but is not listed in the env metadata; this is a minor mismatch, not a sign of malicious behavior. No unrelated credentials or broad system secrets are requested.
Persistence & Privilege: ok. The skill does not request permanent inclusion (always=false) and does not modify other skills or system-wide settings. It writes result JSON into the skill-scoped temp path by default; this is expected but worth noting for privacy reasons.
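Because parsed results contain the full document text, a caller who stores them on disk may prefer a private, self-cleaning location over a long-lived temp file. The sketch below uses only Python's standard library. The names `save_result_privately` and `roundtrip` and the file name `result.json` are hypothetical, and the code does not reproduce the skill's own save logic.

```python
import json
import os
import tempfile
from pathlib import Path


def save_result_privately(result: dict, directory: Path) -> Path:
    """Write result JSON into `directory`, readable only by the owner."""
    path = directory / "result.json"
    # O_EXCL refuses to overwrite an existing file; mode 0o600 restricts access
    # to the owner on POSIX systems (Windows largely ignores the mode bits).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(result, handle, ensure_ascii=False)
    return path


def roundtrip(result: dict):
    """Save, read back, and let the temporary directory clean itself up."""
    with tempfile.TemporaryDirectory(prefix="ocr-result-") as tmp:
        saved = save_result_privately(result, Path(tmp))
        loaded = json.loads(saved.read_text(encoding="utf-8"))
    # The directory and its contents are removed when the `with` block exits.
    return loaded, saved.exists()
```

`tempfile.TemporaryDirectory` creates a directory accessible only to the current user on POSIX systems and deletes it on exit, so no extracted text lingers after the result has been consumed.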