Chinese Handwriting OCR

Pass

Audited by ClawScan on May 11, 2026.

Overview

This appears to be a local OCR toolkit with no evidence of data exfiltration, but users should review its external dependencies, documentation mismatches, local OCR outputs, and a broad manual cleanup command.

Use this in a controlled Python environment, verify the actual script options before running, and be careful with generated OCR PDFs/text files because they may contain sensitive extracted content. Do not run the broad Python process cleanup command unless you have confirmed which processes it will stop.

Findings (4)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Info

#ASI09: Human-Agent Trust Exploitation

What this means

Some documented commands may fail or not use the engine the user expects.

Why it was flagged

The main documentation advertises engine options that are not implemented by the provided ocr_date_extractor.py parser, which only accepts input and --json.

Skill content

python scripts/ocr_date_extractor.py 文档.pdf --engine rapid ... --engine paddle ... --engine both

Recommendation

Verify each script's actual --help output before relying on the documented engine options.
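
One way to perform this check is sketched below. It is a minimal illustration, not part of the skill: the helper names `help_text_for` and `advertises_flag` are hypothetical, and it assumes the script prints its options when invoked with `--help`.

```python
import re
import subprocess
import sys


def help_text_for(script_path):
    """Run `<script> --help` with the current interpreter and return its output.

    Hypothetical helper: assumes the script responds to --help.
    """
    result = subprocess.run(
        [sys.executable, script_path, "--help"],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code is still useful evidence
    )
    return result.stdout + result.stderr


def advertises_flag(help_text, flag):
    """Return True if `flag` appears in the help text as a whole option name."""
    # Lookarounds keep "--json" from matching inside "--json-lines".
    pattern = r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None


if __name__ == "__main__" and len(sys.argv) > 1:
    text = help_text_for(sys.argv[1])
    for flag in ("--engine", "--json"):
        status = "listed" if advertises_flag(text, flag) else "NOT listed"
        print(flag, status)
```

Saved under any file name and run with `scripts/ocr_date_extractor.py` as its argument, the sketch reports whether `--engine` and `--json` appear in that script's help text.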

Low

#ASI04: Agentic Supply Chain Vulnerabilities

What this means

Installing the skill's dependencies may download third-party code and models into the local environment.

Why it was flagged

The skill depends on external, unpinned Python packages and OCR model downloads, which is expected for this OCR purpose but not captured by an install spec.

Skill content

pip install rapidocr-onnxruntime ... pip install paddleocr paddlepaddle ... First launch of PaddleOCR requires downloading models (~18MB)

Recommendation

Install in a virtual environment, pin versions if reproducibility matters, and review package sources before use.
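
If the dependencies are recorded in a requirements-style list, a small check can flag entries that lack an exact version. The sketch below is illustrative only: the function name `unpinned_requirements` is hypothetical, and it treats only `==` as a pin.

```python
import re


def unpinned_requirements(lines):
    """Return the package names that are not pinned with an exact '==' version."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            # Keep only the name: cut at the first specifier, extra, or marker.
            name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
            unpinned.append(name)
    return unpinned


if __name__ == "__main__":
    # The three packages named in the skill's documentation, as documented (unpinned).
    documented = ["rapidocr-onnxruntime", "paddleocr", "paddlepaddle"]
    print(unpinned_requirements(documented))
```

Run against the three packages as documented, the check lists all of them, since none carries a version.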

Low

#ASI02: Tool Misuse and Exploitation

What this means

If run as-is, it could terminate unrelated Python work on the machine.

Why it was flagged

The documented manual cleanup command force-stops every process whose name matches python* and whose accumulated CPU time exceeds 10 seconds, which is broader than this skill's own scripts.

Skill content

Get-Process python* | Where-Object {$_.CPU -gt 10} | Stop-Process -Force

Recommendation

Before using the cleanup command, inspect the process list and stop only confirmed OCR-related processes.
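
A narrower selection can be sketched as a filter over process records. This is an illustration under stated assumptions: `select_ocr_pids` is a hypothetical helper, it expects `(pid, argument list)` pairs that the caller has already collected (for example with a process-inspection library), and only `ocr_date_extractor.py` is known from the skill's documentation.

```python
def select_ocr_pids(processes, script_names=("ocr_date_extractor.py",)):
    """Return PIDs whose command line runs one of the named scripts.

    `processes` is an iterable of (pid, argv) pairs, where argv is a list of
    argument strings. Gathering these pairs is left to the caller.
    """
    selected = []
    for pid, argv in processes:
        for arg in argv:
            # Compare by file name only, accepting both / and \ separators.
            file_name = arg.replace("\\", "/").rsplit("/", 1)[-1]
            if file_name in script_names:
                selected.append(pid)
                break
    return selected


if __name__ == "__main__":
    # Made-up records for illustration; real ones come from the process list.
    sample = [
        (101, ["python", "scripts/ocr_date_extractor.py", "doc.pdf"]),
        (102, ["python", "train_model.py"]),
    ]
    print(select_ocr_pids(sample))
```

Matching on the script name rather than on `python*` and CPU time leaves unrelated Python work untouched; the resulting PIDs should still be reviewed before anything is stopped.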

Low

#ASI06: Memory and Context Poisoning

What this means

Generated PDFs or text outputs may contain extracted signatures, dates, IDs, or other sensitive document text that could be exposed if shared.

Why it was flagged

The script embeds extracted OCR text into output PDF annotations, which is purpose-aligned but persists recognized document contents.

Skill content

page.add_annot(... content=f"RapidOCR: {text}")

Recommendation

Treat OCR outputs like the original sensitive documents and inspect/redact annotations before sharing.
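
The redaction step can be sketched independently of any PDF library. The example below is a minimal illustration: `redact_annotation` is a hypothetical helper that operates on annotation content strings already read from the output file, and the only prefix taken from the skill is `RapidOCR: `. Reading and rewriting the annotations themselves depends on the PDF library in use and is not shown.

```python
def redact_annotation(content, prefixes=("RapidOCR: ",), placeholder="[REDACTED]"):
    """Replace OCR-derived annotation text with a placeholder.

    Content that starts with a known engine prefix keeps the prefix, so a
    reviewer can still see that an OCR annotation existed. Other content is
    returned unchanged.
    """
    for prefix in prefixes:
        if content.startswith(prefix):
            return prefix + placeholder
    return content


if __name__ == "__main__":
    # Made-up annotation strings for illustration.
    for text in ("RapidOCR: recognized text here", "Reviewer note"):
        print(redact_annotation(text))
```

Keeping the prefix while dropping the recognized text preserves the audit trail without carrying the extracted content into shared copies.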