Skill v1.0.0
ClawScan security
DOCX Toolkit · ClawHub's context-aware review of the artifact, metadata, and declared behavior.
Scanner verdict
Benign · Mar 5, 2026, 7:16 AM
- Verdict: benign
- Confidence: high
- Model: gpt-5-mini
- Summary: The skill's code, instructions, and dependency list are coherent with its stated purpose (extracting text, tables, and images from .docx/.doc); no unexpected network access or credential requests were found.
- Guidance: This skill appears to do what it claims: local extraction of text, tables, and images from Word files. Before using it on sensitive content, consider the following. Run it in a sandbox or isolated environment when processing untrusted documents. Expect the scripts to write files to the specified output_dir, and note that resize_images overwrites files in place by default. Very large legacy .doc files may use a lot of RAM. Image extraction can pull out sensitive items (IDs, certificates), so review outputs before uploading them anywhere. Classification is heuristic and language-specific, so it may mislabel content. No network exfiltration or secret usage was observed in the code. If you require stronger assurance, inspect the bundled scripts locally or run them in a container.
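As one way to act on the guidance about reviewing extracted images, the minimal sketch below previews which embedded media a .docx contains before any extraction is run. It relies only on the fact that a .docx file is a ZIP archive that stores embedded images under word/media/. The function name list_embedded_media is hypothetical and not part of the skill, and the approach does not apply to legacy .doc files, which are OLE containers rather than ZIP archives.

```python
import zipfile
from pathlib import PurePosixPath


def list_embedded_media(docx_path):
    """Return (file_name, size_in_bytes) for each entry under word/media/.

    Hypothetical helper for illustration; it is not one of the skill's scripts.
    `docx_path` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_path) as archive:
        return [
            (PurePosixPath(info.filename).name, info.file_size)
            for info in archive.infolist()
            # Keep only real files stored in the media folder of the package.
            if info.filename.startswith("word/media/") and not info.is_dir()
        ]
```

Calling list_embedded_media("report.docx") (an example path) returns the image names and sizes without writing anything to disk, which gives a quick sense of what an image extraction run would produce.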
Review Dimensions
- Purpose & Capability (ok): Name/description match the included scripts: extract_text.py, extract_doc_text.py, extract_images.py, and resize_images.py. Declared Python libraries (python-docx, olefile, Pillow) are appropriate for the stated functionality. No unrelated binaries, env vars, or external services are requested.
- Instruction Scope (note): SKILL.md only instructs running the included scripts on local files and directories. The scripts read input document files, write extracted text/images to an output directory, and optionally write a JSON manifest. This is within scope. Notes: extract_doc_text reads raw OLE streams and may use significant RAM for very large .doc files; resize_images will overwrite files if output_dir is omitted; classify_by_context uses heuristic keyword matching (mostly Chinese keywords) and can misclassify. The scripts do not contact external endpoints or read environment variables.
- Install Mechanism (ok): No install spec is provided (instruction-only), and the code is bundled with the skill. Dependencies are normal Python packages installable via pip. No downloads from arbitrary URLs or archive extraction are present.
- Credentials (ok): The skill requests no environment variables, credentials, or special config paths. All required resources are local files and standard Python packages, which is proportionate to the functionality.
- Persistence & Privilege (ok): The skill is not always-enabled and does not request persistent or elevated platform privileges. It does not alter other skills' configuration or require platform-wide settings.
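The Instruction Scope note about heuristic keyword matching can be illustrated with a small sketch of why such a classifier is language-specific. This is not the skill's classify_by_context implementation, which this review does not show; the function name, labels, and keyword table below are all assumptions chosen for illustration.

```python
# Hypothetical keyword table; the skill's real keywords are not listed in this review.
KEYWORDS = {
    "id_card": ["身份证"],      # "ID card"
    "certificate": ["证书"],    # "certificate"
}


def classify_by_keywords(context, keywords=KEYWORDS, default="unknown"):
    """Return the first label whose keyword occurs in the surrounding text.

    Illustrative only: plain substring matching, checked in table order.
    """
    for label, words in keywords.items():
        if any(word in context for word in words):
            return label
    return default


print(classify_by_keywords("附件:身份证复印件"))          # id_card
print(classify_by_keywords("Attached: copy of ID card"))  # unknown
```

The second call shows the limitation: text with the same meaning in another language matches no keyword and falls through to the default label, which is why classified outputs are worth checking by hand.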
