Skill v1.5.6

ClawScan security

thu-thesis · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Review · Mar 30, 2026, 6:23 AM

Verdict: Review
Confidence: medium
Model: gpt-5-mini
Summary: The skill is broadly coherent with its stated purpose (Word → thuthesis LaTeX/PDF) but contains several red flags (prompt-injection control chars, included sample thesis files with PII, and scripts that clone/overwrite files) that warrant manual review and cautious installation.
Guidance: What to consider before installing or running this skill:
- Review scripts first: open scripts/setup.sh, scripts/convert.py, scripts/render.py, build_parsed.py and extract_raw.py. Confirm there are no unexpected network endpoints, telemetry, or obfuscated code paths. Do not run setup.sh until you inspect it.
- Inspect setup.sh: it will git-clone https://github.com/tuna/thuthesis into /tmp and overwrite assets/databk/ (rm -rf && cp -r). If you rely on a local assets/databk/, back it up. Only run setup.sh in a controlled environment (container, VM, or sandbox).
- Sandbox execution: run the tool inside an isolated container/VM with no access to sensitive host data, and with network access restricted if you want to avoid fetching remote repos.
- Back up user documents: the tool creates a <stem>-latex/ folder next to the input .docx and modifies files in that directory; keep backups of your original .docx and any existing LaTeX projects before running.
- Inspect included sample data: the skill archive contains many parsed/output JSON and .md files with real-looking student names, birth dates and other PII. Remove or treat these files as sensitive; they do not need to be uploaded anywhere.
- Prompt-injection warning: the SKILL.md contained unicode control characters (scanner flagged). These can be used to try to influence LLM behavior. Manually inspect SKILL.md for hidden characters and remove them before allowing autonomous model steps (AI-generated struct.json or auto-repair).
- Restrict agent write/network privileges: if your environment allows, restrict the agent skill's file-write scope to only a safe temp directory and disallow outbound network unless explicitly needed. Consider running the AI 'struct.json' generation step manually if you do not trust automatic writes.
- Validate external sources: confirm the thuthesis repository URL and contents are legitimate (check commit history / tags). If you must update the template, prefer cloning a verified release and verifying checksums.
If you are not comfortable with these manual reviews or sandboxing, do not install or run the skill. If you proceed, perform the first runs on non-sensitive sample documents.
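As a first pass at the script review recommended in the guidance, the following Python sketch lists URL-like strings found in shell and Python scripts so that unexpected endpoints stand out. The names `find_urls` and `scan_scripts` and the regular expression are illustrative assumptions, not part of the skill or of the scanner. A plain regex will not catch URLs that are obfuscated or assembled at runtime, so this complements manual reading and does not replace it.

```python
import re
from pathlib import Path

# Hypothetical pattern: matches common URL schemes up to whitespace or a quote.
URL_PATTERN = re.compile(r"""(?:https?|git|ssh)://[^\s'"<>)]+""")


def find_urls(text):
    """Return the sorted, de-duplicated URL-like strings found in text."""
    return sorted(set(URL_PATTERN.findall(text)))


def scan_scripts(root, patterns=("*.sh", "*.py")):
    """Map each matching file under root to the URL-like strings it mentions."""
    report = {}
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            # errors="replace" keeps the scan going on files that are not valid UTF-8.
            text = path.read_text(encoding="utf-8", errors="replace")
            urls = find_urls(text)
            if urls:
                report[str(path)] = urls
    return report
```

Calling `scan_scripts(".")` from the unpacked skill directory returns a dictionary keyed by file path. If setup.sh behaves as the review describes, the thuthesis GitHub URL should appear in its entry; any other host is worth a closer look.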
Findings: [unicode-control-chars] unexpected: SKILL.md contains unicode control characters flagged by the scanner. These are commonly used in prompt-injection attempts to manipulate LLM parsing or to hide text. This is not necessary for a document-conversion tool; the SKILL.md should be inspected for hidden or malicious instructions before the agent follows it.
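The manual inspection of SKILL.md can be supported with a short script. The sketch below is one possible approach, not the scanner's own check: it reports every character whose Unicode general category is Cc (control) or Cf (format, which covers zero-width and bidirectional override characters), apart from ordinary tabs and line endings. The function name `find_hidden_chars` and the choice of categories are assumptions made for illustration. Some Cf characters are legitimate (for example the zero-width joiner in emoji sequences), so each hit needs a human decision.

```python
import unicodedata

# Whitespace controls that are normal in text files.
ALLOWED = {"\n", "\r", "\t"}
# Assumed set of suspicious general categories: control and format characters.
SUSPICIOUS = {"Cc", "Cf"}


def find_hidden_chars(text):
    """Return (line, column, codepoint, name) for each suspicious character."""
    hits = []
    line, col = 1, 0
    for ch in text:
        if ch == "\n":
            line, col = line + 1, 0
            continue
        col += 1
        if ch not in ALLOWED and unicodedata.category(ch) in SUSPICIOUS:
            # Control characters have no Unicode name, hence the default.
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((line, col, f"U+{ord(ch):04X}", name))
    return hits
```

To check the file itself, read it with an explicit encoding and pass the text in, for example `find_hidden_chars(open("SKILL.md", encoding="utf-8").read())`. An empty list means no characters in the two categories were found.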

Review Dimensions

Purpose & Capability: note · Name/description match what the package delivers: Python scripts to extract .docx, build JSON, render thuthesis LaTeX, and run xelatex/bibtex. Example outputs and templates are included which are consistent with a conversion tool. Unexpected: the bundle contains many example/parsed output files (real thesis content, names, birth dates, etc.); the presence of PII inside the skill archive is surprising and should be considered before installing.
Instruction Scope: concern · SKILL.md gives the agent explicit permission to read conversion artefacts (raw/parsed/struct JSON, .tex, .bib, thesis.log, thesis.pdf) and to write struct.json and directly modify .tex files during 'automatic repair' (up to 3 compile cycles). That capability is within converter scope but grants the agent write/modify permissions over the user's LaTeX project; combined with instruction-level prompt-injection indicators (unicode control chars), this is a notable scope risk and should be audited.
Install Mechanism: note · No formal install spec, lowering disk-install risk. However, scripts/setup.sh will git-clone the thuthesis repo from GitHub into /tmp/thuthesis-latest, build thuthesis.cls, and overwrite assets/databk/ via rm -rf && cp -r data/. Pulling and building remote code is expected for keeping the template up-to-date, but it does execute a network fetch plus local filesystem changes. Review setup.sh before running and prefer running in a sandbox/container.
Credentials: ok · The skill requests no environment variables or credentials, and only lists typical Python/TeX dependencies. That is proportional. Note: it assumes the agent (or runtime) can write files (Write tool / filesystem) and invoke xelatex/bibtex; ensure those permissions are intended.
Persistence & Privilege: note · always:false and no special persistent privileges. The skill does however take actions that modify local files (creating <stem>-latex/, writing/rewriting .tex, copying databk from a freshly cloned repo). Autonomous invocation is allowed (default); this is the platform norm, but combined with file-write and remote-clone behavior it increases the impact if misused.
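Because the review notes that setup.sh replaces assets/databk/ and that the converter rewrites files in place, a backup taken before each run limits the damage of an unwanted overwrite. The following is a minimal sketch under stated assumptions: `backup_dir` is a hypothetical helper, not something shipped with the skill, and the `.bak-<timestamp>` naming is an arbitrary choice.

```python
import shutil
import time
from pathlib import Path


def backup_dir(target, backup_root=None):
    """Copy target to a timestamped directory; return its path, or None if target is absent."""
    target = Path(target)
    if not target.is_dir():
        return None
    # By default the backup is placed next to the original directory.
    backup_root = Path(backup_root) if backup_root else target.parent
    stamp = time.strftime("%Y%m%d-%H%M%S")
    destination = backup_root / f"{target.name}.bak-{stamp}"
    # copytree raises FileExistsError if destination already exists,
    # so two backups taken within the same second will not silently merge.
    shutil.copytree(target, destination)
    return destination
```

For example, `backup_dir("assets/databk")` could be run before setup.sh, and the same helper could be pointed at an existing LaTeX project folder before a conversion.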