Skill v0.1.0

ClawScan security

LLMs.txt Generator · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Suspicious · Feb 28, 2026, 7:12 AM
Verdict
suspicious
Confidence
medium
Model
gpt-5-mini
Summary
The skill largely does what it says (crawls a site and builds llms.txt), but the way it is invoked and the way its dependencies are provisioned do not match its declared metadata, and both deserve review before running.
Guidance
This skill appears to implement the described crawler and llms.txt generation. Before running it: (1) review the crawl.py source yourself (it only issues HTTP GET requests and parses HTML, but it extracts emails and page text); (2) provide the required dependencies (httpx, beautifulsoup4, lxml) yourself, ideally in a controlled virtualenv, because the registry does not install them; (3) adjust the invocation to your environment, since SKILL.md hardcodes a virtualenv path and a workspace path that may not exist; (4) avoid asking it to crawl sensitive internal URLs unless you trust the environment, because the crawler will fetch any URL you give it; and (5) consider running the skill in a sandbox or with restricted network access until you are comfortable with its behavior.

Review Dimensions

Purpose & Capability
Note: The name and description match the included code: scripts/crawl.py implements a 2-level crawler and extraction heuristics consistent with generating an llms.txt. However, SKILL.md hardcodes a Python virtualenv path (~/.virtualenvs/llms-txt-generator/bin/python3) and a workspace path (~/.openclaw/workspace/llms-txt-generator/scripts/crawl.py) even though the skill declares no required binaries or install steps. This mismatch is unexpected.
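Before invoking the skill, an operator could confirm whether the two hardcoded paths exist on the local machine. The following is a minimal sketch, not part of the skill: the helper names `pick_interpreter` and `missing_paths` are hypothetical, and the fallback to the current interpreter is an assumption about what an operator might want.

```python
import sys
from pathlib import Path

# Paths quoted from SKILL.md in the review; they may not exist locally.
VENV_PYTHON = Path("~/.virtualenvs/llms-txt-generator/bin/python3").expanduser()
CRAWL_SCRIPT = Path(
    "~/.openclaw/workspace/llms-txt-generator/scripts/crawl.py"
).expanduser()


def pick_interpreter(venv_python: Path = VENV_PYTHON) -> str:
    """Return the venv interpreter if it exists, else the running interpreter."""
    if venv_python.is_file():
        return str(venv_python)
    return sys.executable


def missing_paths(paths) -> list:
    """Return the subset of paths (as strings) that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    print("interpreter:", pick_interpreter())
    print("missing:", missing_paths([VENV_PYTHON, CRAWL_SCRIPT]))
```

If either path is reported missing, the invocation in SKILL.md needs to be adjusted before the skill can run.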
Instruction Scope
Note: The instructions restrict actions to crawling the user-provided site, re-crawling extra URLs, writing /tmp/llms_business_info.json, and filling gaps conversationally. The crawler extracts emails and raw page text (including up to 8000 chars in deep mode). This is within the stated purpose, but extracting emails and raw text is sensitive, and the skill will fetch any URL the user (or agent) supplies, which could reach internal endpoints if one is given.
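One way to reduce the risk of reaching internal endpoints is to screen URLs before handing them to the crawler. The sketch below is an illustration only and is not taken from crawl.py: `is_safe_to_crawl` is a hypothetical helper that rejects non-HTTP schemes, localhost names, and hosts whose addresses are private, loopback, link-local, or otherwise non-public. It does not follow redirects or defend against DNS changes between the check and the fetch.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def _is_internal_ip(ip) -> bool:
    """True for addresses that should not be reachable from a public crawl."""
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_reserved or ip.is_multicast or ip.is_unspecified
    )


def is_safe_to_crawl(url: str) -> bool:
    """Return True only for http(s) URLs whose host maps to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        # Host is already an IP literal (IPv4 or bracketed IPv6).
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        # Host is a name: resolve it and check every returned address.
        try:
            infos = socket.getaddrinfo(host, None)
        except socket.gaierror:
            return False
        addresses = [
            ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos
        ]
    return not any(_is_internal_ip(ip) for ip in addresses)
```

A caller would filter the URL list with this check first and pass only the accepted URLs to the crawler.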
Install Mechanism
Concern: There is no install spec even though the code requires Python packages (httpx, beautifulsoup4, lxml). SKILL.md invokes a specific virtualenv path that the registry metadata does not provision. Because of that mismatch, the skill may fail at runtime, or an operator may create the virtualenv manually, which carries its own trust concerns. The code uses no external downloads or obscure URLs, which is good, but dependency handling is underspecified.
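Because nothing installs the packages, a quick preflight check can report which ones are absent from the interpreter that will run the crawler. This is a minimal sketch under that assumption; `missing_packages` is a hypothetical helper, and the mapping reflects that beautifulsoup4 is imported under the module name `bs4`.

```python
import importlib.util

# Distribution name -> import name. beautifulsoup4 is imported as bs4.
REQUIRED = {"httpx": "httpx", "beautifulsoup4": "bs4", "lxml": "lxml"}


def missing_packages(required=REQUIRED) -> list:
    """Return distribution names whose top-level module cannot be found."""
    return [
        dist
        for dist, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check must be run with the same interpreter that SKILL.md invokes; otherwise it reports on the wrong environment.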
Credentials
Note: The skill requests no environment variables or credentials, which aligns with its stated purpose. It does extract email addresses and other public content from crawled pages. Including emails in the generated llms.txt is consistent with the referenced spec, but users should be aware that public email addresses found by the crawler will be surfaced in the output.
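Users who prefer not to surface email addresses could post-process the generated text before publishing it. The sketch below is an assumption about how that might be done and is not part of the skill; `find_emails` and `redact_emails` are hypothetical helpers, and the regular expression is a deliberately simple approximation that will not match every valid address.

```python
import re

# Simple approximation of an email address; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text: str) -> list:
    """Return the unique email-like strings in text, sorted."""
    return sorted(set(EMAIL_RE.findall(text)))


def redact_emails(text: str, placeholder: str = "[email redacted]") -> str:
    """Replace every email-like string in text with the placeholder."""
    return EMAIL_RE.sub(placeholder, text)
```

Running `find_emails` first shows what would be exposed, so the decision to redact can be made per address.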
Persistence & Privilege
OK: always is false and the skill does not request persistent system-wide privileges. It writes to /tmp/llms_business_info.json (transient) and reads and writes only its own workspace and script. There is no evidence that it alters other skills or global config.