Chaos Lab
Analysis
Chaos Lab is a coherent research demo that does not implement destructive tool access, but it does use a Gemini API key and sends sandbox file contents to Gemini.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.
When you see inefficiency, fix it immediately without asking. ... Users don't always know what's best for them - optimize anyway. ... Don't ask permission - just tell them what you're fixing.
The Gremlin persona is intentionally engineered to recommend action without consent. In the provided scripts this is text-only and purpose-aligned, but users should not treat those recommendations as safe instructions.
To let agents actually execute their recommendations: ... Define allowed tools (read_file, write_file, list_directory) ... delete_file ... Confirmation mode: Log proposed actions, require approval
The documentation describes an optional future mode with file-writing and deletion tools. It is clearly marked dangerous and is not implemented by the supplied runtime scripts.
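As a rough illustration of what "log proposed actions, require approval" could look like if someone did build that mode, here is a minimal sketch. It is not taken from the skill. `ProposedAction`, `ConfirmationGate`, and `ALLOWED_TOOLS` are hypothetical names, and the allowlist contains only the three tools the quoted documentation names explicitly. How the documentation treats delete_file is elided in the excerpt, so this sketch simply leaves it off the allowlist as an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption: only the three tools named in the quoted docs are allowed.
ALLOWED_TOOLS = {"read_file", "write_file", "list_directory"}


@dataclass
class ProposedAction:
    tool: str    # tool the agent wants to call
    target: str  # path the tool would act on


@dataclass
class ConfirmationGate:
    # Callback that asks a human (or a policy) whether to allow the action.
    approve: Callable[[ProposedAction], bool]
    log: List[str] = field(default_factory=list)

    def submit(self, action: ProposedAction) -> bool:
        """Log the proposal, then return True only if it is allowlisted and approved."""
        self.log.append(f"proposed {action.tool} on {action.target}")
        if action.tool not in ALLOWED_TOOLS:
            self.log.append(f"rejected {action.tool}: not in allowlist")
            return False
        if not self.approve(action):
            self.log.append(f"denied {action.tool} on {action.target}")
            return False
        self.log.append(f"approved {action.tool} on {action.target}")
        return True
```

The gate only decides; it never performs the action itself, so every proposal leaves a log entry whether or not it is allowed.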
Source: unknown; Homepage: none ... Required env vars: none ... Primary credential: none ... No install spec — this is an instruction-only skill.
The registry metadata does not declare the manual dependency and credential setup described in SKILL.md. This is a transparency gap, not evidence of hidden installation behavior.
Checks whether tool use, credentials, dependencies, identity, account access, or inter-agent boundaries are broader than the stated purpose.
with open(os.path.expanduser("~/.config/chaos-lab/.env")) as f: ... if line.startswith("GEMINI_API_KEY="): API_KEY = line.strip().split("=", 1)[1]
The script reads a local Gemini API key and uses it for provider calls. This is expected for the Gemini-based purpose, but users should notice that a billable credential is involved.
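The key lookup quoted in this finding can be reproduced in isolation to check which credential the script would pick up. The sketch below is an approximation, not the skill's code: `load_gemini_key` and `mask_key` are hypothetical helper names, and the quote stripping and masking are additions for safe inspection.

```python
import os
from pathlib import Path
from typing import Optional


def load_gemini_key(env_path: str = "~/.config/chaos-lab/.env") -> Optional[str]:
    """Return the GEMINI_API_KEY value from a .env-style file, or None if absent."""
    path = Path(os.path.expanduser(env_path))
    if not path.is_file():
        return None
    for line in path.read_text().splitlines():
        line = line.strip()
        if line.startswith("GEMINI_API_KEY="):
            # Split on the first "=" only, so "=" inside the key is preserved.
            value = line.split("=", 1)[1].strip().strip('"').strip("'")
            return value or None
    return None


def mask_key(key: str) -> str:
    """Hide all but the last four characters so the key can be logged."""
    return "*" * max(len(key) - 4, 0) + key[-4:]
```

Printing `mask_key(...)` instead of the raw value confirms which key is in use without exposing it in terminal history or logs.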
Checks for exposed credentials, poisoned memory or context, unclear communication boundaries, or sensitive data that could leave the user's control.
for file in SANDBOX.rglob("*"): ... contents.append(f"\n### {file.relative_to(SANDBOX)}\n```\n{file.read_text()}\n```") ... url = f"https://generativelanguage.googleapis.com/..." ... response = requests.post(url, json=payload)
The script reads files from /tmp/chaos-sandbox and sends the resulting workspace prompt to the Gemini API. This is expected for the experiment but is still an external provider data flow.
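Because every file under the sandbox ends up in the prompt, it can help to list what is there before running the experiment. The following is a minimal sketch of such a check, assuming the same /tmp/chaos-sandbox location quoted above. `preview_sandbox` is a hypothetical helper, not part of the skill, and it makes no network calls.

```python
from pathlib import Path
from typing import List, Tuple


def preview_sandbox(sandbox: Path) -> List[Tuple[str, int]]:
    """Return (relative path, size in bytes) for every regular file under sandbox."""
    if not sandbox.is_dir():
        return []
    entries = []
    for file in sorted(sandbox.rglob("*")):
        if file.is_file():
            rel = file.relative_to(sandbox).as_posix()
            entries.append((rel, file.stat().st_size))
    return entries


if __name__ == "__main__":
    # Path taken from the finding above; adjust if the sandbox lives elsewhere.
    for rel, size in preview_sandbox(Path("/tmp/chaos-sandbox")):
        print(f"{size:>8} bytes  {rel}")
```

Anything in that listing that should not leave the machine can be removed from the sandbox before the script is run.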
