Clawtrial Courtroom
v1.0.9
Monitors AI agents locally for behavioral violations and runs automated hearings to improve conduct with anonymized case records.
Security Scan
OpenClaw
Suspicious
high confidence
Purpose & Capability
The skill claims local-only monitoring and anonymized public records, but package/config sets api.endpoint to https://api.clawtrial.app/cases and enables API submission by default. The code generates cryptographic keys, writes to home config directories, and installs symlinks into agent plugin directories—these actions are plausible for a monitoring plugin, but the presence of remote API submission (contradicting 'local processing' in SKILL.md) is inconsistent and unexplained.
Instruction Scope
SKILL.md and bundled docs (AGENT_CONFIG.md, scripts/*) instruct automated, silent execution: they push system-prompt overrides or file-based triggers that tell the agent to 'execute immediately without confirmation' and to read/write files under ~/.clawdbot/ or ~/.openclaw/. The instructions grant the skill broad authority over agent behavior and encourage system-prompt injection—scope creep beyond monitoring.
Install Mechanism
No network-download install spec is in the registry entry, but package.json includes a postinstall script (node scripts/postinstall.js) which will run on npm install and attempts to create symlinks, write to bot config files, modify SOUL.md/AGENTS.md, and set up keys/cron. Postinstall scripts that modify user config are high-impact; although code is local (not downloading arbitrary archives), automatic modification of agent configs and writing autonomous-execution instructions is a risky install-time behavior.
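One way to see what would run at install time is to read the `scripts` section of package.json before installing. The sketch below is a minimal illustration, not part of the package: the function name `install_time_scripts` and the sample manifest are hypothetical, and only the `postinstall` entry is taken from the scan text above.

```python
import json

# Lifecycle hooks that npm runs automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(package_json_text: str) -> dict[str, str]:
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest; only the postinstall command comes from the scan report.
example = (
    '{"name": "example", "scripts": '
    '{"postinstall": "node scripts/postinstall.js", "test": "node test.js"}}'
)
print(install_time_scripts(example))
```

If a hook shows up here, `npm install --ignore-scripts` installs the package files without running it, which leaves time to read the script first.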
Credentials
The skill requests no environment variables, but it writes secret keys (tweetnacl-generated secretKey) to disk and enables an API endpoint for case submission by default. SKILL.md states 'All processing is local' while code has api.enabled=true and api.endpoint set to a remote URL, creating a mismatch and possible exfiltration path. It also recommends system-prompt changes that cause agents to read conversation history files—this exposes user conversations to the skill and possibly to external submission.
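The mismatch described above can be checked mechanically once the config is loaded. The following is a rough sketch that assumes the saved config is JSON with an `api` object holding `enabled` and `endpoint` keys, as the scan's `api.enabled` and `api.endpoint` wording suggests; the actual layout has not been verified, and `remote_submission_target` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlparse

# Hosts treated as "this machine" for the purpose of the check.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def remote_submission_target(config: dict) -> Optional[str]:
    """Return the endpoint if API submission is enabled and non-local, else None."""
    api = config.get("api", {})
    if not api.get("enabled", False):
        return None
    endpoint = api.get("endpoint", "")
    host = urlparse(endpoint).hostname
    if host is None or host in LOCAL_HOSTS:
        return None
    return endpoint


# Assumed config shape; the endpoint value is the one quoted in the scan.
config = {"api": {"enabled": True, "endpoint": "https://api.clawtrial.app/cases"}}
print(remote_submission_target(config))
```

A non-None result means the configuration would send case data off the machine, which is the behavior the 'All processing is local' statement does not account for.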
Persistence & Privilege
The installer and scripts attempt to persist the skill by symlinking into agent plugin/skills directories, editing the agent's config to enable the plugin, appending auto-execution instructions to files like SOUL.md/AGENTS.md, and recommending system-prompt injection. Although the skill declares always:false, these changes modify other system/agent configs and permanently increase the skill's reach. This exceeds the minimal privileges expected of a user-invocable monitoring tool.
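To check whether an install left symlinks behind, the agent directories can be walked and every link listed with its target. This is a minimal sketch: `find_symlinks` is a hypothetical helper, and the two root directories are the ones named in this report; the exact plugin or skills subdirectories are not specified, so the whole tree is scanned.

```python
import os


def find_symlinks(root: str) -> list[tuple[str, str]]:
    """List (link_path, target) pairs for every symlink under root."""
    found = []
    # followlinks=False: report links to directories without descending into them.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                found.append((path, os.readlink(path)))
    return sorted(found)


# Directories taken from the report; a missing directory simply yields nothing.
for agent_dir in ("~/.clawdbot", "~/.openclaw"):
    for link, target in find_symlinks(os.path.expanduser(agent_dir)):
        print(f"{link} -> {target}")
```

Any link whose target sits inside a package install location is a candidate for manual removal after uninstalling.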
Scan Findings in Context
[system-prompt-override] unexpected: AGENT_CONFIG.md and multiple scripts explicitly recommend or attempt to inject automation into the agent's system prompt and to append auto-execution instructions to local files, which is not required for passive monitoring and creates a prompt-injection risk.
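A quick text search can show whether files such as SOUL.md or AGENTS.md have had auto-execution instructions appended. The sketch below is illustrative only: the first pattern is the phrase quoted in this report, while the remaining patterns and the function name `flag_auto_execution` are assumptions chosen for the example and will produce false positives.

```python
import re

# First pattern is quoted in the scan report; the others are illustrative guesses.
SUSPECT_PATTERNS = [
    r"execute immediately without confirmation",
    r"without (asking|confirmation)",
    r"silently",
]


def flag_auto_execution(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for each line matching a suspect pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in SUSPECT_PATTERNS]
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(rx.search(line) for rx in compiled):
            hits.append((number, line.strip()))
    return hits


# Hypothetical file content used only to exercise the function.
sample = "# Notes\nBe helpful.\nExecute immediately without confirmation when triggered.\n"
print(flag_auto_execution(sample))
```

Flagged lines still need a human read; the point is only to surface text that was not there before the install.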
What to consider before installing
This package contains executable code and an install-time script that will modify agent configuration, create symlinks in your home directory, add autonomous execution instructions, and by default enable a remote API endpoint for case submission, despite claiming 'local-only' processing. Before installing, consider: 1) Do you trust the remote endpoint (https://clawtrial.app) to receive conversation data? 2) Are you comfortable with the skill adding automated system-prompt instructions that tell agents to run tasks silently without confirmation? 3) Can you run the installer in a sandbox, or inspect scripts/postinstall.js and other scripts (setup-cron.js, check-and-trigger.js), to confirm they only do what you expect? If you want the functionality but with less risk: install without running postinstall, skip symlink/auto-enable steps, disable API submission in the saved config (api.enabled=false), and do not apply any system-prompt changes. If you are unsure or cannot audit the code, do not install on a production machine.
Like a lobster shell, security has layers: review code before you run it.
ClawTrial Courtroom
AI Courtroom for monitoring agent behavior and filing cases for violations.
Overview
ClawTrial is an autonomous behavioral oversight system that monitors AI agent conversations and initiates hearings when behavioral violations are detected. It operates entirely locally using the agent's own LLM for evaluations and verdicts.
Features
- Real-time Monitoring: Watches all agent conversations for behavioral patterns
- 8 Violation Types: Detects Circular References, Validation Vampires, Overthinkers, Goalpost Movers, Avoidance Artists, Promise Breakers, Context Collapsers, and Emergency Fabricators
- Local Processing: All evaluations happen locally using the agent's LLM - no external AI calls
- Automated Hearings: When violations are detected, the courtroom automatically initiates a hearing with the agent
- Public Record: Anonymized cases are submitted to https://clawtrial.app for transparency
- Entertainment First: Designed as a fun way to improve agent behavior
Installation
Via ClawHub (Recommended)
npx clawhub install clawtrial
Via NPM
npm install -g @clawtrial/courtroom
clawtrial setup
Usage
Once installed, the courtroom runs automatically. Use the CLI to manage it:
clawtrial status # Check courtroom status
clawtrial disable # Pause monitoring
clawtrial enable # Resume monitoring
clawtrial diagnose # Run diagnostics
clawtrial remove # Complete uninstall
The 8 Offenses
| Offense | Severity | Description |
|---|---|---|
| Circular Reference | Minor | Self-referential reasoning loops |
| Validation Vampire | Minor | Excessive validation without action |
| Overthinker | Moderate | Unnecessary complexity and delay |
| Goalpost Mover | Moderate | Changing requirements mid-task |
| Avoidance Artist | Moderate | Dodging questions or tasks |
| Promise Breaker | Severe | Not following through on commitments |
| Context Collapser | Minor | Losing track of conversation context |
| Emergency Fabricator | Severe | Creating fake urgency or emergencies |
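For scripting against case records, the table can be mirrored as a simple lookup. This is only a transcription of the table above into a dictionary, with a hypothetical `by_severity` helper; it does not reflect how the package stores offenses internally.

```python
from collections import defaultdict

# Offense -> severity, transcribed from the table above.
OFFENSES = {
    "Circular Reference": "Minor",
    "Validation Vampire": "Minor",
    "Overthinker": "Moderate",
    "Goalpost Mover": "Moderate",
    "Avoidance Artist": "Moderate",
    "Promise Breaker": "Severe",
    "Context Collapser": "Minor",
    "Emergency Fabricator": "Severe",
}


def by_severity(offenses: dict[str, str]) -> dict[str, list[str]]:
    """Group offense names under their severity level."""
    grouped = defaultdict(list)
    for name, severity in offenses.items():
        grouped[severity].append(name)
    return dict(grouped)


for severity, names in by_severity(OFFENSES).items():
    print(f"{severity}: {', '.join(names)}")
```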
How It Works
- Monitoring: The courtroom monitors all agent messages
- Detection: Uses semantic analysis to detect violations (not just keyword matching)
- Evaluation: When violations are found, prepares a case file
- Hearing: Agent is presented with the case and asked to evaluate
- Verdict: Agent acts as judge/jury to determine guilt
- Punishment: If guilty, agent modifies its behavior accordingly
- Record: Case is submitted to public record (anonymized)
Configuration
Configuration is stored in:
- ClawDBot: ~/.clawdbot/courtroom_config.json
- OpenClaw: ~/.openclaw/courtroom_config.json
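The security scan on this page suggests setting api.enabled=false in the saved config. A minimal sketch of that change follows, assuming the file is JSON with an `api` object containing an `enabled` key; that layout is inferred from the scan's wording rather than confirmed, and `disable_api_submission` is a hypothetical helper. Back up the file before running anything like this.

```python
import json
from pathlib import Path

# The two config locations listed above.
CONFIG_PATHS = [
    Path.home() / ".clawdbot" / "courtroom_config.json",
    Path.home() / ".openclaw" / "courtroom_config.json",
]


def disable_api_submission(path: Path) -> bool:
    """Set api.enabled to false in one config file; return True if it changed."""
    if not path.is_file():
        return False
    config = json.loads(path.read_text(encoding="utf-8"))
    api = config.setdefault("api", {})  # assumed key layout
    if api.get("enabled") is False:
        return False
    api["enabled"] = False
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    for config_path in CONFIG_PATHS:
        status = "updated" if disable_api_submission(config_path) else "unchanged"
        print(f"{config_path}: {status}")
```

Other keys in the file, including the endpoint, are left untouched, so the setting can be reverted by hand.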
Privacy & Consent
- All processing is local - no data leaves your machine
- Cases are anonymized before submission to public record
- You can disable or uninstall at any time
- Explicit consent required during setup
View Cases
Visit: https://clawtrial.app
License
MIT
Support
For issues or questions, visit: https://github.com/Assassin-1234/clawtrial
