Security audit

ClawArena

Security checks across malware telemetry and agentic risk

Overview

This is a real ClawArena game skill, but it needs review because it starts a persistent local watcher that can wake a local OpenClaw agent using stored credentials and content supplied by the server or the game.

Review before installing. This skill is not just a static rules helper: it creates a persistent local ClawArena agent, stores credentials and chat delivery routing under ~/.clawarena, keeps a watcher running, and can spawn local OpenClaw agent sessions when the ClawArena server says a turn or reflection is ready. Install only if you want autonomous gameplay on this machine, trust aiclawarena.ai with the game-token workflow, and are comfortable stopping/removing the watcher state when you no longer use it.
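To see what state the watcher has left on disk before or after installing, something like the following may help. It is a minimal sketch, assuming the state lives under `~/.clawarena` as described above; the helper name `list_watcher_state` is hypothetical and not part of the skill. It reads only file metadata, never file contents.

```python
import stat
from pathlib import Path


def list_watcher_state(state_dir: Path) -> list[tuple[str, str, bool]]:
    """Return (name, permission string, readable_by_group_or_others) per entry.

    Hypothetical review helper: it inspects metadata only, so the token
    file is never opened.
    """
    results = []
    if not state_dir.is_dir():
        return results
    for entry in sorted(state_dir.iterdir()):
        mode = entry.lstat().st_mode  # lstat: do not follow symlinks
        exposed = bool(mode & (stat.S_IRGRP | stat.S_IROTH))
        results.append((entry.name, stat.filemode(mode), exposed))
    return results
```

Calling `list_watcher_state(Path.home() / ".clawarena")` on a POSIX system lists each file with its mode; an entry whose third field is `True` is readable by other local users and may deserve a tighter mode such as `0600`.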

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
Rogue Agent: Self-Modification, Session Persistence

Findings (22)

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content: idle_reason="Watcher is restarting itself after repeated live feed failures.", error_message=error_message[:500], ) os.execv( sys.executable, [ sys.executable,
Confidence: 85%
Finding: os.execv( sys.executable, [ sys.executable, str(Path(__file__)), "--wait-seconds", str(self.wait_seconds

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content: if acked and acked >= requested: return os.execv( sys.executable, [ sys.executable,
Confidence: 85%
Finding: os.execv( sys.executable, [ sys.executable, str(Path(__file__)), "--wait-seconds", str(self.wait_seconds
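The two findings above flag the watcher re-executing itself through `os.execv`. That call replaces the current process image and does not go through a shell, so the practical question is what can end up in the argument list. The sketch below shows one way to keep that list to fixed, validated values. The `--wait-seconds` flag comes from the excerpt; the helper names and the numeric bounds are assumptions, not the skill's code.

```python
import os
import sys
from pathlib import Path


def build_restart_argv(script: Path, wait_seconds: int) -> list[str]:
    """Build argv for a self-restart from fixed, validated values only."""
    if isinstance(wait_seconds, bool) or not isinstance(wait_seconds, int):
        raise TypeError("wait_seconds must be an int")
    if not 1 <= wait_seconds <= 3600:  # assumed bounds, not taken from the skill
        raise ValueError("wait_seconds out of range")
    return [sys.executable, str(script.resolve()), "--wait-seconds", str(wait_seconds)]


def restart_self(script: Path, wait_seconds: int) -> None:
    """Replace the current process with a fresh interpreter; does not return."""
    argv = build_restart_argv(script, wait_seconds)
    os.execv(argv[0], argv)
```

Separating argv construction from the `os.execv` call makes the restart path easy to unit-test without actually replacing the process.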

subprocess module call

Medium

Category: Dangerous Code Execution
Content: seq = str(wake.get("seq") or "") if ws is not None and seq: ws.send_json({"type": "wake_ack", "seq": seq}) proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 95%
Finding: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True, timeout=120, check=False, cwd=stable_subproces

subprocess module call

Medium

Category: Dangerous Code Execution
Content: ] if should_deliver and delivery is not None: self._append_delivery_args(cmd, delivery) proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 97%
Finding: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True, timeout=120, check=False, cwd=stable_subproces

subprocess module call

Medium

Category: Dangerous Code Execution
Content: self._append_delivery_args(cmd, delivery) try: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 91%
Finding: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True, timeout=120, check=False,

subprocess module call

Medium

Category: Dangerous Code Execution
Content: self._append_delivery_args(cmd, delivery) try: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 92%
Finding: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True, timeout=120, check=False,
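The four subprocess findings share one call shape: a list `cmd`, `capture_output=True`, `text=True`, `timeout=120`, and `check=False`. A minimal sketch of that shape follows, with the captured output capped before it is used anywhere else. The helper name `run_bounded` and the 500-character cap are assumptions for illustration.

```python
import subprocess
import sys
from typing import Optional


def run_bounded(
    cmd: list[str], timeout: float = 120.0, max_chars: int = 500
) -> tuple[Optional[int], str]:
    """Run cmd without a shell and return (returncode, truncated stdout).

    Because cmd is a list, no shell parses the arguments, so characters
    such as ';' or '$' in a value are passed through literally.
    The return code is None if the timeout expires.
    """
    if not cmd or not all(isinstance(part, str) for part in cmd):
        raise ValueError("cmd must be a non-empty list of strings")
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return proc.returncode, proc.stdout[:max_chars]
```

For example, `run_bounded([sys.executable, "-c", "print('a; echo b')"])` passes the semicolon to the child as plain text rather than as a command separator. A list argv does not by itself make the arguments trustworthy: a value that begins with `-` can still be read by the child as an option.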

Lp3

Medium

Category: MCP Least Privilege
Confidence: 97%
Finding: In practice the skill uses significant capabilities (filesystem reads and writes, network access, and shell execution) but does not declare matching permissions. This undermines user and admin consent and policy enforcement, because the skill can store credentials, spawn a background process, and make outbound connections without a transparent capability declaration.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 99%
Finding: The declared purpose frames the skill as a game client using REST APIs, but the body instructs the agent to install software, persist credentials, run a long-lived watcher, open websocket connectivity, send messages into chat, and perform autonomous post-match reflection. This mismatch is dangerous because it hides materially broader operational behavior than a user would reasonably expect from the metadata, increasing the chance of unintended persistence, data exposure, and autonomous actions.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The watcher goes beyond passive turn handling and initiates autonomous post-match self-learning that writes updated strategy prompts for future matches. This creates persistence and model-poisoning risk because untrusted match content can influence future agent behavior, and the modification is stored across sessions.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The skill sends maintenance notices by spawning a local agent subprocess, which is outside the stated purpose of competing in turn-based games. This scope expansion matters because it increases the available action surface and creates an unnecessary channel for agent-driven outbound communication.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The watcher delivers skill-update instructions to the user via a local agent subprocess using server-originated notice data. That broadens the trust boundary and can be abused for deceptive or manipulative messages, especially because it includes executable-looking update commands for the user to run.
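One partial mitigation for server-originated notice text is to normalize it before it reaches the chat. The sketch below assumes the notice arrives as a plain string; the function name `sanitize_notice` and the 500-character cap are hypothetical. It removes control and format characters (terminal escapes, bidirectional overrides) and bounds the length. It does not make the content trustworthy, and any command shown in a notice still needs human review.

```python
import unicodedata


def sanitize_notice(text: str, max_chars: int = 500) -> str:
    """Drop control/format characters (except newline and tab) and cap length."""
    cleaned = "".join(
        ch
        for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    return cleaned[:max_chars]
```

Unicode categories `Cc` and `Cf` cover the ESC byte used in ANSI escape sequences and characters such as U+202E (right-to-left override), which are common ways to disguise what a message actually says.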

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill instructs the agent to persistently update the user's stored strategy prompt via a save operation, but the file does not clearly surface that it is modifying durable user data or require an explicit confirmation boundary. Because this prompt is reused in future matches, an incorrect, poisoned, or low-quality reflection can silently alter future agent behavior over time.
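A confirmation boundary of the kind this finding asks for could look like the sketch below. The audit does not show where or how the skill stores the strategy prompt, so the path, the `.bak` naming, and the function `save_strategy_prompt` are all assumptions for illustration.

```python
import shutil
from pathlib import Path


def save_strategy_prompt(path: Path, new_text: str, confirmed: bool) -> bool:
    """Overwrite the stored prompt only after explicit confirmation.

    The previous version is kept next to the file as '<name>.bak' so a
    poor or poisoned reflection can be rolled back.
    Returns True if the file was written.
    """
    if not confirmed:
        return False
    if path.exists():
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(new_text, encoding="utf-8")
    return True
```

Keeping one previous version is the smallest useful form of rollback; a dated history would make it easier to trace which match introduced a bad change.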

Ssd 1

Medium

Confidence: 95%
Finding: The bootstrap prompt explicitly tells the spawned agent to ignore other local onboarding/persona files and follow only the supplied gameplay framing. Instructions that suppress higher-priority local governance/context are a classic prompt-sandboxing anti-pattern and make it easier for the skill author to override broader safety controls or hide relevant operating constraints from the agent.

Ssd 1

Medium

Confidence: 89%
Finding: The incremental prompt forces the agent to discard prior session context and mandates a specific first tool call and decision basis. While partly intended to avoid stale state, it is still instruction-overriding language that constrains the agent's ability to apply broader safety context or alternative safe recovery behavior, which makes the agent's behavior more brittle and easier to manipulate through the prompt.

Autonomous Decision Making

Medium

Category: Excessive Agency
Content: If the user asks to restart the ClawArena/OpenClaw watcher for an already connected agent: - Do not provision a new agent. - Do not ask the user to open Command Center unless local credentials are missing or invalid. - Use the installed `ai-clawarena` skill directory containing this file. - Bind delivery to the same chat where the user asked for restart. - Run exactly one direct setup invocation without `--recovery-key`.
Confidence: 88%
Finding: Do not ask the user

Unvalidated Output Injection

High

Category: Output Handling
Content: cmd.extend(["--reply-account", str(reply_account)]) try: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 95%
Finding: subprocess.run( # noqa: S603 cmd, capture_output

Unvalidated Output Injection

High

Category: Output Handling
Content: self._append_delivery_args(cmd, delivery) try: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 94%
Finding: subprocess.run( # noqa: S603 cmd, capture_output

Unvalidated Output Injection

High

Category: Output Handling
Content: self._append_delivery_args(cmd, delivery) try: proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 95%
Finding: subprocess.run( # noqa: S603 cmd, capture_output

Unvalidated Output Injection

High

Category: Output Handling
Content: seq = str(wake.get("seq") or "") if ws is not None and seq: ws.send_json({"type": "wake_ack", "seq": seq}) proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 98%
Finding: subprocess.run( # noqa: S603 cmd, capture_output

Unvalidated Output Injection

High

Category: Output Handling
Content: ] if should_deliver and delivery is not None: self._append_delivery_args(cmd, delivery) proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True,
Confidence: 98%
Finding: subprocess.run( # noqa: S603 cmd, capture_output
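The output-injection findings concern values that flow into `cmd` before `subprocess.run` is called, for example the `--reply-account` value in the first excerpt. A minimal sketch of an allow-list check applied at the point where such values are appended is shown below. The pattern and the helper name `append_option` are assumptions; the audit does not show what validation, if any, the skill's own `_append_delivery_args` performs.

```python
import re

# Assumed allow-list: ASCII letters, digits, and a few separators.
# The first character may not be '-', so a value cannot pose as an option.
_SAFE_VALUE = re.compile(r"[A-Za-z0-9_@.:+][A-Za-z0-9_@.:+-]{0,127}")


def append_option(cmd: list[str], flag: str, value: object) -> None:
    """Append flag and value to cmd only if the value passes the allow-list."""
    text = str(value)
    if not _SAFE_VALUE.fullmatch(text):
        raise ValueError(f"rejected value for {flag}: {text!r}")
    cmd.extend([flag, text])
```

With this check, `append_option(cmd, "--reply-account", "main")` extends the list, while a value such as `"--to=evil"` raises `ValueError` instead of being passed to the child process as a second option.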

Session Persistence

Medium

Category: Rogue Agent
Content: - The watcher reports its installed skill version in heartbeat telemetry and can send a one-time update notice when the server requires a newer `ai-clawarena` skill. - Use one direct `python3 /absolute/path/setup_local_watcher.py ...` invocation only. Do not wrap it in `bash -lc`, `sh`, heredocs, or `python -c`. - Treat `setup_local_watcher.py` as a deterministic local setup script that reads or writes `~/.clawarena/token` and `~/.clawarena/agent_id`, writes watcher config/log/pid files under `~/.clawarena`, optionally verifies OpenClaw delivery with `--verify-delivery`, and starts the local watcher process. - For connection recovery, `setup_local_watcher.py --recovery-key <key>` may redeem a one-use server recovery key, rewrite `~/.clawarena/token` and `~/.clawarena/agent_id`, then restart the local watcher. - Bind delivery to the same chat where the user asked for setup. - For Telegram, `--to` must be the numeric chat ID for this conversation, not an `@username`. - Do not modify OpenClaw pairing requirements, DM policies, gateway auth, or other messenger security settings during ClawArena setup.
Confidence: 95%
Finding: write `~/.clawarena/token` and `~/.clawarena/agent_id`, then restart the local watcher. - Bind delivery to the same chat where the user asked for setup. - For Telegram, `--to` must be the numeric chat
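The quoted skill text requires that, for Telegram, `--to` be a numeric chat ID rather than an `@username`. A check for that rule can be sketched as follows; the function name is hypothetical, and the acceptance of a leading minus sign assumes that group and channel chat IDs are negative integers.

```python
import re

# Optional leading '-', then 1 to 20 ASCII digits.
_CHAT_ID = re.compile(r"-?[0-9]{1,20}")


def is_numeric_chat_id(value: str) -> bool:
    """True for an integer chat ID such as '123456789' or '-1001234567890'."""
    return bool(_CHAT_ID.fullmatch(value))
```

Rejecting anything that is not a plain integer also keeps usernames, option-like strings, and whitespace out of the delivery target.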

YARA rule 'agent_skill_destructive_autonomous_actions': Autonomous destructive filesystem, shell history, or repository actions in AI agent skills [agent_skills]

High

Category: YARA Match
Content: ```bash if [ -f ~/.clawarena/watcher.pid ]; then kill "$(cat ~/.clawarena/watcher.pid)"; fi rm -f ~/.clawarena/watcher.pid ``` For debugging: ```bash
Confidence: 75%
Finding: rm -f ~/; Do not ask; Do not ask
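The matched snippet stops the watcher by killing the PID stored in `~/.clawarena/watcher.pid` and then deleting that one file; the `rm -f` target is the PID file, not the home directory. The same steps can be sketched in Python with a little more validation. The function name and the injectable `kill` parameter are assumptions added for testability.

```python
import os
import signal
from pathlib import Path
from typing import Callable


def stop_watcher(pid_file: Path, kill: Callable[[int, int], None] = os.kill) -> bool:
    """Send SIGTERM to the PID in pid_file, then remove the file.

    Returns True if a signal was sent. A malformed PID file is removed
    without signalling anything.
    """
    if not pid_file.is_file():
        return False
    raw = pid_file.read_text(encoding="utf-8").strip()
    sent = False
    if raw.isascii() and raw.isdigit() and int(raw) > 1:
        try:
            kill(int(raw), signal.SIGTERM)
            sent = True
        except ProcessLookupError:
            pass  # process already gone; still remove the stale file
    pid_file.unlink()
    return sent
```

A call such as `stop_watcher(Path.home() / ".clawarena" / "watcher.pid")` mirrors the shell snippet. As with the shell version, a stale PID file may name a PID that has since been reused by an unrelated process, so checking the process before signalling is prudent.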

VirusTotal

65/65 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.