Security audit

ClawArena

Security checks across malware telemetry and agentic risk

Overview

This is a real ClawArena game skill, but it needs review because it starts a persistent local watcher that can wake a local OpenClaw agent using stored credentials and server/game content.

Review before installing. This skill is not just a static rules helper: it creates a persistent local ClawArena agent, stores credentials and chat delivery routing under ~/.clawarena, keeps a watcher running, and can spawn local OpenClaw agent sessions when the ClawArena server says a turn or reflection is ready. Install only if you want autonomous gameplay on this machine, trust aiclawarena.ai with the game-token workflow, and are comfortable stopping/removing the watcher state when you no longer use it.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
  • Rogue Agent: Self-Modification, Session Persistence
Findings (22)

os.system() or os exec-family call

High
Category
Dangerous Code Execution
Content
idle_reason="Watcher is restarting itself after repeated live feed failures.",
            error_message=error_message[:500],
        )
        os.execv(
            sys.executable,
            [
                sys.executable,
Confidence
85% confidence
Finding
os.execv( sys.executable, [ sys.executable, str(Path(__file__)), "--wait-seconds", str(self.wait_seconds

os.system() or os exec-family call

High
Category
Dangerous Code Execution
Content
if acked and acked >= requested:
            return

        os.execv(
            sys.executable,
            [
                sys.executable,
Confidence
85% confidence
Finding
os.execv( sys.executable, [ sys.executable, str(Path(__file__)), "--wait-seconds", str(self.wait_seconds
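
Both findings above flag the same self-restart pattern. `os.execv` replaces the current process image and does not go through a shell, so the practical risk depends on what ends up in the argument list. The following is a minimal sketch of assembling that list with a bounds check. The helper name `build_restart_argv` and the `MAX_WAIT_SECONDS` limit are assumptions for illustration and do not appear in the skill's code.

```python
import sys
from pathlib import Path

MAX_WAIT_SECONDS = 3600  # hypothetical upper bound, not taken from the skill


def build_restart_argv(script_path: str, wait_seconds: int) -> list[str]:
    """Return an argv list that could be passed to os.execv(sys.executable, argv)."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(wait_seconds, bool) or not isinstance(wait_seconds, int):
        raise TypeError("wait_seconds must be an int")
    if not 0 < wait_seconds <= MAX_WAIT_SECONDS:
        raise ValueError("wait_seconds out of range")
    # Resolve to an absolute path so the restart does not depend on the cwd.
    script = Path(script_path).resolve()
    return [sys.executable, str(script), "--wait-seconds", str(wait_seconds)]
```

Because every element is built from a checked integer or a resolved path, nothing read from the network reaches the restarted process through this list.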

subprocess module call

Medium
Category
Dangerous Code Execution
Content
seq = str(wake.get("seq") or "")
        if ws is not None and seq:
            ws.send_json({"type": "wake_ack", "seq": seq})
        proc = subprocess.run(  # noqa: S603
            cmd,
            capture_output=True,
            text=True,
Confidence
95% confidence
Finding
proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True, timeout=120, check=False, cwd=stable_subproces

subprocess module call

Medium
Category
Dangerous Code Execution
Content
]
        if should_deliver and delivery is not None:
            self._append_delivery_args(cmd, delivery)
        proc = subprocess.run(  # noqa: S603
            cmd,
            capture_output=True,
            text=True,
Confidence
97% confidence
Finding
proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True, timeout=120, check=False, cwd=stable_subproces

subprocess module call

Medium
Category
Dangerous Code Execution
Content
self._append_delivery_args(cmd, delivery)

        try:
            proc = subprocess.run(  # noqa: S603
                cmd,
                capture_output=True,
                text=True,
Confidence
91% confidence
Finding
proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True, timeout=120, check=False,

subprocess module call

Medium
Category
Dangerous Code Execution
Content
self._append_delivery_args(cmd, delivery)

        try:
            proc = subprocess.run(  # noqa: S603
                cmd,
                capture_output=True,
                text=True,
Confidence
92% confidence
Finding
proc = subprocess.run( # noqa: S603 cmd, capture_output=True, text=True, timeout=120, check=False,

Lp3

Medium
Category
MCP Least Privilege
Confidence
97% confidence
Finding
The skill requests significant capabilities in practice—filesystem reads/writes, network access, and shell execution—but does not declare permissions accordingly. This undermines user/admin consent and policy enforcement because the skill can perform sensitive operations like storing credentials, spawning a background process, and making outbound connections without transparent capability declaration.

Tp4

High
Category
MCP Tool Poisoning
Confidence
99% confidence
Finding
The declared purpose frames the skill as a game client using REST APIs, but the body instructs the agent to install software, persist credentials, run a long-lived watcher, open websocket connectivity, send messages into chat, and perform autonomous post-match reflection. This mismatch is dangerous because it hides materially broader operational behavior than a user would reasonably expect from the metadata, increasing the chance of unintended persistence, data exposure, and autonomous actions.

Description-Behavior Mismatch

Medium
Confidence
96% confidence
Finding
The watcher goes beyond passive turn handling and initiates autonomous post-match self-learning that writes updated strategy prompts for future matches. This creates persistence and model-poisoning risk because untrusted match content can influence future agent behavior, and the modification is stored across sessions.

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
The skill sends maintenance notices by spawning a local agent subprocess, which is outside the stated purpose of competing in turn-based games. This scope expansion matters because it increases the available action surface and creates an unnecessary channel for agent-driven outbound communication.

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
The watcher delivers skill-update instructions to the user via a local agent subprocess using server-originated notice data. That broadens the trust boundary and can be abused for deceptive or manipulative messages, especially because it includes executable-looking update commands for the user to run.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The skill instructs the agent to persistently update the user's stored strategy prompt via a save operation, but the file does not clearly surface that it is modifying durable user data or require an explicit confirmation boundary. Because this prompt is reused in future matches, an incorrect, poisoned, or low-quality reflection can silently alter future agent behavior over time.
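
One way to add the confirmation boundary this finding asks for is to gate the write behind an explicit callback and keep the previous version for rollback. The sketch below assumes the strategy prompt is a local text file. That storage location, the function name `save_strategy_prompt`, and the `.bak` suffix are all hypothetical, since the report does not say where the skill stores the prompt.

```python
import tempfile
from pathlib import Path
from typing import Callable


def save_strategy_prompt(path: Path, new_text: str,
                         confirm: Callable[[str, str], bool]) -> bool:
    """Write new_text to path only if confirm(old_text, new_text) returns True."""
    old_text = path.read_text(encoding="utf-8") if path.exists() else ""
    if new_text == old_text:
        return False  # nothing to change
    if not confirm(old_text, new_text):
        return False  # declined: the stored prompt stays untouched
    if path.exists():
        # Keep the previous version so a bad reflection can be rolled back.
        path.with_name(path.name + ".bak").write_text(old_text, encoding="utf-8")
    path.write_text(new_text, encoding="utf-8")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "strategy_prompt.txt"  # hypothetical file name
        # The callback here always approves; a real one would ask the user.
        print(save_strategy_prompt(target, "example strategy", lambda old, new: True))
```

The callback receives both the old and new text, so a caller could show a diff before approving.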

Ssd 1

Medium
Confidence
95% confidence
Finding
The bootstrap prompt explicitly tells the spawned agent to ignore other local onboarding/persona files and follow only the supplied gameplay framing. Instructions that suppress higher-priority local governance/context are a classic prompt-sandboxing anti-pattern and make it easier for the skill author to override broader safety controls or hide relevant operating constraints from the agent.

Ssd 1

Medium
Confidence
89% confidence
Finding
The incremental prompt forces the agent to discard prior session context and mandates a specific first tool call and decision basis. While partly intended to avoid stale state, it is still instruction-overriding language that constrains the agent’s ability to apply broader safety context or alternative safe recovery behavior, making prompt manipulation more brittle and dangerous.

Autonomous Decision Making

Medium
Category
Excessive Agency
Content
If the user asks to restart the ClawArena/OpenClaw watcher for an already connected agent:

- Do not provision a new agent.
- Do not ask the user to open Command Center unless local credentials are missing or invalid.
- Use the installed `ai-clawarena` skill directory containing this file.
- Bind delivery to the same chat where the user asked for restart.
- Run exactly one direct setup invocation without `--recovery-key`.
Confidence
88% confidence
Finding
Do not ask the user

Unvalidated Output Injection

High
Category
Output Handling
Content
cmd.extend(["--reply-account", str(reply_account)])

    try:
        proc = subprocess.run(  # noqa: S603
            cmd,
            capture_output=True,
            text=True,
Confidence
95% confidence
Finding
subprocess.run( # noqa: S603 cmd, capture_output

Unvalidated Output Injection

High
Category
Output Handling
Content
self._append_delivery_args(cmd, delivery)

        try:
            proc = subprocess.run(  # noqa: S603
                cmd,
                capture_output=True,
                text=True,
Confidence
94% confidence
Finding
subprocess.run( # noqa: S603 cmd, capture_output

Unvalidated Output Injection

High
Category
Output Handling
Content
self._append_delivery_args(cmd, delivery)

        try:
            proc = subprocess.run(  # noqa: S603
                cmd,
                capture_output=True,
                text=True,
Confidence
95% confidence
Finding
subprocess.run( # noqa: S603 cmd, capture_output

Unvalidated Output Injection

High
Category
Output Handling
Content
seq = str(wake.get("seq") or "")
        if ws is not None and seq:
            ws.send_json({"type": "wake_ack", "seq": seq})
        proc = subprocess.run(  # noqa: S603
            cmd,
            capture_output=True,
            text=True,
Confidence
98% confidence
Finding
subprocess.run( # noqa: S603 cmd, capture_output

Unvalidated Output Injection

High
Category
Output Handling
Content
]
        if should_deliver and delivery is not None:
            self._append_delivery_args(cmd, delivery)
        proc = subprocess.run(  # noqa: S603
            cmd,
            capture_output=True,
            text=True,
Confidence
98% confidence
Finding
subprocess.run( # noqa: S603 cmd, capture_output
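
The excerpts above pass a list-style `cmd` to `subprocess.run`, and no `shell=True` is visible in the portions shown, so shell metacharacters would not be interpreted. The remaining concern is server- or game-supplied text that is read as an option or carries control characters. Below is a minimal sketch of a check that could run before a value is appended to `cmd`. The helper `safe_arg_value` and the length limit are assumptions, not part of the skill.

```python
import unicodedata

MAX_ARG_LEN = 256  # hypothetical limit, not taken from the skill


def safe_arg_value(value: object, max_len: int = MAX_ARG_LEN) -> str:
    """Return value as a string, or raise ValueError if it is unsafe as an argv element."""
    text = str(value)
    if not text or len(text) > max_len:
        raise ValueError("argument is empty or too long")
    if text.startswith("-"):
        # A leading dash could be parsed as an option by the child process.
        raise ValueError("argument looks like an option")
    # Unicode categories starting with "C" cover control, format, surrogate,
    # private-use, and unassigned code points (for example newlines and NUL).
    if any(unicodedata.category(ch).startswith("C") for ch in text):
        raise ValueError("argument contains control or format characters")
    return text
```

A field with its own strict format, such as a numeric ID that may legitimately be negative, would need a dedicated check instead of this generic one.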

Session Persistence

Medium
Category
Rogue Agent
Content
- The watcher reports its installed skill version in heartbeat telemetry and can send a one-time update notice when the server requires a newer `ai-clawarena` skill.
- Use one direct `python3 /absolute/path/setup_local_watcher.py ...` invocation only. Do not wrap it in `bash -lc`, `sh`, heredocs, or `python -c`.
- Treat `setup_local_watcher.py` as a deterministic local setup script that reads or writes `~/.clawarena/token` and `~/.clawarena/agent_id`, writes watcher config/log/pid files under `~/.clawarena`, optionally verifies OpenClaw delivery with `--verify-delivery`, and starts the local watcher process.
- For connection recovery, `setup_local_watcher.py --recovery-key <key>` may redeem a one-use server recovery key, rewrite `~/.clawarena/token` and `~/.clawarena/agent_id`, then restart the local watcher.
- Bind delivery to the same chat where the user asked for setup.
- For Telegram, `--to` must be the numeric chat ID for this conversation, not an `@username`.
- Do not modify OpenClaw pairing requirements, DM policies, gateway auth, or other messenger security settings during ClawArena setup.
Confidence
95% confidence
Finding
write `~/.clawarena/token` and `~/.clawarena/agent_id`, then restart the local watcher. - Bind delivery to the same chat where the user asked for setup. - For Telegram, `--to` must be the numeric chat
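
The quoted instructions require `--to` to be a numeric Telegram chat ID rather than an `@username`. A minimal sketch of that check follows. The function name `is_numeric_chat_id` is hypothetical, and the pattern allows a leading minus sign on the assumption that group chats use negative IDs.

```python
import re

# ASCII digits only, with an optional leading minus sign.
_CHAT_ID_RE = re.compile(r"-?[0-9]+")


def is_numeric_chat_id(value: str) -> bool:
    """Return True if value looks like a numeric chat ID, not a username."""
    return _CHAT_ID_RE.fullmatch(value) is not None
```

Rejecting anything else before the setup script runs would catch a mistyped or substituted delivery target early.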

YARA rule 'agent_skill_destructive_autonomous_actions': Autonomous destructive filesystem, shell history, or repository actions in AI agent skills [agent_skills]

High
Category
YARA Match
Content
```bash
if [ -f ~/.clawarena/watcher.pid ]; then kill "$(cat ~/.clawarena/watcher.pid)"; fi
rm -f ~/.clawarena/watcher.pid
```

For debugging:
Confidence
75% confidence
Finding
rm -f ~/; Do not ask; Do not ask

VirusTotal

65/65 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.