OC Guard

Security checks across malware telemetry and agentic risk

Overview

This skill mostly behaves as described, but it can apply high-impact OpenClaw configuration changes drawn from AI-generated proposals, and it runs live post-apply checks with little enforced user review.

Review the generated plan before applying, avoid putting secrets in requirements or proposals, expect gateway restarts and live agent canary messages during apply, and use a private trusted backup directory. The package metadata says this slug is deprecated in favor of oc-guard-skill, so prefer the maintained release if available.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (11)

Tainted flow: 'backup_file' from os.environ.get (line 773, credential/environment) → shutil.copy2 (file write)

Medium

Category: Data Flow
Content:
    ts = dt.datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_file = BACKUP_DIR / f"openclaw.json.{ts}.bak"
    dump_json(LAST_PROPOSAL_PATH, proposal)
    shutil.copy2(config_file, backup_file)
    dump_json(config_file, modified)
    log(f"Backup created: {backup_file}")
Confidence: 90%
Finding: shutil.copy2(config_file, backup_file)
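
The advice to use a private, trusted backup directory can be enforced with a check before anything is copied into it. The sketch below is illustrative only: `ensure_private_dir` and the owner-only (0700) policy are assumptions made here, not part of the scanned skill, and the ownership check is POSIX-specific.

```python
import os
import stat
from pathlib import Path


def ensure_private_dir(raw: str) -> Path:
    """Resolve a backup directory and refuse it unless it is owner-only.

    `raw` stands in for whatever value the environment supplies
    (hypothetical helper; the 0o700 policy is an assumption).
    """
    path = Path(raw).expanduser().resolve()
    # Create the directory if missing; an existing directory keeps its mode.
    path.mkdir(mode=0o700, parents=True, exist_ok=True)

    info = path.stat()
    mode = stat.S_IMODE(info.st_mode)
    if mode & 0o077:
        raise PermissionError(
            f"{path} is accessible to group/other (mode {oct(mode)})"
        )
    if info.st_uid != os.getuid():
        raise PermissionError(f"{path} is not owned by the current user")
    return path
```

A caller would pass the environment-supplied value through this check once at startup and build `backup_file` only from the returned path.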

Tainted flow: 'payload' from os.environ.get (line 591, credential/environment) → pathlib.Path.write_text (file write)

Medium

Category: Data Flow
Content:
        "stdout": stdout,
        "stderr": stderr,
    }
    OPENCODE_DEBUG_PATH.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2) + "\n",
        encoding="utf-8",
    )
Confidence: 86%
Finding: OPENCODE_DEBUG_PATH.write_text(json.dumps(payload, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")
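
Because the debug payload captures tool stdout and stderr, it may end up holding secrets. A minimal sketch of one mitigation follows, assuming a simple `key=value` / `key: value` redaction pattern and an owner-only file mode; `SECRET_RE`, `redact`, and `write_debug` are hypothetical names, and the pattern is deliberately narrow rather than a complete secret scanner.

```python
import json
import os
import re
from pathlib import Path

# Hypothetical pattern: "key=value" or "key: value" where the key looks secret.
SECRET_RE = re.compile(
    r"\b([\w-]*(?:token|secret|password|api[_-]?key)[\w-]*)(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


def redact(text: str) -> str:
    """Replace the value part of secret-looking pairs with a marker."""
    return SECRET_RE.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)


def write_debug(path: Path, payload: dict) -> None:
    """Write a redacted debug payload to a file created with mode 0600."""
    cleaned = {k: redact(v) if isinstance(v, str) else v for k, v in payload.items()}
    data = json.dumps(cleaned, ensure_ascii=False, indent=2) + "\n"
    # The mode argument only applies when the file is created; a file that
    # already exists keeps its current permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(data)
```

Only top-level string values are redacted here; nested structures would need a recursive walk.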

Tainted flow: 'backup_file' from os.environ.get (line 773, credential/environment) → shutil.copy2 (file write)

Medium

Category: Data Flow
Content:
    restarted = run([str(OPENCLAW_BIN), "gateway", "restart"], timeout=60, check=False)
    if restarted.returncode != 0:
        log("Gateway restart failed, rolling back")
        shutil.copy2(backup_file, config_file)
        run([str(OPENCLAW_BIN), "gateway", "restart"], timeout=60, check=False)
        fail("restart failed after apply; rolled back")
Confidence: 89%
Finding: shutil.copy2(backup_file, config_file)

Tainted flow: 'backup_file' from os.environ.get (line 773, credential/environment) → shutil.copy2 (file write)

Medium

Category: Data Flow
Content:
    ok, status_text = check_gateway_running()
    if not ok:
        log("Gateway unhealthy after apply, rolling back")
        shutil.copy2(backup_file, config_file)
        run([str(OPENCLAW_BIN), "gateway", "restart"], timeout=60, check=False)
        fail("gateway unhealthy after apply; rolled back")
Confidence: 89%
Finding: shutil.copy2(backup_file, config_file)

Tainted flow: 'payload' from os.environ.get (line 609, credential/environment) → pathlib.Path.write_text (file write)

Medium

Category: Data Flow
Content:
        "extracted_json": text,
        "parse_error": str(e),
    }
    OPENCODE_DEBUG_PATH.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2) + "\n",
        encoding="utf-8",
    )
Confidence: 86%
Finding: OPENCODE_DEBUG_PATH.write_text(json.dumps(payload, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")

Tainted flow: 'backup_file' from os.environ.get (line 773, credential/environment) → shutil.copy2 (file write)

Medium

Category: Data Flow
Content:
        canary_notes = run_post_apply_canary(modified)
    except Exception as e:
        log(f"Post-apply canary failed, rolling back: {e}")
        shutil.copy2(backup_file, config_file)
        run([str(OPENCLAW_BIN), "gateway", "restart"], timeout=60, check=False)
        fail(f"post-apply canary failed; rolled back: {e}")
Confidence: 89%
Finding: shutil.copy2(backup_file, config_file)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 94%
Finding: The skill explicitly encourages invoking local scripts and CLIs that can read environment state, access files, and execute shell commands, yet it declares no corresponding permissions. This creates a capability/permission mismatch that can bypass user expectations and platform safety controls, especially because the workflow is designed to modify OpenClaw configuration and could touch sensitive files such as ~/.openclaw/openclaw.json.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The skill delegates configuration proposal generation to an external AI tool using raw natural-language input, then applies the returned JSON after only structural/path validation. This creates an indirect prompt-injection and unsafe-automation risk: malicious requirements can manipulate the model into proposing dangerous but policy-allowed changes to channels, bindings, models, or tools, leading to misrouting, privilege expansion, or credential reconfiguration.
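
Structural validation alone cannot tell a harmless edit from one that rewires channels, bindings, models, or tools. One possible complement is a semantic diff that lists every changed path and flags those under sensitive sections for explicit human approval. The sketch below assumes these sections are top-level keys of the config, which the report does not confirm; `SENSITIVE`, `changed_paths`, and `needs_review` are hypothetical names.

```python
from typing import Any

# Sections the finding names as sensitive. Treating them as top-level
# config keys is an assumption made for this sketch.
SENSITIVE = {"channels", "bindings", "models", "tools"}


def changed_paths(old: Any, new: Any, prefix: str = "") -> list[str]:
    """Return dotted paths whose values differ between two JSON-like trees."""
    if isinstance(old, dict) and isinstance(new, dict):
        paths: list[str] = []
        for key in sorted(set(old) | set(new)):
            sub = f"{prefix}.{key}" if prefix else key
            if key not in old or key not in new:
                paths.append(sub)  # key added or removed
            else:
                paths.extend(changed_paths(old[key], new[key], sub))
        return paths
    return [] if old == new else [prefix]


def needs_review(old: dict, new: dict) -> list[str]:
    """Changed paths that fall under a sensitive section."""
    return [p for p in changed_paths(old, new) if p.split(".")[0] in SENSITIVE]
```

An apply step could refuse to proceed while `needs_review` is non-empty unless the user has confirmed each listed path.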

Context-Inappropriate Capability

Medium

Confidence: 83%
Finding: The post-apply canary sends live messages to configured agents, which may trigger external model usage, tool execution, network access, billing, or side effects depending on the agent's configuration. Even though the probe text is simple, executing live agents as part of config application expands the operational blast radius beyond mere validation.
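
One way to narrow that blast radius is to make live probes opt-in, so that applying a config never messages agents unless the user asked for it. This is a sketch under stated assumptions: the `--live-canary` flag and both function names are hypothetical, and `run_canary` stands in for whatever function performs the real probe.

```python
import argparse
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    """Parser with a hypothetical opt-in flag for live agent probes."""
    parser = argparse.ArgumentParser(prog="apply-sketch")
    parser.add_argument(
        "--live-canary",
        action="store_true",
        help="send live probe messages to configured agents after apply",
    )
    return parser


def post_apply_checks(
    args: argparse.Namespace, run_canary: Callable[[], list[str]]
) -> list[str]:
    """Run the canary only when explicitly requested; otherwise report a skip."""
    if not args.live_canary:
        return ["canary skipped (pass --live-canary to send live probes)"]
    return run_canary()
```

With the default off, validation stays local and any billing or side effects from live agents require a deliberate choice.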

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The skill is framed as config planning/apply, but apply also restarts the gateway and drives live agent execution. This hidden operational behavior raises security risk because a user may grant permission for config editing without realizing the script will also cause service interruption, external connections, and runtime actions.

Natural-Language Policy Violations

Medium

Confidence: 85%
Finding: The instruction to always return a specific receipt format including the Chinese string `【模型说明-未执行】` (roughly, "model note: not executed") forces a language/locale convention without user opt-in. While not directly enabling code execution, it can mislead users, reduce clarity, and create unsafe UX in security-sensitive workflows by prioritizing rigid output formatting over user comprehension and accurate communication.

VirusTotal

65/65 vendors flagged this skill as clean.
