Policy Engine

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed policy-control plugin with important operational bypasses, but I found no hidden exfiltration, destructive behavior, or deceptive behavior in the artifacts.

Install only if you want this plugin to sit in the tool-execution path. Treat OPENCLAW_POLICY_BYPASS and gateway/config access as administrator-only controls, test policies in dry-run before enforcing, and verify the GitHub source before using the README clone/npm workflow.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Tool Misuse: Tool Parameter Abuse, Chaining Abuse, Unsafe Defaults
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (10)

Description-Behavior Mismatch

Medium

Confidence: 82%
Finding: Per-agent model routing extends the skill from tool-governance into execution-orchestration, which is a different trust boundary than advertised. If a governance plugin can also choose weaker or different models, an attacker or misconfiguration could indirectly reduce safety guarantees, alter data handling, or route sensitive tasks to unintended providers.

Context-Inappropriate Capability

Medium

Confidence: 85%
Finding: A deterministic tool-governance layer should minimize unrelated authority, but per-agent model routing gives it additional control over agent behavior outside tool permissioning. That violates least privilege and can be exploited to weaken safety posture or exfiltration boundaries by selecting models with different safeguards, retention policies, or provider trust assumptions.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The engine explicitly allows essential and T0 tools before allowlist enforcement, even though the skill description advertises tool allowlists as a governance control. If a sensitive tool is classified as essential or T0, an agent can invoke it regardless of the configured allowlist, weakening policy isolation and potentially permitting actions operators expected to be blocked.
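
The ordering problem can be sketched as follows. This is a minimal illustration, not the plugin's code: the `policy` object, the tool names, and `evaluateShortcutFirst` are all hypothetical. It only shows why a shortcut placed ahead of the allowlist check makes the allowlist irrelevant for any tool classified as essential.

```javascript
// Hypothetical policy shape; none of these names come from the plugin.
const policy = {
  essentialTools: new Set(["message"]),
  allowlist: new Set(["read"]),
};

// Order described in the finding: essential shortcut first, allowlist second.
function evaluateShortcutFirst(toolName) {
  if (policy.essentialTools.has(toolName)) {
    return { action: "allow", reason: "essential" };
  }
  if (!policy.allowlist.has(toolName)) {
    return { action: "block", reason: "not in allowlist" };
  }
  return { action: "allow", reason: "allowlisted" };
}

// "message" is absent from the allowlist but still allowed.
console.log(evaluateShortcutFirst("message"));
// "exec" is neither essential nor allowlisted, so it is blocked.
console.log(evaluateShortcutFirst("exec"));
```

Under these assumptions, the practical consequence is that the set of essential and T0 tools is itself part of the policy surface and deserves the same review as the allowlist.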

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: Essential and T0 tools return early before the blocked-retry escalation check, so an agent that has exceeded the session threshold can still continue invoking those tools. This undermines the stated escalation control and may let a session keep operating through privileged communication or control-plane channels after repeated policy violations.
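
One possible reordering, sketched under assumptions, checks the escalation state before any shortcut. The threshold value, the `blockedCounts` map, and the function names are hypothetical and are not taken from the plugin. Whether essential tools should remain reachable after escalation is an operator decision; this sketch shows only the stricter variant.

```javascript
// Hypothetical names and threshold, for illustration only.
const BLOCKED_RETRY_THRESHOLD = 3;
const essentialTools = new Set(["message"]);
const blockedCounts = new Map(); // sessionKey -> number of blocked calls

function recordBlocked(sessionKey) {
  blockedCounts.set(sessionKey, (blockedCounts.get(sessionKey) ?? 0) + 1);
}

// Escalation is checked before the essential-tool shortcut, so a session
// past the threshold is stopped regardless of how the tool is classified.
function evaluateEscalationFirst(toolName, sessionKey) {
  if ((blockedCounts.get(sessionKey) ?? 0) >= BLOCKED_RETRY_THRESHOLD) {
    return { action: "block", reason: "escalated" };
  }
  if (essentialTools.has(toolName)) {
    return { action: "allow", reason: "essential" };
  }
  return { action: "block", reason: "not permitted" };
}

const before = evaluateEscalationFirst("message", "s1");
for (let i = 0; i < BLOCKED_RETRY_THRESHOLD; i++) recordBlocked("s1");
const after = evaluateEscalationFirst("message", "s1");
console.log(before.action, after.action);
```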

Intent-Code Divergence

High

Confidence: 98%
Finding: This hook is described as the primary enforcement point, yet it contains two explicit fail-open paths: a global environment-variable bypass that allows all tools, and a catch-all error handler that returns allow on exceptions. In a governance layer for tool execution, either condition can completely disable policy enforcement and permit blocked or high-risk tool calls, making the 'firewall' ineffective when misconfigured, abused, or triggered by malformed input.
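
The two fail-open paths can be illustrated with a small stand-in handler. This is a sketch under assumptions, not the plugin's implementation: `createFailOpenHandler`, the `{ block, blockReason }` return shape, and the rule that any non-empty value of `OPENCLAW_POLICY_BYPASS` enables the bypass are all hypothetical. Returning `undefined` stands for "let the tool call proceed".

```javascript
// Hypothetical handler mirroring the two fail-open paths the finding describes.
function createFailOpenHandler(engine, env) {
  return function handler(call) {
    // Path 1: a global bypass flag skips evaluation entirely.
    // Assumption: any non-empty value enables it.
    if (env.OPENCLAW_POLICY_BYPASS) return undefined;
    try {
      const result = engine.evaluate(call);
      return result.action === "block"
        ? { block: true, blockReason: result.reason }
        : undefined;
    } catch (err) {
      // Path 2: any exception is converted into an allow.
      return undefined;
    }
  };
}

const strictEngine = { evaluate: () => ({ action: "block", reason: "denied" }) };
const brokenEngine = {
  evaluate: () => {
    throw new Error("malformed policy");
  },
};
const call = { toolName: "exec", params: { command: "rm -rf /" } };

const enforced = createFailOpenHandler(strictEngine, {})(call);
const bypassed = createFailOpenHandler(strictEngine, { OPENCLAW_POLICY_BYPASS: "1" })(call);
const errored = createFailOpenHandler(brokenEngine, {})(call);
console.log(enforced, bypassed, errored);
```

In this sketch only the first call is blocked. The bypassed and errored calls both come back as `undefined`, which the caller cannot distinguish from a legitimate allow.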

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The tests explicitly verify a global `OPENCLAW_POLICY_BYPASS` environment variable that causes the hook to allow all tool calls, including dangerous ones such as `exec` with `rm -rf /`. For a component described as a deterministic governance enforcement layer, an unaudited process-level bypass materially undermines the security boundary and can be abused by anyone able to influence the runtime environment.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: Reading a process environment variable to disable governance decisions creates an external control plane that bypasses all allowlist, deny-pattern, dry-run, and escalation protections. In the context of a policy engine, this is especially dangerous because environment variables are often easy to inject via deployment config, wrappers, CI jobs, or compromised parent processes.
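
To make the injection route concrete, the sketch below shows that a parent process fully controls the environment a child sees, using Node's `child_process.execFileSync` with its `env` option. The helper `bypassSeenByChild` and the value `"1"` are assumptions for illustration; the report does not state which values the plugin treats as enabling the bypass.

```javascript
const { execFileSync } = require("node:child_process");

// Child script: prints the variable exactly as the child process sees it.
const childScript =
  "process.stdout.write(String(process.env.OPENCLAW_POLICY_BYPASS))";

// Hypothetical helper: runs a child Node process with or without the variable set.
function bypassSeenByChild(value) {
  const env = { ...process.env };
  delete env.OPENCLAW_POLICY_BYPASS;
  if (value !== undefined) env.OPENCLAW_POLICY_BYPASS = value;
  return execFileSync(process.execPath, ["-e", childScript], {
    env,
    encoding: "utf8",
  });
}

const clean = bypassSeenByChild(undefined);
const injected = bypassSeenByChild("1");
console.log(clean, injected);
```

The child has no way to tell whether the value was set by an administrator or by a wrapper script, CI job, or compromised parent, which is why the variable behaves like an unauthenticated control plane.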

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The tests confirm the before-tool-call handler fails open when `engine.evaluate` throws, returning `undefined` and permitting execution. For a security enforcement hook, error paths are part of the attack surface; an attacker who can trigger parser, config, or runtime errors may convert policy failures into unrestricted tool access.
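
A fail-closed alternative can be sketched as a wrapper around any handler. This is an illustrative pattern, not the plugin's code: `failClosed` is a hypothetical helper, and the `{ block, blockReason }` return shape is an assumption about what the host would accept as a block decision.

```javascript
// Wraps a handler so that any thrown error yields a block decision
// instead of silently permitting the call.
function failClosed(handler, logger = console) {
  return function guarded(...args) {
    try {
      return handler(...args);
    } catch (err) {
      logger.error(`policy evaluation failed: ${err.message}`);
      return { block: true, blockReason: "policy evaluation error" };
    }
  };
}

// A stand-in for a handler whose evaluation throws.
const throwing = () => {
  throw new Error("bad config");
};
const silentLogger = { error: () => {} };

const decision = failClosed(throwing, silentLogger)({ toolName: "exec" });
console.log(decision.block, decision.blockReason);
```

The trade-off is availability: a malformed policy now halts tool use rather than disabling enforcement, which is usually the safer default for a component positioned as an enforcement point.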

Tool Parameter Abuse

High

Category: Tool Misuse
Content:
    // Even dangerous patterns should pass through
    const result = engine.evaluate({
      toolName: "exec",
      params: { command: "rm -rf /" },
    });
    assert.equal(result.action, "allow");
Confidence: 96%
Finding: The test passes the destructive command `rm -rf /` to `exec` and asserts that the result is `allow`.

Tool Parameter Abuse

High

Category: Tool Misuse
Content:
    try {
      const handler = createBeforeToolCallHandler({ engine, state, logger });
      const result = handler(
        { toolName: "exec", params: { command: "rm -rf /" } },
        { toolName: "exec", sessionKey: "s1" },
      );
Confidence: 99%
Finding: The handler test invokes `exec` with the destructive command `rm -rf /`.

VirusTotal

66/66 vendors flagged this skill as clean.
