Revenium Budget Enforcement

Security checks across malware telemetry and agentic risk

Overview

This skill appears purpose-built for Revenium budget metering, but it needs review because it installs persistent monitoring and overstates how reliably its guardrails block agent activity.

Install only if you are comfortable giving this skill persistent access to OpenClaw session logs, local agent configuration, cron, plugin hooks, and Revenium credentials. Treat it as a metering and best-effort guardrail integration, not a guaranteed hard safety boundary.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (90)

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The README promises mandatory guardrail checks before every operation, but the documented design explicitly allows fail-open behavior when guardrail status is unavailable and only refreshes enforcement on a periodic cron interval. This creates a real enforcement gap where an agent can continue operating after telemetry or polling failures, or overspend between cron ticks, undermining the claimed safety control.
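The gap described in this finding can be narrowed by having the gate fail closed and reject stale status. The following is a minimal sketch, not the skill's actual code: the field names `halted` and `updated_at` and the function `is_operation_allowed` are assumptions made for illustration, since the real guardrail-status.json schema is not shown in this report.

```python
import json
import time
from pathlib import Path

# Hypothetical field names; the real guardrail-status.json schema is not shown here.
HALTED_FIELD = "halted"
UPDATED_AT_FIELD = "updated_at"  # assumed: Unix timestamp of the last successful poll


def is_operation_allowed(status_path, max_age_seconds=60.0, now=None):
    """Return True only when a fresh, readable status file says the agent is not halted."""
    current = time.time() if now is None else now
    try:
        status = json.loads(Path(status_path).read_text(encoding="utf-8"))
        halted = status[HALTED_FIELD]
        updated_at = float(status[UPDATED_AT_FIELD])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing, unreadable, or malformed status: block instead of failing open.
        return False
    if not isinstance(halted, bool):
        # An unexpected value type is treated as "unknown", which also blocks.
        return False
    if current - updated_at > max_age_seconds:
        # Stale status: the poller may have stopped, so the last value is not trusted.
        return False
    return not halted
```

A staleness limit only detects a poller that has stopped; it does not prevent overspending between two successful cron ticks, which would need a check against the server at operation time.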

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The skill represents guardrail enforcement as a local status-file check, but later sections call for direct CLI/API-backed administration, server-side rule inspection, and local state modification. That broadens the attack surface from passive gating to active external control, increasing the chance of unintended changes, data leakage, or abuse under the guise of a simple safety check.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The credential-handling and environment-authentication instructions exceed what is needed for a pure pre-operation guardrail gate and condition the agent to collect or reason about sensitive account identifiers. Even if intended for setup, embedding this into the skill increases exposure to secret-handling mistakes and expands the trust boundary for a skill that appears to be only a local budget checker.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: These instructions direct network-capable rule creation and administration, including interactive setup that changes server-side budget rules and local configuration. That is materially beyond a local gatekeeper role and can be exploited or misused to alter enforcement policy, availability, and telemetry behavior.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The skill metadata promises a mandatory guardrail-status.json check before every operation, but this module implements unrelated marker-gate logic and never performs that safety check. In a security control skill, this semantic mismatch is dangerous because operators may rely on the advertised guardrail enforcement while the code silently fails to enforce the intended budget/blocking policy.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: This code focuses on tracking exec usage and write-marker.sh classification rather than enforcing the stated Revenium budget guardrails. That discrepancy can create a false sense of protection, causing users to believe spending or autonomy limits are being checked when they are not.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The finalization flow only forces a revise step when write-marker.sh was not observed; it does not enforce the manifest's claimed halt/block semantics for guardrail violations. In the context of a supposedly mandatory pre-operation guardrail skill, this can let autonomous activity proceed without the expected safety stop, undermining policy enforcement.

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The module documentation explicitly says it is 'Pure marker-gate logic' for a marker-gate plugin, which conflicts with the skill's declared purpose of mandatory Revenium guardrail enforcement. In security-sensitive tooling, contradictory documentation and manifest claims are dangerous because they obscure the true control boundary and can lead to deployment of ineffective protections.

Description-Behavior Mismatch

High

Confidence: 89%
Finding: The skill metadata promises a mandatory pre-operation read of guardrail-status.json before every operation, but this implementation never performs that check. Instead, it injects prompt directives and relies on best-effort, fail-open hooks, so guardrail enforcement can be bypassed whenever the model ignores instructions or the hook path fails.

Description-Behavior Mismatch

Medium

Confidence: 74%
Finding: This code injects metering and task-marker orchestration beyond the manifest's stated purpose of mandatory guardrail checking. That mismatch expands the plugin's behavioral scope, which can mislead operators about what the skill does and create unreviewed control or telemetry side effects in security-sensitive agent flows.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The embedded directive imposes mandatory execution of local shell scripts as a completion gate, which exceeds the skill’s stated purpose of checking guardrail status and enforcing budget rules. Requiring tool-driven side effects before replying can coerce an agent into unnecessary command execution and expands the skill’s effective authority beyond what the manifest discloses.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The skill explicitly instructs agents to run local bash commands from a user-home path, introducing command-execution capability unrelated to the advertised read-first guardrail check. Even though the command text is static here, embedding shell execution instructions in skill content normalizes privileged local actions and can be abused or misapplied in environments where agents honor such directives automatically.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The manifest claims only to inject a directive on every turn and gate task marker writing, but the skill-level description promises mandatory pre-operation guardrail checks and usage metering. This mismatch can cause operators to rely on protections that are not actually declared or enforceable by the plugin, creating a false sense of security around budget enforcement and guardrail blocking.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The implementation materially diverges from the skill’s stated security function: instead of enforcing a mandatory pre-operation guardrail-status.json check and budget-rule blocking, it only tracks whether exec-like activity was later classified with a marker. In a guardrail skill, this mismatch is dangerous because operators may rely on the manifest claim and assume budget/guardrail enforcement exists when it does not, creating a silent fail-open bypass of expected protections.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The skill metadata promises a mandatory pre-operation read of guardrail-status.json, but this entry point performs no such check and instead only injects static prompt directives plus bookkeeping hooks. In a security-enforcement plugin, this mismatch creates a fail-open policy bypass: if downstream components do not independently read and enforce the status file, guarded operations may proceed without the advertised runtime gate.

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The code and metadata present the plugin as mandatory enforcement, but the handlers are intentionally fail-open and suppress exceptions, returning undefined instead of blocking the turn. In a guardrail component, that means any parsing bug, unexpected event shape, or gate logic failure can silently disable enforcement and allow the agent to continue without controls.
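The difference between the fail-open pattern this finding describes and a fail-closed alternative can be sketched as two wrappers around the same handler. This is an illustration under stated assumptions, not the plugin's code: `check_budget`, the `ALLOW`/`BLOCK` values, and the event fields `spent` and `limit` are all hypothetical.

```python
import functools

ALLOW = "allow"
BLOCK = "block"


def fail_open(handler):
    """The pattern the finding describes: exceptions are swallowed and nothing is blocked."""
    @functools.wraps(handler)
    def wrapper(event):
        try:
            return handler(event)
        except Exception:
            return None  # the caller treats None as "no objection"
    return wrapper


def fail_closed(handler):
    """Alternative: any exception or unrecognised result becomes an explicit block."""
    @functools.wraps(handler)
    def wrapper(event):
        try:
            decision = handler(event)
        except Exception:
            return BLOCK
        return decision if decision in (ALLOW, BLOCK) else BLOCK
    return wrapper


def check_budget(event):
    # Hypothetical gate; raises KeyError when the event has an unexpected shape.
    return BLOCK if event["spent"] >= event["limit"] else ALLOW
```

With a malformed event such as `{}`, `fail_open(check_budget)` returns `None` and the turn proceeds, while `fail_closed(check_budget)` returns `BLOCK`. The trade-off is availability: a fail-closed gate stops the agent whenever the gate itself is broken.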

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill metadata promises a mandatory pre-operation read of guardrail-status.json before every operation, but this file implements only end-of-turn metering directives. That mismatch is security-relevant because users and downstream systems may rely on the advertised guardrail behavior while the actual embedded directive instead enforces unrelated completion-time actions.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The directive makes task completion contingent on executing a local shell script, expanding the skill from passive guardrail/budget checking into command execution and file-writing behavior. Mandatory shell invocation increases risk because it normalizes privileged side effects unrelated to the stated purpose and could be abused for covert persistence, tracking, or broader local action if the script path or contents are compromised.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: Requiring a second bash-based write to declare job lifecycle state introduces persistent local recording that goes beyond guardrail enforcement and creates an undisclosed telemetry channel. In the context of an agent skill, mandatory end-of-job writes can leak task metadata, encourage unauthorized local state mutation, and create pressure for the agent to prioritize protocol compliance over user intent or platform safety boundaries.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The module reads arbitrary directive content from a file under the skill installation path and wraps it for prompt injection without any integrity check, allowlist, or binding to the manifest's stated behavior. That creates a hidden control channel: anyone who can modify that file can change agent instructions at runtime, potentially weakening safeguards, exfiltrating data, or altering behavior beyond the declared guardrail-status check.
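One way to add the missing integrity check is to pin a SHA-256 digest of the directive file and refuse to inject anything that does not match. The sketch below assumes the digest is distributed through a channel an attacker cannot edit alongside the file (for example, the signed manifest); `load_directive` and its parameters are hypothetical names, not part of the skill.

```python
import hashlib
import hmac
from pathlib import Path


def load_directive(path, expected_sha256):
    """Return the directive text only if the file matches the pinned digest, else None."""
    try:
        data = Path(path).read_bytes()
    except OSError:
        # Missing or unreadable file: inject nothing.
        return None
    actual = hashlib.sha256(data).hexdigest()
    # expected_sha256 is assumed to be a hex string; compare in constant time.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        # Content differs from what was reviewed: inject nothing.
        return None
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None
```

A pinned digest only helps if it is stored somewhere with stronger protection than the directive file itself; a hash kept in the same writable directory gives an attacker both halves to modify.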

Description-Behavior Mismatch

High

Confidence: 88%
Finding: The implementation materially differs from the skill's declared purpose: instead of enforcing a mandatory guardrail-status.json pre-check before every operation, it performs marker-gating and heuristic metering based on tool usage. This kind of capability mismatch is dangerous because operators may trust the manifest description while the code silently enforces a different policy, creating blind spots and enabling undeclared control over agent execution flow.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The code inspects full assistant transcripts, including text content and tool-call arguments, to classify runs and decide whether to revise agent output. That exceeds the stated need of a simple guardrail-status pre-check and introduces unnecessary access to potentially sensitive conversation data, increasing privacy exposure and the chance of covert policy enforcement based on content analysis.

Intent-Code Divergence

Medium

Confidence: 84%
Finding: The file header claims the module only tracks exec runs and finalize-time revise logic, but the code also performs prompt injection and transcript/content analysis. Misleading security-relevant documentation is dangerous because it defeats review and operator understanding, allowing undeclared capabilities to evade scrutiny and persist in production.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The skill metadata promises a mandatory pre-operation read of guardrail-status.json and enforcement of budget/guardrail decisions, but this entrypoint performs no such check and instead only injects prompt context plus best-effort observation hooks. Because the handlers are explicitly fail-open and never block execution on error, an operator could rely on protections that do not actually exist, allowing actions to proceed without the advertised guardrail validation.

Intent-Code Divergence

High

Confidence: 98%
Finding: The comments and description describe mandatory gating and compliance behavior, but the code repeatedly catches all errors and returns undefined or no-op behavior, which is explicitly fail-open. This creates a dangerous trust gap: users may assume the plugin halts unsafe or non-compliant operations when in reality any exception, missing state, or hook failure silently allows execution to continue.

VirusTotal

VirusTotal engine telemetry is currently stale for this artifact.