Self Improving Intent Security Agent

Security checks across malware telemetry and agentic risk

Overview

This skill is a local documentation and template toolkit, with the main risk being overstated safety language rather than hidden malicious behavior.

Install only if you want local intent/audit templates and helper scripts. Do not rely on this package alone for real enforcement, rollback, anomaly detection, or self-improvement; those require separate host-runtime controls. Review any optional hook configuration before enabling it, and keep .agent logs out of shared repositories if they may contain sensitive data.
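The advice to keep .agent logs out of shared repositories can be checked mechanically. The following is a minimal sketch, assuming the logs live in a top-level `.agent` directory and that the repository uses a plain `.gitignore`; the function name is hypothetical, and the check deliberately ignores full gitignore semantics such as negation rules and glob patterns.

```python
def ignores_agent_logs(gitignore_text: str, log_dir: str = ".agent") -> bool:
    """Return True if a .gitignore body contains a plain rule for log_dir.

    Simplification: only exact directory rules are recognized; negations
    ("!.agent") and wildcard patterns are not evaluated.
    """
    wanted = {log_dir, log_dir + "/", "/" + log_dir, "/" + log_dir + "/"}
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments, as git does.
        if not line or line.startswith("#"):
            continue
        if line in wanted:
            return True
    return False
```

A caller could read the repository's `.gitignore` and warn before the first commit if the function returns False.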

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (16)

Intent-Code Divergence

Medium
Confidence
88% confidence
Finding
The README says the package is not a production runtime enforcement engine, but later describes enforcement, monitoring, rollback, and safety guarantees in a way that can reasonably cause users to overestimate what the skill actually does. This kind of security overclaim is dangerous because operators may rely on nonexistent protections for high-risk workflows, resulting in unsafe deployment or reduced human oversight.

Intent-Code Divergence

Medium
Confidence
95% confidence
Finding
The skill initially frames itself as documentation-only and explicitly says it does not provide automatic runtime enforcement, but later describes automatic validation and rollback in a way that can lead users to overtrust built-in protections. This mismatch is dangerous because operators may deploy the skill assuming active safeguards exist when they may only be templates or optional scripts, creating a false sense of security around high-risk actions.

Intent-Code Divergence

Medium
Confidence
90% confidence
Finding
The documentation claims the skill keeps data local and does not transmit externally, but it also promotes hook-driven command execution, which extends behavior beyond passive local documentation and may invoke scripts with broader side effects. Even if the scripts are intended to be local, the claim is too absolute and can mislead users about the real execution surface and trust boundary.

Description-Behavior Mismatch

Medium
Confidence
94% confidence
Finding
The top-level description markets the skill as a documentation-first toolkit, but the body expands into runtime interception, hook-based validation, rollback orchestration, anomaly monitoring, and self-improving strategy mechanisms. This scope expansion can cause security reviewers and users to underestimate the operational power of the skill and enable it in environments where active control hooks were not expected.

Intent-Code Divergence

Medium
Confidence
92% confidence
Finding
The introduction initially states the package is documentation-first and does not provide runtime enforcement, but later sections describe blocking, rollback, and policy enforcement as if they are active capabilities. This inconsistency can mislead users into trusting nonexistent protections, causing unsafe deployment or overreliance on documentation/templates instead of real controls.

Description-Behavior Mismatch

Medium
Confidence
95% confidence
Finding
These sections describe integrated pillars such as real-time monitoring, automatic adoption of improvements, rollback workflows, and policy enforcement in language that reads like present-tense product behavior. In a security-oriented skill, overstating defensive automation is dangerous because operators may assume high-risk actions are being validated or blocked when the repository only provides scaffolding.

Description-Behavior Mismatch

Medium
Confidence
94% confidence
Finding
The quick example presents ALLOWED/BLOCKED decisions, auto-triggered rollback, and learning outcomes as if they occur automatically in this package. Because the skill is positioned around security and self-improvement, readers may incorrectly infer that dangerous actions will be prevented in practice, increasing the chance of misuse in sensitive environments.

Intent-Code Divergence

High
Confidence
98% confidence
Finding
The 'Safety Guarantees' section asserts hard guarantees like intent alignment, permission boundaries, reversibility, and human oversight, directly contradicting the earlier disclaimer that no production runtime engine is provided. In a security tool, false guarantees are especially risky because they can create a dangerous illusion of enforcement, leading teams to expose autonomous agents to production workloads without real safeguards.

Description-Behavior Mismatch

High
Confidence
94% confidence
Finding
The architecture document describes an autonomous execution system with validation, authorization, runtime monitoring, rollback, and self-improvement, which materially exceeds the skill's declared documentation-first scope. This kind of scope mismatch is dangerous because downstream users, reviewers, or agent runtimes may treat the skill as a higher-trust operational control system than advertised, enabling unreviewed execution-oriented adoption or permissioning.

Description-Behavior Mismatch

High
Confidence
97% confidence
Finding
This section presents concrete live-control behaviors such as authorization, safety guardrails, checkpoint creation, rollback, monitoring, and logging as if the system actively performs them. If consumers rely on these claims, they may delegate risky tasks under the false assumption that strong runtime protections exist, creating a dangerous gap between perceived and actual enforcement.

Description-Behavior Mismatch

Medium
Confidence
91% confidence
Finding
The self-improvement, pattern extraction, strategy optimization, and A/B testing material implies adaptive behavioral evolution beyond a documentation/prototyping toolkit. That is risky because self-modifying or self-optimizing language can encourage deployment of learning loops without robust safeguards, especially when paired with earlier claims of authorization and execution control.

Vague Triggers

Medium
Confidence
82% confidence
Finding
Leaving the activation mechanism unspecified creates ambiguous trigger boundaries for a security-sensitive skill centered on intent validation and autonomous workflows. In practice, ambiguous invocation scope can cause the skill to be applied too broadly, too narrowly, or in unintended contexts, weakening whatever safeguards the host runtime does provide and increasing the chance of unsafe or unauthorized actions.

Missing User Warnings

Medium
Confidence
87% confidence
Finding
The publishing guide instructs users to configure long-lived publishing tokens but does not warn them to keep tokens out of plaintext files, screenshots, terminal history, or CI logs. In a document aimed at operational publishing workflows, omission of basic secret-handling guidance increases the chance of accidental credential exposure and subsequent unauthorized package or skill publication.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The example `echo "YOUR_CLAWHUB_TOKEN" | npx clawhub login --token` encourages typing a secret directly into a command-line pipeline without warning about exposure risks. Depending on shell, environment, and CI usage, the token may be captured in shell history, copied into logs, exposed to other users through process inspection, or retained in build output, enabling credential theft and unauthorized publishing.
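One way to reduce the exposure described above is to prompt for the token and hand it to the child process on stdin, so it is never typed into a shell command. This is a minimal sketch, not the publishing guide's method: the helper names are hypothetical, and it assumes, based only on the quoted example, that `npx clawhub login --token` reads the token from stdin.

```python
import getpass
import subprocess
import sys


def run_with_secret_on_stdin(cmd: list, secret: str) -> subprocess.CompletedProcess:
    """Run cmd and feed the secret on stdin so it never appears in argv."""
    return subprocess.run(
        cmd,
        input=secret + "\n",   # delivered through a pipe, not the command line
        text=True,
        capture_output=True,   # keep child output out of the terminal scrollback
        check=False,
    )


def login_with_prompted_token() -> int:
    """Prompt without echo, then pass the token to the CLI from the finding.

    Assumption: the command below accepts the token on stdin, as the
    quoted `echo ... | npx clawhub login --token` example implies.
    """
    token = getpass.getpass("Publishing token: ")
    result = run_with_secret_on_stdin(["npx", "clawhub", "login", "--token"], token)
    return result.returncode
```

Because the token is entered at a no-echo prompt, it stays out of shell history, and because it travels over a pipe, it does not show up in the process argument list. It can still leak if the child process prints it, so captured output should not be logged verbatim.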

Vague Triggers

Medium
Confidence
88% confidence
Finding
The auto-application triggers are broad and ambiguous, covering common categories like multi-step tasks, learning opportunities, and high-risk operations without tight scoping. In practice this can cause the skill or surrounding automation to activate in many ordinary workflows, increasing logging, blocking, or intervention in contexts where it was not intentionally enabled.

Vague Triggers

High
Confidence
98% confidence
Finding
The hook configuration uses an empty matcher for UserPromptSubmit, which effectively causes the intent-capture script to run on every prompt. A universal trigger materially increases attack surface, makes the behavior hard to predict, and can interfere with unrelated tasks or be abused to force persistent interception across all interactions.
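A reviewer can look for this pattern before enabling any hook file. The sketch below assumes a JSON layout of `hooks` → event name → list of entries, each with a `matcher` string and a nested `hooks` list of commands; that layout, the function name, and the script path are illustrative assumptions, not taken from the scanned package.

```python
import json


def find_universal_hooks(config: dict) -> list:
    """List (event, command) pairs whose matcher is empty, missing, or "*"."""
    findings = []
    for event, entries in config.get("hooks", {}).items():
        for entry in entries:
            # Assumption: a missing or empty matcher means "match everything".
            if entry.get("matcher", "") in ("", "*"):
                for hook in entry.get("hooks", []):
                    findings.append((event, hook.get("command", "")))
    return findings


# Hypothetical configuration; the script path is illustrative only.
EXAMPLE_CONFIG = json.loads("""
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "matcher": "",
        "hooks": [{"type": "command", "command": "./scripts/capture-intent.sh"}]
      }
    ]
  }
}
""")

for event, command in find_universal_hooks(EXAMPLE_CONFIG):
    print(f"{event}: runs on every event -> {command}")
```

Any pair the function reports is a command that would run unconditionally for that event, which is the condition this finding flags.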

VirusTotal

62/62 vendors flagged this skill as clean.
