Self Improvement Loop

Security checks across malware telemetry and agentic risk

Overview

The skill matches its self-improvement purpose, but it makes broad persistent changes to routing, cron jobs, agent instructions, and related skills that users should review before installing.

Install only after reviewing install.sh, scripts/agents-append.md, scripts/cron-payloads.json, and setup_crons.py. Back up ~/.openclaw/openclaw.json and agent workspaces first, verify OPENCLAW_GATEWAY_URL points only to a trusted local gateway before exposing auth tokens, and avoid using A/B/C/D bulk actions unless you are comfortable with skill creation/modification and persistent instruction changes across agents.
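The backup step recommended above could be scripted along these lines. This is a minimal sketch, not part of the skill: the helper name backup_file and the ".bak-<timestamp>" suffix are assumptions chosen for illustration.

```python
import shutil
import time
from pathlib import Path


def backup_file(path: Path) -> Path:
    """Copy `path` to a timestamped sibling file and return the backup path.

    Hypothetical helper: the name and the backup naming scheme are
    illustrative and do not come from the skill itself.
    """
    path = Path(path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"nothing to back up at {path}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = path.with_name(f"{path.name}.bak-{stamp}")
    # copy2 copies the contents and attempts to preserve file metadata.
    shutil.copy2(path, backup)
    return backup


if __name__ == "__main__":
    # Back up the global config named in the report before installing.
    print(backup_file(Path("~/.openclaw/openclaw.json")))
```

Agent workspaces are directories, so they would need a recursive copy (for example shutil.copytree) rather than this single-file helper.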

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain

Findings (18)

Tainted flow: 'req' from os.environ.get (line 131, credential/environment) → urllib.request.urlopen (network output)

Critical

Category: Data Flow
Content:
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            result = json.loads(resp.read())
            print(f" ✓ {name} created (API)")
            return True
Confidence: 94%
Finding: with urllib.request.urlopen(req, timeout=10) as resp:
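One way to reduce the risk of this flow is to refuse to attach credentials unless the gateway URL resolves to a loopback host. The sketch below is an assumption about how such a guard might look, not the skill's actual code; the function name is_loopback_gateway is hypothetical, and only OPENCLAW_GATEWAY_URL comes from the report.

```python
import ipaddress
import os
from urllib.parse import urlsplit


def is_loopback_gateway(url: str) -> bool:
    """Return True only if `url` is http(s) and its host is a loopback address.

    Hypothetical guard: intended to run before any auth token is attached
    to a request built from an environment-supplied URL.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname  # lowercased, without port, brackets, or userinfo
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, e.g. a DNS name such as 127.0.0.1.example.com
        return False


if __name__ == "__main__":
    gateway = os.environ.get("OPENCLAW_GATEWAY_URL", "")
    if not is_loopback_gateway(gateway):
        raise SystemExit(f"refusing to send credentials to {gateway!r}")
```

The check is deliberately strict: any DNS name other than localhost is rejected, because a name can be re-pointed at a remote host after review.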

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The file claims operations are strictly limited to per-agent learnings directories, but elsewhere documents writes to AGENTS.md, memory.md, and ~/.openclaw/openclaw.json. Misrepresenting the true write boundary is a security issue because reviewers and users may grant trust based on a false isolation claim, while the skill can alter higher-value global and agent-control files outside the stated sandbox.

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The contradiction between the stated write scope and the writes the skill actually documents weakens the operator's ability to assess blast radius accurately. A skill that presents itself as strictly scoped while performing broader writes can bypass informed consent and make it easier for risky persistence or instruction injection to be accepted in environments with multiple agents.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The file imports execSync and defines runScript, which constructs a shell command using interpolated arguments and inherited environment. Although the current handler path shown does not invoke the risky helper in normal processing, keeping dormant shell-execution capability in a message-processing hook increases attack surface and can become exploitable if later wired in with untrusted inputs or if argument values contain shell metacharacters.
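The metacharacter risk described above comes from building one command string for a shell to parse. The flagged helper is Node code (execSync); the sketch below illustrates the same idea in Python as an analogue, passing an argument vector so no shell interprets the values. The name run_script mirrors the report, but this body is an assumption, not the skill's implementation.

```python
import subprocess
import sys
from typing import Dict, Optional, Sequence


def run_script(argv: Sequence[str], env: Optional[Dict[str, str]] = None) -> str:
    """Run a child process from an argument vector and return its stdout.

    Illustrative analogue of the flagged helper. With shell=False each
    element of `argv` reaches the child as one literal argument, so
    characters such as ';' or '$(...)' are never interpreted.
    Passing an explicit `env` avoids forwarding the full inherited
    environment (and any tokens in it); None inherits it unchanged.
    """
    result = subprocess.run(
        list(argv),
        shell=False,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
        env=env,
    )
    return result.stdout


if __name__ == "__main__":
    hostile = "hello; echo INJECTED"
    # The child simply prints its first argument back.
    print(run_script([sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]))
```

In Node the closest equivalent would be execFileSync or spawnSync with an argument array instead of execSync with an interpolated string.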

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The installer automatically installs two additional remote skills from external URLs during setup, expanding the trusted codebase beyond what the user explicitly chose to install. This creates a supply-chain and scope-expansion risk because installation of one skill silently causes execution and persistence of other remotely controlled components.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: This block rewrites the global ~/.openclaw/openclaw.json bindings section based on inferred channel/account mappings, affecting agent-to-channel routing for the entire platform rather than only this skill. Modifying shared global configuration without narrow scoping or backup can break other agents, redirect notifications, or alter trust boundaries across sessions.

Missing User Warnings

Medium

Confidence: 87%
Finding: The installation flow automatically creates crons, updates bindings, and injects content into agent files, but the description does not foreground that it modifies multiple agent files and global configuration. That omission is dangerous because users may treat installation as routine setup instead of a broad system mutation with persistence and routing consequences.

Missing User Warnings

Medium

Confidence: 84%
Finding: The skill sends per-agent notifications through channel bots, which can expose internal errors, corrections, feature requests, and workflow metadata to external messaging infrastructure or unintended accounts. Without a privacy warning, users may not realize operational or sensitive workspace information is being propagated outside the local agent environment.

Missing User Warnings

Medium

Confidence: 91%
Finding: The installer overwrites existing learnings files when it decides their format is 'outdated', replacing user-maintained content without an explicit warning, backup, or migration path. Silent destructive writes to workspace data can cause data loss and can also be abused to replace trusted operational context with attacker-controlled content.

Natural-Language Policy Violations

High

Confidence: 96%
Finding: The installer injects new behavioral instructions into AGENTS.md and memory.md for every agent workspace without opt-in, and the injected text is Chinese-only, reducing transparency for users who cannot read it. Because these files influence agent behavior, silent instruction injection can persistently steer future agent actions in ways the user may not notice or understand.

Missing User Warnings

Medium

Confidence: 86%
Finding: The guide includes a destructive rm -rf command as part of agent deletion without an explicit warning, confirmation step, or safer alternative. In an operational skill that manages per-agent workspaces, users may copy-paste this command and permanently delete data if the agent_id or path is wrong.
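A safer alternative to a bare rm -rf might validate the agent id and confirm the target sits directly under the expected workspace root before deleting anything. The sketch below is hypothetical: the id pattern, the function names, and the dry-run default are all assumptions, since the report does not show the skill's directory layout.

```python
import re
import shutil
from pathlib import Path

# Assumed id format for illustration; the real agent_id rules are not in the report.
AGENT_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def resolve_agent_dir(root: Path, agent_id: str) -> Path:
    """Return the agent directory, refusing ids that could escape `root`."""
    if not AGENT_ID_PATTERN.fullmatch(agent_id):
        raise ValueError(f"suspicious agent id: {agent_id!r}")
    root = Path(root).resolve()
    target = (root / agent_id).resolve()
    # After resolving symlinks the target must be a direct child of root.
    if target.parent != root:
        raise ValueError(f"{target} is not directly inside {root}")
    if not target.is_dir():
        raise FileNotFoundError(f"no agent directory at {target}")
    return target


def delete_agent_dir(root: Path, agent_id: str, confirm: bool = False) -> Path:
    """Delete the agent directory only when `confirm` is True (dry run otherwise)."""
    target = resolve_agent_dir(root, agent_id)
    if confirm:
        shutil.rmtree(target)
    return target
```

Calling delete_agent_dir without confirm=True only reports which path would be removed, which gives the operator a chance to catch a wrong agent_id before data is lost.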

Missing User Warnings

Medium

Confidence: 91%
Finding: The instructions explicitly require deleting pending JSON files after processing, but they do not require any user-visible confirmation that a reply will cause data mutation and removal of queued items. In this skill’s context, those JSON files represent workflow state for future actions, so silent deletion can cause loss of auditability, accidental processing, and irreversible state changes if the user did not intend bulk resolution.

Missing User Warnings

High

Confidence: 97%
Finding: A single A/B/C/D reply is defined to process all JSON files in the pending directory, but the user is not warned that one short response applies globally rather than to a single item. In a per-agent self-improvement loop, this creates a high risk of unintended mass creation, modification, promotion, dormancy, or resolution across many queued patterns, amplifying the impact of accidental or ambiguous input.
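One possible mitigation is to require an explicit choice per pending file, so a short reply can never apply to the whole queue. The sketch below only builds a plan and touches nothing on disk. The function name plan_actions and the file-name-to-letter mapping are assumptions, and the meaning of each letter is left open because the report does not define it.

```python
from pathlib import Path
from typing import Dict, List, Tuple

VALID_ACTIONS = {"A", "B", "C", "D"}  # letters from the report; semantics not specified there


def plan_actions(pending_dir: Path, choices: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (file name, action) pairs only for files the user named explicitly.

    Hypothetical helper: files in `pending_dir` that are absent from
    `choices` are left out of the plan, and nothing is modified or deleted.
    """
    names = sorted(p.name for p in Path(pending_dir).glob("*.json"))
    unknown = sorted(set(choices) - set(names))
    if unknown:
        raise ValueError(f"no such pending file(s): {unknown}")
    plan = []
    for name in names:
        if name not in choices:
            continue  # unnamed files stay queued and untouched
        action = choices[name].strip().upper()
        if action not in VALID_ACTIONS:
            raise ValueError(f"invalid action {choices[name]!r} for {name}")
        plan.append((name, action))
    return plan
```

The resulting plan can be shown back to the user for confirmation before any file is processed or removed.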

Missing User Warnings

Medium

Confidence: 87%
Finding: The script’s default mode performs writes and cleanup automatically without an interactive confirmation gate, including appending archive data and removing entries from source Markdown files. In an agent skill context, silent state-changing behavior is riskier because it can be triggered as part of automation, making accidental or unintended data mutation more likely and harder for operators to notice.

Vague Triggers

Medium

Confidence: 93%
Finding: This cron job is enabled by default and runs every 3 hours against the current session with no explicit gating, scope restriction, or exclusion logic. In this skill’s context, the scheduled payload does more than passive checking: it invokes shell scripts, writes notification state, archives records, and emits messages, so an always-on trigger increases the chance of repeated unintended actions, noisy automation, or abuse if the underlying workspace data is poisoned.

Ssd 3

Medium

Confidence: 91%
Finding: The documentation explicitly normalizes automatic cross-session capture and retention of user corrections and feedback. In the context of an agent skill, this creates privacy and prompt-injection persistence risk because sensitive user content and adversarial instructions may be stored long-term and later reintroduced into future sessions or agent decisions.

Ssd 3

Medium

Confidence: 94%
Finding: This section describes intercepting user messages and routing captured correction content into per-agent learning stores. In this skill's context, that is more dangerous because captured content can influence later automated A/B/C/D workflows, enabling persistent prompt injection, privacy leakage between sessions, and tainted decision-making in agent execution.

Ssd 3

Medium

Confidence: 90%
Finding: The quick-start and component reference operationalize persistent logs of corrections, errors, and feature requests, making the risky data-retention behavior easy to deploy. In an agent framework, such durable logs can accumulate secrets, personal data, and malicious prompt content that later contaminates agent memory or downstream automations.

VirusTotal

VirusTotal findings are pending for this skill version.