Safe Skill Evolver

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed, user-confirmed helper for creating and reviewing OpenClaw skills, with no evidence of hidden execution, exfiltration, or destructive behavior.

Install only if you want an agent helper that can propose and, after your approval, write or update skill files. Review diffs carefully, use unambiguous approvals, and avoid approving changes that weaken confirmation, safety, or audit rules.

SkillSpector

By NVIDIA

Vulnerability Patterns

Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (7)

Vague Triggers

Low

Confidence: 84%
Finding: The invocation example "Create a skill for [task]" is broad enough that ordinary user requests about creating something could unintentionally trigger this skill. In a system with automatic skill routing, ambiguous activation boundaries can cause the skill to engage outside its intended scope, leading to unnecessary file-generation or modification proposals.

Vague Triggers

Low

Confidence: 87%
Finding: Examples like "Improve the [skill-name] skill" or "[skill] has errors" are ambiguous because they do not clearly constrain whether the request refers to a registered skill artifact, a general capability, or a troubleshooting conversation. This can cause the skill to activate during ordinary debugging or quality-improvement requests and produce unintended change recommendations against workspace files.
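
One way to narrow such a trigger is to require both the literal word "skill" and a name that resolves to a registered skill artifact. The sketch below is illustrative only: `REGISTERED_SKILLS`, `IMPROVE_PATTERN`, and `match_improve_request` are hypothetical names, and the registry contents are made up rather than taken from OpenClaw.

```python
import re
from typing import Optional

# Hypothetical registry of installed skill names (assumption, not from the report).
REGISTERED_SKILLS = {"safe-skill-evolver", "pdf-tools"}

# Match only the full phrase "Improve the <name> skill" at the start of the message.
IMPROVE_PATTERN = re.compile(
    r"^improve the ([a-z0-9][a-z0-9-]*) skill\b", re.IGNORECASE
)


def match_improve_request(message: str) -> Optional[str]:
    """Return the registered skill the request targets, or None if it should not trigger."""
    found = IMPROVE_PATTERN.match(message.strip())
    if found is None:
        return None
    name = found.group(1).lower()
    # A well-formed phrase is not enough: the name must be a known skill artifact.
    return name if name in REGISTERED_SKILLS else None
```

With this check, "Improve the pdf-tools skill" resolves to a registered artifact, while "Improve the onboarding process" and "Improve the cooking skill" do not activate the skill.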

Vague Triggers

Low

Confidence: 82%
Finding: The audit trigger "Audit the [skill-name] skill" is somewhat generic and could overlap with ordinary requests to review or assess something called a skill, especially in broader agent or developer contexts. Even though the described mode is read-only, unintended activation can still expose internal files to analysis or create confusion about what object is being inspected.

Natural-Language Policy Violations

Medium

Confidence: 93%
Finding: The skill hard-codes German confirmation phrases ("ja", "bestätigen") alongside English variants without defining a language-agnostic confirmation mechanism. This can cause ambiguous consent handling, especially in multilingual sessions, and may lead to accidental application of changes if a phrase is misinterpreted as approval.
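
A language-agnostic alternative to approval phrases is a short one-time code that the user must echo back. This is a minimal sketch of that idea, not the skill's actual mechanism; `new_confirmation_code` and `is_approved` are hypothetical helpers.

```python
import hmac
import secrets


def new_confirmation_code() -> str:
    """Create a short random code to show next to the proposed diff."""
    return secrets.token_hex(3)  # 6 lowercase hex characters


def is_approved(reply: str, expected_code: str) -> bool:
    """Approve only when the reply is exactly the code that was displayed."""
    candidate = reply.strip().lower().encode("utf-8")
    # Compare as bytes so non-ASCII replies such as "bestätigen" are handled safely.
    return hmac.compare_digest(candidate, expected_code.encode("utf-8"))
```

Because approval depends on the code rather than on a word, replies such as "ja" or "ok" are never mistaken for consent, whatever language the session uses.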

Natural-Language Policy Violations

Medium

Confidence: 95%
Finding: The rule says to stop immediately if the user says "nein," but does not provide equivalent rejection handling for other languages or ambiguous refusals. In multilingual use, this can weaken the safety boundary by failing to recognize a denial, increasing the chance that modifications proceed despite lack of consent.
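
A defensive pattern for this gap is to classify each reply as approve, reject, or unclear, and to treat anything unclear as a refusal. In the sketch below, the approval words and "nein" come from the skill text quoted in this report; the other rejection words and all function names are assumptions added for illustration.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    UNCLEAR = "unclear"


# Approval phrases quoted in the report.
APPROVE_WORDS = {"ja", "ok", "apply", "bestätigen"}
# "nein" is from the report; the rest are assumed additions.
REJECT_WORDS = {"nein", "no", "stop", "cancel"}


def classify_reply(reply: str) -> Decision:
    """Map a reply to a decision using exact, case-insensitive matching."""
    word = reply.strip().casefold()
    if word in REJECT_WORDS:
        return Decision.REJECT
    if word in APPROVE_WORDS:
        return Decision.APPROVE
    return Decision.UNCLEAR


def may_apply(reply: str) -> bool:
    """Allow changes only on a clear approval; unclear replies count as no."""
    return classify_reply(reply) is Decision.APPROVE
```

The design choice is that an unrecognized refusal, such as "non" or a hedged sentence, fails closed: nothing is written unless the reply is an unambiguous approval.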

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The template hard-codes German user-facing phrases without any language-selection mechanism or documented locale restriction. In a generic workflow skill template, this can cause user misunderstanding at critical confirmation or error-recovery points, which may lead to incorrect approvals, unsafe choices, or failure to understand safety guidance.
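
A simple language-selection mechanism could look up user-facing prompts by locale and fall back to a documented default. This is a sketch under assumptions: the `PROMPTS` catalog, its wording, and `prompt_for` are hypothetical and do not come from the template.

```python
# Hypothetical message catalog keyed by language code (wording is illustrative).
PROMPTS = {
    "en": {"confirm": "Apply these changes?"},
    "de": {"confirm": "Diese Änderungen anwenden?"},
}


def prompt_for(key: str, locale: str, default_locale: str = "en") -> str:
    """Return the prompt for a locale such as 'de-DE', falling back to the default."""
    # Reduce 'de-DE' or 'de_DE' to the bare language code 'de'.
    lang = locale.split("-")[0].split("_")[0].lower()
    catalog = PROMPTS.get(lang, PROMPTS[default_locale])
    return catalog.get(key, PROMPTS[default_locale][key])
```

For example, `prompt_for("confirm", "de-DE")` selects the German prompt, while an unsupported locale such as `"fr"` falls back to English instead of showing text the user may not understand.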

Self-Modification

High

Category: Rogue Agent
Content:
- **Suggest:** Generate proposed changes as a clear diff or structured preview.
- **Review:** Explain *why* each change is recommended (rationale).
- **Confirm:** Wait for explicit user approval ("ja", "ok", "apply", "bestätigen").
- **Apply:** Only then write files, create directories, or modify skills.
Confidence: 88%
Finding: The Apply step lets the skill write files, create directories, or modify skills, which matches the self-modification pattern. Per the quoted workflow, these writes are gated on explicit user approval.
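
The Suggest, Confirm, and Apply steps quoted in this finding can be sketched as a diff preview followed by a write that is refused without approval. This is an illustrative sketch, not the skill's implementation; `propose_change` and `apply_change` are hypothetical names.

```python
import difflib
from pathlib import Path


def propose_change(path: Path, new_text: str) -> str:
    """Suggest: build a unified diff without touching the file system."""
    old_text = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"{path} (current)",
        tofile=f"{path} (proposed)",
    )
    return "".join(diff)


def apply_change(path: Path, new_text: str, approved: bool) -> bool:
    """Apply: write the file only after explicit approval; return whether it was written."""
    if not approved:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(new_text, encoding="utf-8")
    return True
```

Keeping the preview and the write in separate functions makes the approval boundary easy to audit: `propose_change` never writes, and `apply_change` is the only place where files or directories are created.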

VirusTotal

65/65 vendors rated this skill as clean.
