developer-self-improve-core

Security checks across malware telemetry and agentic risk

Overview

This skill mostly matches its stated memory-helper purpose, but it includes an under-scoped auto-confirm path that can persistently change future agent behavior from free-form message text.

Install only if you are comfortable with a skill that can store persistent rules affecting future agent behavior. Avoid wiring auto-confirm to chat or DingTalk replies unless you add a stricter approval step. Keep reminders disabled until tested, review proposal files before confirming, and back up the memory directory before enabling cleanup.
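As an illustration of the backup advice, here is a minimal sketch that copies a memory directory to a timestamped location before cleanup is enabled. The function name `backup_memory_dir` and both paths are hypothetical; the report does not state where the skill keeps its memory files, so substitute the real location.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_memory_dir(memory_dir: str, backup_root: str) -> Path:
    """Copy memory_dir into backup_root under a timestamped name.

    Both arguments are assumptions for illustration; the skill's real
    memory location is not given in the report.
    """
    src = Path(memory_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"memory directory not found: {src}")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"
    # copytree creates dest (and missing parents) and fails if dest already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest
```

Running this once before enabling cleanup gives a copy to restore from if cleanup removes rules you still need.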

SkillSpector

By NVIDIA

Vulnerability Patterns

MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (2)

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The script can write directly into the long-term rules store via confirm_rule, which contradicts the stated safety principle that AI only proposes and humans make the final decision. More importantly, the auto_confirm path can derive approval intent from free-form input and then invoke this state-changing function, so long-term memory can be modified based on ambiguous or spoofed text rather than an explicit, separately validated human approval step.
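One way to picture the "explicit, separately validated human approval step" this finding calls for is the sketch below. It is not the skill's code: `issue_approval_token`, `is_explicit_approval`, the `CONFIRM` message format, and the in-memory `_pending` store are all hypothetical. The idea is that a reply counts as approval only if it exactly matches a one-time token issued for a specific proposal, so free-form text such as "ok, sounds good" can never reach a state-changing call like `confirm_rule`.

```python
import hmac
import secrets

# Hypothetical store of pending proposals: proposal_id -> one-time token.
_pending: dict[str, str] = {}


def issue_approval_token(proposal_id: str) -> str:
    """Create a one-time token the human must echo back to approve a proposal."""
    token = secrets.token_hex(4)
    _pending[proposal_id] = token
    return token


def is_explicit_approval(message: str, proposal_id: str) -> bool:
    """Accept only the exact string 'CONFIRM <proposal_id> <token>'."""
    token = _pending.get(proposal_id)
    if token is None:
        return False  # nothing pending, or the token was already used
    expected = f"CONFIRM {proposal_id} {token}"
    # Constant-time comparison on bytes; no intent guessing from free text.
    if hmac.compare_digest(message.strip().encode(), expected.encode()):
        del _pending[proposal_id]  # tokens are single use
        return True
    return False
```

In this sketch a caller would invoke the rule-writing function only when `is_explicit_approval` returns `True`; any other reply leaves the proposal pending. Whether the token should also expire after some time is a design choice left out here.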

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The help text markets the skill as “安全可控” ("safe and controllable") and describes auto-confirm as simple intent recognition, but that command performs a real state-changing action on long-term memory. This is dangerous because it can mislead operators into invoking a high-impact command they may believe is merely advisory, reducing scrutiny around irreversible or persistent changes.

VirusTotal

66/66 vendors flagged this skill as clean.