developer-self-improve-core

Security checks across malware telemetry and agentic risk

Overview

This skill mostly matches its stated memory-helper purpose, but it includes an under-scoped auto-confirm path that can persistently change future agent behavior from free-form message text.

Install only if you are comfortable with a skill that can store persistent rules affecting future agent behavior. Do not wire auto-confirm to chat or DingTalk replies unless you add a stricter approval step. Keep reminders disabled until tested, review proposal files before confirming, and back up the memory directory before enabling cleanup.
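As one way to follow the backup advice above, the sketch below copies a memory directory to a timestamped location before cleanup is enabled. The function name `backup_memory_dir` and both paths are hypothetical; the report does not say where this skill keeps its memory files.

```python
import shutil
import time
from pathlib import Path


def backup_memory_dir(memory_dir: Path, backup_root: Path) -> Path:
    """Copy memory_dir into backup_root under a timestamped name.

    Both arguments are assumptions for illustration; substitute the
    directory the skill actually uses for its memory and rule files.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{memory_dir.name}-{stamp}"
    # copytree creates the target (and missing parents) and fails if the
    # target already exists, so an earlier backup is never overwritten.
    shutil.copytree(memory_dir, target)
    return target
```

Run it once before turning cleanup on, and keep the returned path so the directory can be restored if cleanup removes something it should not.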

SkillSpector

By NVIDIA
Vulnerability Patterns
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (2)

Description-Behavior Mismatch

High
Confidence: 97%
Finding
The script can write directly into the long-term rules store via confirm_rule, which contradicts the stated safety principle that AI only proposes and humans make the final decision. More importantly, the auto_confirm path can derive approval intent from free-form input and then invoke this state-changing function, so long-term memory can be modified based on ambiguous or spoofed text rather than an explicit, separately validated human approval step.
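A stricter approval step could look roughly like the sketch below, which accepts only an exact command naming the proposal and a one-time token, instead of inferring intent from free-form text. The command format `CONFIRM <proposal_id> <token>` and the helpers `issue_token` and `is_explicit_approval` are hypothetical and are not part of the skill. Only `confirm_rule` is named in the finding, and it would be called after this check passes.

```python
import hmac
import re
import secrets

# Hypothetical approval command: "CONFIRM <proposal_id> <token>".
# Anything else, including a friendly "ok" or "yes", is not approval.
_APPROVAL_RE = re.compile(
    r"CONFIRM (?P<pid>[A-Za-z0-9_-]+) (?P<token>[0-9a-f]{8})"
)


def issue_token() -> str:
    """Create a one-time token to show the human alongside the proposal."""
    return secrets.token_hex(4)  # 8 lowercase hex characters


def is_explicit_approval(message: str, proposal_id: str, expected_token: str) -> bool:
    """Return True only for an exact, token-bearing approval of this proposal."""
    match = _APPROVAL_RE.fullmatch(message.strip())
    if match is None:
        return False
    if match.group("pid") != proposal_id:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(match.group("token"), expected_token)
```

Because the whole message must match, approval text embedded in a longer or quoted reply is rejected, which narrows the spoofing path the finding describes. The token should be shown to the human through a channel the agent's input cannot forge, and discarded after one use.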

Intent-Code Divergence

Medium
Confidence: 89%
Finding
The help text markets the skill as "safe and controllable" (“安全可控”) and describes auto-confirm as simple intent recognition, but that command performs a real state-changing action on long-term memory. This is dangerous because it can mislead operators into invoking a high-impact command they may believe is merely advisory, reducing scrutiny around irreversible or persistent changes.
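One possible mitigation is to label the command as state-changing in its help text and make it a dry run unless the operator opts in. The sketch below uses `argparse`. The program name `memory-helper`, the `--apply` flag, and the `plan` helper are assumptions for illustration and do not describe the skill's actual CLI.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI layout; only the "auto-confirm" name comes from the report.
    parser = argparse.ArgumentParser(prog="memory-helper")
    sub = parser.add_subparsers(dest="command", required=True)
    auto = sub.add_parser(
        "auto-confirm",
        help="WRITES to long-term rules: confirms a pending proposal",
    )
    auto.add_argument("proposal_id")
    auto.add_argument(
        "--apply",
        action="store_true",
        help="perform the write; without this flag the command is a dry run",
    )
    return parser


def plan(argv: list[str]) -> str:
    """Return 'write' only when the operator passed --apply explicitly."""
    args = build_parser().parse_args(argv)
    return "write" if args.apply else "dry-run"
```

With this shape, `plan(["auto-confirm", "p-17"])` yields a dry run, and the write happens only when `--apply` is present, so the help output and the default behavior both signal that the command is not advisory.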

VirusTotal

66/66 vendors flagged this skill as clean.
