Self Improving Agent

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed local note-taking and reminder workflow for agent learnings, with opt-in hooks and no evidence of exfiltration or hidden destructive behavior.

Install this if you want a local workflow for recording agent mistakes, corrections, and useful patterns. Keep hooks project-scoped where possible, review hook scripts before enabling them, avoid logging secrets or full transcripts, and use the PostToolUse error detector only if you are comfortable with it reading command output for error patterns.
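The advice to avoid logging secrets or full transcripts can be sketched as a small sanitizing step applied before a note is written. This is a minimal illustration, not part of the skill: the function name `sanitize_note`, the patterns, and the 200-character cap are all assumptions, and real secrets take many forms this does not catch.

```python
import re

# Hypothetical patterns (not from the skill): they only catch obvious
# key=value or key: value secrets with a recognizable name.
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[=:]\s*)\S+"),
]

MAX_NOTE_CHARS = 200  # assumed cap so full transcripts are never stored


def sanitize_note(text: str) -> str:
    """Mask obvious secrets and truncate long text before it is logged."""
    for pattern in SECRET_PATTERNS:
        # Keep the key name and separator, replace only the value.
        text = pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    if len(text) > MAX_NOTE_CHARS:
        text = text[:MAX_NOTE_CHARS] + " [truncated]"
    return text
```

A filter like this reduces accidental leakage but does not replace reviewing what the hooks actually record.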

SkillSpector

By NVIDIA

Vulnerability Patterns

Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (4)

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The guide's security section states that hook scripts only output text and do not modify files or run commands, but the same document references an extract script that creates a skill scaffold. This creates a misleading trust boundary: operators may enable these hooks believing they are non-mutating, when the documented tooling can perform filesystem changes or execute additional logic with the agent's permissions.

Vague Triggers

Medium

Confidence: 80%
Finding: The activation guidance is broad enough that the skill may be invoked during many ordinary interactions, not just true error/correction events. In practice this can create excessive automatic logging, prompt pollution, and unnecessary writes to persistent files, especially when combined with hooks.

Vague Triggers

Medium

Confidence: 84%
Finding: The detection triggers rely on common phrases like 'Can you also' or 'Actually,' which frequently appear in benign conversation. Without stronger scope constraints, this can misclassify routine dialogue as learnings, errors, or feature requests and cause persistent recording of user interactions that were never intended for storage.
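The misclassification risk can be illustrated with a short sketch. Only the two phrases come from the finding; the substring matching and the `scoped_trigger` rule (requiring a failed tool call as extra context) are hypothetical and may not reflect how the skill detects events.

```python
# Phrases quoted in the finding; the matching logic below is an assumption.
TRIGGER_PHRASES = ["can you also", "actually,"]


def naive_trigger(message: str) -> bool:
    """Fire whenever a trigger phrase appears anywhere in the message."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in TRIGGER_PHRASES)


def scoped_trigger(message: str, previous_tool_failed: bool) -> bool:
    """Hypothetical stricter rule: also require that a tool call just failed."""
    return previous_tool_failed and naive_trigger(message)


benign = "Actually, the weather is nice today"
correction = "Actually, that command is wrong"

print(naive_trigger(benign))            # fires on routine dialogue
print(scoped_trigger(benign, False))    # suppressed without a failure signal
print(scoped_trigger(correction, True)) # fires when a failure preceded it
```

Phrase matching alone cannot tell the benign sentence from the correction, which is why an additional scope constraint of some kind is needed.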

Session Persistence

Medium

Category: Rogue Agent
Content: ### Option 1: Project-Level Configuration Create `.claude/settings.json` in your project root: ```json {
Confidence: 84%
Finding: Create `.claude/settings.json` in your project root: ```json { "hooks": { "UserPromptSubmit": [ { "matcher": "", "hooks": [ { "type": "command",
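Before enabling a configuration like the one excerpted above, it can help to list which hooks a settings file registers. The sketch below relies only on the keys visible in the excerpt (`hooks`, the event name, `matcher`, `type`); the function name `list_hooks` is hypothetical, and the rest of each hook entry is not inspected because the excerpt is truncated.

```python
def list_hooks(settings: dict) -> list[tuple[str, str, str]]:
    """Return (event, matcher, type) for every hook entry in parsed settings."""
    found = []
    for event, groups in settings.get("hooks", {}).items():
        for group in groups:
            matcher = group.get("matcher", "")
            for hook in group.get("hooks", []):
                found.append((event, matcher, hook.get("type", "")))
    return found


# Example input mirroring the shape shown in the excerpt.
example = {
    "hooks": {
        "UserPromptSubmit": [
            {"matcher": "", "hooks": [{"type": "command"}]}
        ]
    }
}
print(list_hooks(example))
```

To audit a real project, the dictionary could be loaded with `json.loads(Path(".claude/settings.json").read_text())`. An empty `matcher` alongside a `command` hook is worth a closer look, since the hook then persists across every matching event in the session.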

VirusTotal

64/64 vendors flagged this skill as clean.
