Agent Failure Loop

Security checks across malware telemetry and agentic risk

Overview

This skill is transparent about its purpose, but it can automatically turn local failure notes into persistent agent rules without a required human approval step.

Install only if you intentionally want a self-modifying agent-rule workflow. Use --dry-run first, inspect .learnings/promotable.json and every proposed rule, avoid recording secrets or sensitive user text in failures, keep outputs local unless sanitized, and do not enable cron or automatic promotion until you are comfortable with the review and rollback process.
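The inspection step above can be partly scripted. The sketch below is a minimal, schema-agnostic reader: the report does not describe the layout of .learnings/promotable.json, so the code assumes only that the file is valid JSON and prints each top-level entry for manual review. The function name `list_proposals` is hypothetical and not part of the skill.

```python
import json
from pathlib import Path


def list_proposals(path: Path = Path(".learnings/promotable.json")) -> list[str]:
    """Return each top-level entry of the file as pretty-printed JSON text.

    Assumption: the file is valid JSON. Nothing is assumed about its fields.
    """
    data = json.loads(path.read_text(encoding="utf-8"))
    # Treat a single top-level object as a one-item list so both shapes work.
    items = data if isinstance(data, list) else [data]
    return [json.dumps(item, indent=2, sort_keys=True, ensure_ascii=False)
            for item in items]


if __name__ == "__main__":
    default = Path(".learnings/promotable.json")
    if default.exists():
        for number, text in enumerate(list_proposals(default), start=1):
            print(f"--- proposal {number} ---")
            print(text)
```

Reading the entries this way changes nothing on disk, so it can be run after a --dry-run pass and before any rule is promoted.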

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep

Findings (5)

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill promotes rules by automatically editing AGENTS.md, CLAUDE.md, or other target files, but the workflow does not emphasize a strong consent checkpoint before modifying these high-influence instruction files. Silent or routine modification of agent control documents can permanently change future agent behavior and creates an integrity risk if incorrect, overbroad, or sensitive content is promoted.
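One way to picture the consent checkpoint this finding asks for is sketched below. It is not the skill's implementation: `promote_rule` and its `approve` callback are hypothetical names introduced for illustration. The idea is that the proposed change to a file such as AGENTS.md is rendered as a diff and nothing is written unless the callback explicitly accepts it.

```python
import difflib
from pathlib import Path
from typing import Callable


def promote_rule(target: Path, rule: str, approve: Callable[[str], bool]) -> bool:
    """Append `rule` to `target` only if `approve` accepts the proposed diff."""
    old = target.read_text(encoding="utf-8") if target.exists() else ""
    # Make sure the new rule starts on its own line.
    separator = "" if (not old or old.endswith("\n")) else "\n"
    new = old + separator + rule.rstrip("\n") + "\n"
    diff = "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=str(target),
        tofile=f"{target} (proposed)",
    ))
    if not approve(diff):
        return False  # no explicit consent, so the file is left untouched
    target.write_text(new, encoding="utf-8")
    return True
```

For interactive use, `approve` could be something like `lambda diff: input(diff + "\nApply? [y/N] ").strip().lower() == "y"`, so that the default answer is to decline.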

Ssd 3

Medium

Confidence: 96%
Finding: The skill instructs the agent to persist failures, user corrections, and task details for future reuse. Those records can easily contain sensitive prompts, file paths, operational context, credentials-related troubleshooting details, or user-specific preferences that may later be surfaced to unrelated tasks or users.

Ssd 3

Medium

Confidence: 97%
Finding: The mandatory recording protocol tells the agent to log broad failure context and corrections into persistent markdown files and then potentially promote distilled lessons into global rule files. This creates a durable memory channel that can retain sensitive information and elevate it into future agent instructions, increasing both disclosure and instruction-poisoning risk.

Ssd 3

Medium

Confidence: 94%
Finding: The guidance to commit .learnings data to git encourages distribution of accumulated operational history across a team and potentially into remote repositories. Since these files summarize failures, causes, and lessons, they may expose sensitive internal workflows, mistakes, user-derived information, or security-relevant troubleshooting patterns.

Ssd 3

Medium

Confidence: 92%
Finding: The script copies free-text failure data such as titles, causes, lessons, and raw parsed content into multiple artifacts under .learnings/ without any redaction or trust boundary checks. If failure logs contain secrets, sensitive prompts, user data, tokens, file paths, or internal incident details, this design amplifies exposure by replicating the data into additional files that may be committed, indexed, or shared.
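The missing redaction step could look roughly like the sketch below, which scrubs free text before it is copied into any artifact under .learnings/. The patterns and the `redact` function are illustrative assumptions, not part of the skill, and pattern matching of this kind catches only obvious secrets; it does not replace manual review.

```python
import re

# Hypothetical patterns for illustration; tune them to the secret formats
# that actually occur in your environment.
REDACTIONS = [
    # key=value or key: value pairs whose key name suggests a credential
    (re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)\S+"),
     r"\1\2[REDACTED]"),
    # strings shaped like an AWS access key ID
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED]"),
]


def redact(text: str) -> str:
    """Replace likely secrets with a placeholder before the text is persisted."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text


print(redact("login failed: token=abc123 rejected"))
# prints: login failed: token=[REDACTED] rejected
```

Applying the filter at the single point where failure text is first recorded keeps the unredacted form from being replicated into the derived files.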

VirusTotal

67/67 vendors flagged this skill as clean.
