Retro

Security checks across malware telemetry and agentic risk

Overview

The skill is a mostly legitimate retrospective tool, but it can run project commands and make persistent repository or shared-log changes without clear user approval.

Install only if you are comfortable with a retrospective skill that can read project history/logs, run configured build or test commands, edit files, and create commits. Use it on trusted repositories or in a sandbox, review any CLAUDE.md and evolution-log changes before they are written, and avoid the factory/evolution-log phase unless you want cross-project notes persisted.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (8)

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The skill expands from retrospective analysis into directly modifying CLAUDE.md and instructing a git commit, which changes persistent project state beyond the narrowly implied 'retro' function. This is dangerous because a user invoking a postmortem tool may not expect documentation rewrites and commit-oriented actions, enabling unintended repository modifications and durable instruction drift.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The optional factory-critic phase broadens scope from analyzing the current project's pipeline to auditing external factory assets and writing evolution logs, including possible global files. That scope expansion increases the blast radius and can cause unauthorized inspection or modification of tooling outside the user's immediate intent.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The skill directs running tests and builds during fallback analysis, which turns a retrospective/log-review tool into an execution-capable workflow. Executing project-defined commands can trigger arbitrary code, network access, or destructive scripts, especially in untrusted repositories.
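One way to limit this exposure is to run only commands the user has approved in advance. The sketch below is a minimal illustration, not part of the skill: the function name `run_if_approved` and the idea of an approved-command set are assumptions made here. It does not sandbox anything; an approved command can still execute arbitrary project code, so it complements a sandbox and does not replace one.

```python
import subprocess
from typing import Iterable, Sequence, Tuple


def run_if_approved(
    cmd: Sequence[str],
    approved: Iterable[Tuple[str, ...]],
    timeout: float = 60.0,
) -> subprocess.CompletedProcess:
    """Run cmd only if its exact argument vector was approved beforehand.

    Hypothetical helper for illustration. The command is passed as a list
    and executed without a shell, so shell metacharacters in arguments are
    not interpreted.
    """
    if tuple(cmd) not in set(approved):
        raise PermissionError(f"command not approved: {list(cmd)!r}")
    return subprocess.run(
        list(cmd),
        capture_output=True,  # collect stdout/stderr instead of streaming them
        text=True,
        timeout=timeout,      # stop builds or tests that hang
        check=False,          # caller inspects returncode itself
    )
```

Matching the full argument vector, not just the program name, means that approving one test invocation does not implicitly approve the same tool with different flags.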

Intent-Code Divergence

High

Confidence: 96%
Finding: The skill's description frames patches as suggestions, but later instructions tell the agent to apply edits directly. This mismatch is dangerous because it defeats user expectations and can lead to silent repository changes under the guise of analysis-only behavior.

Vague Triggers

Medium

Confidence: 84%
Finding: The trigger phrase 'retro' is extremely broad and likely to match casual conversation, increasing the chance that the skill activates unexpectedly. Because the skill has Write/Edit/Bash capabilities and can evolve into modifying docs and running commands, accidental invocation materially raises risk.
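A narrower trigger could require an explicit command instead of matching the bare word. The following is a minimal sketch under that assumption: the `/retro` command form and the name `should_activate` are hypothetical, since this report does not show how the skill's trigger is actually evaluated.

```python
import re

# Hypothetical explicit command form; the skill's real trigger syntax
# is not shown in this report.
EXPLICIT_TRIGGER = re.compile(r"^\s*/retro(\s|$)", re.IGNORECASE)


def should_activate(message: str) -> bool:
    """Return True only when the message starts with the explicit command."""
    return bool(EXPLICIT_TRIGGER.match(message))
```

With this check, `should_activate("/retro last sprint")` is true, while a passing mention such as "that retro was fun" or a longer word such as "/retrofit" does not activate the skill.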

Missing User Warnings

High

Confidence: 98%
Finding: The skill instructs writing and committing CLAUDE.md changes without an explicit user warning or approval checkpoint. Persistent changes to guidance files can alter future agent behavior and encode sensitive or incorrect information, while the commit step further cements unintended changes.
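An approval checkpoint of the kind this finding calls for could look like the sketch below. It is an illustration only, under stated assumptions: the helper `propose_and_write` and its `confirm` callback are hypothetical names introduced here, and the skill itself is not known to contain anything similar. The idea is to show the user a diff and write the file only after an explicit yes; any commit step would sit behind the same gate.

```python
import difflib
from pathlib import Path
from typing import Callable


def propose_and_write(
    path: Path,
    new_text: str,
    confirm: Callable[[str], bool],
) -> bool:
    """Write new_text to path only if confirm(diff) returns True.

    Hypothetical helper for illustration. Returns True when the file was
    written, False when there was nothing to change or the user declined.
    """
    old_text = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=f"{path} (current)",
            tofile=f"{path} (proposed)",
        )
    )
    if not diff:
        return False  # proposed text is identical to what is on disk
    if not confirm(diff):
        return False  # user declined; leave the file untouched
    path.write_text(new_text, encoding="utf-8")
    return True
```

In an interactive session, `confirm` might print the diff and read a yes/no answer; passing it in as a callback keeps the gate testable and makes it impossible to reach the write without going through the approval step.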

Ssd 3

Medium

Confidence: 93%
Finding: Appending findings to project and especially global evolution logs creates a natural-language persistence channel for pipeline details, defects, and contextual observations. This can leak sensitive information across projects and make prior analysis available in later contexts where it does not belong.

Ssd 3

Medium

Confidence: 92%
Finding: Revising CLAUDE.md with 'learnings' from the retro can preserve run-specific failures, operational context, or sensitive details in a durable instruction file consumed by future agents. This creates semantic data retention and can contaminate future sessions with unnecessary or confidential context.

VirusTotal

60/60 vendors flagged this skill as clean.
