perrytest

Security checks across malware telemetry and agentic risk

Overview

This looks like a legitimate self-improvement logging skill, but it needs review because it can persist conversation details, promote them into future agent instructions, and share learnings across sessions without enough privacy safeguards.

Install only if you are comfortable with an agent keeping long-term local notes from conversations and using them to influence future sessions. Prefer project-level setup over global hooks, review hook scripts before enabling them, and do not allow raw prompts, secrets, credentials, customer data, personal data, or full transcripts to be written into .learnings or promoted into agent memory files.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (8)

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The document states that the hook scripts 'only output text' and 'don't modify files or run commands,' but the configuration explicitly executes shell scripts as hook commands. This misrepresents the trust boundary and can cause operators to enable executable hooks under a false sense of safety, increasing the chance of arbitrary code execution if those scripts are changed, replaced, or behave unexpectedly.
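One way to reduce the risk of hook scripts being changed or replaced after review is to pin their hashes and re-check them before enabling the hooks. The sketch below is a minimal illustration, not part of the skill: the function names and the `hook.sh` file name in the usage note are hypothetical, and it assumes the reviewed scripts live under a single root directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_scripts(pinned: dict[str, str], root: Path) -> list[str]:
    """List pinned scripts that are missing or whose digest no longer matches.

    `pinned` maps a path relative to `root` to the digest recorded at review time.
    """
    changed = []
    for rel_path, expected in sorted(pinned.items()):
        script = root / rel_path
        if not script.is_file() or sha256_of(script) != expected:
            changed.append(rel_path)
    return changed
```

In use, an operator would record `{"hook.sh": sha256_of(root / "hook.sh")}` after reading the script, store that mapping outside the skill directory, and refuse to enable the hook whenever `changed_scripts` returns a non-empty list.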

Vague Triggers

Medium

Confidence: 82%
Finding: The manifest description defines activation in very broad terms, including many normal failures, corrections, and realizations. In agents that auto-load skills from descriptions, this can cause the skill to trigger on routine interactions and unnecessarily write persistent logs or influence behavior far more often than users expect.

Vague Triggers

Medium

Confidence: 89%
Finding: The detection triggers rely on common conversational phrases like corrections, feature asks, and uncertainty cues that appear frequently in benign chat. In an auto-activation environment, this can lead to over-triggering, excessive persistence of conversational content, and unintended collection of user statements into project memory.

Vague Triggers

Medium

Confidence: 88%
Finding: An empty matcher causes the hook to fire on every prompt, which broadens the trigger surface and makes the self-improvement mechanism effectively always-on. In this skill context, that increases prompt-scope persistence and the chance that sensitive or irrelevant user content is repeatedly processed by hook logic, especially if the script behavior evolves beyond simple text output.
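The empty-matcher condition can be checked mechanically before a configuration is enabled. The report does not show the actual settings file, so the sketch below assumes a layout in which each hook event maps to a list of entries, each with a `matcher` string and a `hooks` list of commands. The function name, the event name, and the script path in the example are all hypothetical.

```python
import json


def broad_hooks(settings: dict) -> list[tuple[str, str]]:
    """Return (event, command) pairs for hooks whose matcher is empty or missing."""
    findings = []
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            matcher = str(entry.get("matcher") or "").strip()
            if matcher == "":
                # No constraint: under the assumed layout this entry fires every time.
                for hook in entry.get("hooks", []):
                    findings.append((event, hook.get("command", "")))
    return findings


# Hypothetical settings fragment, used only to exercise the check.
example = json.loads("""
{
  "hooks": {
    "UserPromptSubmit": [
      {"matcher": "", "hooks": [{"type": "command", "command": "./scripts/activator.sh"}]}
    ]
  }
}
""")

for event, command in broad_hooks(example):
    print(f"{event}: unconstrained hook runs {command}")
```

Running the same check against both the project-level and the user-level settings files would show whether the always-on behavior is confined to one project or applies to every session.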

Vague Triggers

Medium

Confidence: 90%
Finding: The user-level configuration enables the hook globally for all sessions without meaningful trigger constraints. That expands persistence across projects and contexts, making accidental cross-context influence and broad exposure of prompt content more likely if the hook script is unsafe, compromised, or simply too aggressive.

Ssd 3

Medium

Confidence: 90%
Finding: The skill instructs the agent to persist user corrections and conversation-derived learnings into local files and potentially promote them into durable memory files. That creates a clear data retention risk because natural-language interactions may contain secrets, proprietary information, personal data, or sensitive business context that should not be stored long-term.

Ssd 3

High

Confidence: 95%
Finding: The cross-session tooling explicitly encourages reading other sessions' transcripts and forwarding learnings between sessions. If those transcripts contain secrets or sensitive user data, this pattern can spread information beyond its original context and violate least-privilege and data-isolation expectations.

Ssd 3

High

Confidence: 96%
Finding: The logging templates request full context, inputs, parameters, environment details, and user context, all of which commonly contain API keys, tokens, internal paths, customer data, or operational secrets. Because the skill persists this material in markdown files, it materially increases the chance of sensitive-data retention and later leakage through commits, sharing, or indexing.
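A redaction pass applied to every entry before it is written to the markdown files would lower this risk, though it cannot eliminate it. The sketch below is an illustration under stated assumptions: the `redact` function and its rule list are hypothetical, and the three patterns (bearer tokens, strings shaped like AWS access key IDs, and `key=value` style assignments) cover only a small sample of real secret formats.

```python
import re

# Illustrative rules only; real secret formats vary widely and need a fuller list.
_RULES = [
    # Bearer tokens in headers or logs.
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
    # Strings shaped like AWS access key IDs.
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED]"),
    # Assignments such as API_KEY=..., token: ..., password = ...
    (
        re.compile(r"(?i)\b(api[_-]?key|token|secret|password)(\s*[:=]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
]


def redact(text: str) -> str:
    """Apply each rule in order and return the text with matches masked."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text


print(redact("Request failed with API_KEY=sk-12345 and Authorization: Bearer abc.def"))
```

Pattern-based redaction misses anything it has no rule for, such as customer names or internal paths, so it complements rather than replaces the advice to keep raw prompts and full transcripts out of the learning files.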

VirusTotal

64/64 vendors flagged this skill as clean.
