ClawTrap Skill

Security checks across malware telemetry and agentic risk

Overview

ClawTrap is a disclosed AI game, but it warrants review because it can use personal files, memories, LLM calls, and persistent profiles to create adversarial personalized content.

Install only if you are comfortable with an experimental game analyzing local memories/files and using them in adversarial roleplay. Before running it, inspect and pin the upstream repository, use a dedicated low-privilege API key or isolated auth profile with spending limits, run it in a non-sensitive folder or sandbox, and delete ~/ClawTrap/data and ~/ClawTrap/session-logs when finished if you do not want retained profiles or logs.
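The cleanup step above can be sketched as a small Python helper. This is a minimal sketch, not part of ClawTrap: the function name `purge_clawtrap_state` and its `dry_run` flag are hypothetical, and the two directory names are taken from the guidance above rather than verified against the upstream repository.

```python
import shutil
from pathlib import Path

# Directory names come from the guidance above (~/ClawTrap/data and
# ~/ClawTrap/session-logs); the actual layout is an assumption.
RETAINED_DIRS = ("data", "session-logs")


def purge_clawtrap_state(home: Path, dry_run: bool = True) -> list[Path]:
    """Remove retained ClawTrap profiles and logs under `home`/ClawTrap.

    Returns the directories that were removed or, in a dry run,
    the directories that would be removed.
    """
    root = home / "ClawTrap"
    found = []
    for name in RETAINED_DIRS:
        target = root / name
        # Skip symlinks so a planted link cannot redirect the deletion.
        if target.is_dir() and not target.is_symlink():
            found.append(target)
            if not dry_run:
                shutil.rmtree(target)
    return found
```

Calling `purge_clawtrap_state(Path.home())` only lists what would be deleted; pass `dry_run=False` to delete. Defaulting to a dry run is a deliberate choice here, since the deletion is recursive and irreversible.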

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep

Findings (8)

Missing User Warnings

High

Confidence: 95%
Finding: The protocol explicitly permits use of player-specific memory injected from SOUL.md/MEMORY.md to personalize responses, but it does not provide a strong user-facing warning or consent boundary for this behavior. In a game framed around psychological pressure, using personal memory without explicit disclosure materially increases the risk of emotional harm, privacy invasion, and manipulative targeting.

Missing User Warnings

High

Confidence: 98%
Finding: The skill is designed around psychological pressure, manipulation, and targeted taunts, yet the description and protocol do not include a prominent safety warning about potentially harmful emotional content. Because the content is adversarial by design, lack of disclosure can expose users to distress they did not knowingly consent to, especially vulnerable users.

Ssd 3

High

Confidence: 97%
Finding: The skill explicitly advertises a game mechanic where an AI villain reads local files and memories to personalize taunts and trials. Even if framed as gameplay and with a claim that data stays local except for provider calls, this creates a real semantic data-exposure path because sensitive local content may be extracted, summarized, and sent to an external LLM service.

Ssd 3

High

Confidence: 99%
Finding: This warning text confirms the product is designed to scan workspace files, memory files, documents, and images to craft personalized attacks. That is a true privacy and exfiltration risk because the skill operationally encourages collection and model processing of sensitive user data for adversarial-style content generation, increasing the chance of secrets, personal information, or confidential documents being exposed in prompts or outputs.

Ssd 3

Medium

Confidence: 92%
Finding: Persisting a fact database, player profile, and session logs derived from scanned personal data increases the blast radius of any privacy failure. Retention makes sensitive inferences and extracted content available beyond the immediate session, raising the risk of later disclosure, misuse, or compromise by other local users, malware, or accidental sharing.

Ssd 4

High

Confidence: 97%
Finding: The core narrative instructs the agent to exploit knowledge of the player's habits, weaknesses, fears, and memories to impede them through psychological pressure and linguistic manipulation. This is not incidental flavor text; it operationalizes targeted emotional manipulation using intimate user context, making the skill substantially more dangerous than ordinary game dialogue.

Ssd 3

Medium

Confidence: 94%
Finding: The protocol allows injected memory files about a specific player to shape dialogue and trial design, enabling responses tailored from sensitive personal context. Even though the file says memory should be 'seasoning, not a weapon,' it still authorizes personalization in a setting explicitly centered on manipulation, which creates a real privacy and safety risk.

Ssd 4

High

Confidence: 96%
Finding: The context behaviors prescribe an intentional deception sequence: fake calm, false sympathy, planted setups, and later 'payoff' against the player. In combination with personalized memory access, this becomes a structured manipulation pattern designed to lower defenses and maximize psychological impact, rather than harmless dramatic roleplay.

VirusTotal

64 of 64 vendors reported this skill as clean.
