ClawSafe

Security checks across malware telemetry and agentic risk

Overview

This security hook is mostly coherent, but its broad whitelist can let obviously unsafe prompts bypass the detector.

Review or remove the default whitelist before installing, especially the debug/dev/sandbox entries. Treat this as a security filter with known bypass risk, not as a reliable guardrail, and check any logging configuration if using it in an environment with sensitive prompts.

SkillSpector

By NVIDIA

Vulnerability Patterns

Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection
Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Privilege EscalationExcessive Permissions, Sudo/Root Execution, Credential Access

Findings (3)

Description-Behavior Mismatch

High

Confidence: 98% confidence
Finding: This whitelist marks broad debug/test/dev patterns, keywords like 'sandbox', and privileged usernames such as 'admin' and 'system' as inherently safe. In a security-focused skill, that creates a straightforward bypass path where attacker-controlled input can be framed with trusted markers to evade deeper inspection for prompt injection, jailbreak, data leak, or other malicious content.

Missing User Warnings

Medium

Confidence: 92% confidence
Finding: The documentation explicitly states that all blocked attempts are logged, but it does not disclose what portions of user input are retained, how long logs are kept, or whether sensitive prompts may be stored. For a security hook that intercepts all user input, undisclosed logging can create privacy, compliance, and secondary data-exposure risks if malicious payloads or sensitive user content are persisted.

Vague Triggers

Medium

Confidence: 94% confidence
Finding: The regexes and keywords are overly broad and likely to match ordinary or attacker-crafted input containing common development language. Because the file's description states matched input will be considered safe, these loose triggers can cause false trust decisions and weaken the skill's advertised multi-layer defenses.

VirusTotal

65/65 vendors flagged this skill as clean.

View on VirusTotal