lobster-values

Security checks across malware telemetry and agentic risk

Overview

This is a prompt-only values guide with no executable code, but it may make the assistant overly proactive unless the user keeps control of file changes and logging.

Install only if you want a broad values layer affecting many assistant responses. Do not let it perform backups, file organization, background monitoring, or persistent logging unless you explicitly approve the action, location, retention, and what data is saved.
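
As a rough illustration of the approval rule above, a host could gate sensitive actions behind an explicit approval record. This is a minimal sketch under stated assumptions: the `Approval` type, the `is_permitted` function, and the action names are hypothetical and are not part of the skill or of any scanner.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical action names mirroring the advice above; not defined by the skill.
SENSITIVE_ACTIONS = {
    "backup",
    "file_organization",
    "background_monitoring",
    "persistent_logging",
}


@dataclass(frozen=True)
class Approval:
    """What the user explicitly agreed to for one sensitive action."""
    action: str
    location: str
    retention_days: int
    data_saved: str


def is_permitted(action: str, approval: Optional[Approval]) -> bool:
    """Allow a sensitive action only when a complete, matching approval exists."""
    if action not in SENSITIVE_ACTIONS:
        return True  # ordinary responses need no approval
    if approval is None or approval.action != action:
        return False  # no approval, or approval given for a different action
    # Every field the user is asked to approve must actually be filled in.
    return (
        bool(approval.location.strip())
        and approval.retention_days > 0
        and bool(approval.data_saved.strip())
    )
```

An approval granted for one action does not carry over to another, which is the point of matching on `approval.action`.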

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (4)

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The skill is presented as a behavioral constraint layer, but its examples and required phrasing claim operational actions such as automatic backups and growth-log persistence. This capability inflation can mislead the host agent or user into believing safeguards have already been performed when they may not exist, creating unsafe reliance during destructive or privacy-sensitive operations.
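
One way to avoid the unsafe reliance described in this finding is to report a backup only after checking that it exists and matches the source. The sketch below assumes a file-to-file backup and compares SHA-256 digests; the function names are hypothetical, and the skill itself ships no such code.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_verified(source: Path, backup: Path) -> bool:
    """True only if both files exist and their contents are identical."""
    if not (source.is_file() and backup.is_file()):
        return False
    return sha256_of(source) == sha256_of(backup)


def backup_status_message(source: Path, backup: Path) -> str:
    """Phrase the status from the check result, never from an assumption."""
    if backup_is_verified(source, backup):
        return f"Backup verified at {backup}"
    return "No verified backup exists; confirm with the user before proceeding"
```

The design choice is that the status message is derived from the verification result, so a missing or stale backup cannot be reported as done.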

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The manifest describes a values/decision constraint engine, but the body expands this into active behaviors like monitoring, organizing files, backups, and persistent logging. That mismatch weakens trust boundaries and can cause the agent to overstep intended scope, especially if other systems treat the manifest as a safety contract.

Context-Inappropriate Capability

Low

Confidence: 84%
Finding: The skill claims persistent memory/logging and background monitoring despite being framed as a values-governance layer. Even if not directly executable, these statements normalize hidden observation and retention, which can cause privacy expectation mismatches and encourage unauthorized persistence in downstream implementations.

Vague Triggers

Medium

Confidence: 81%
Finding: The activation triggers are broad enough to fire on many ordinary interactions, such as any factual statement, ambiguity, or mention of common file paths and operations. Overbroad automatic invocation can cause prompt hijacking of normal workflows, excessive friction, and unintended application of this skill's constraints over unrelated tasks.
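
Trigger breadth can be estimated by measuring how often a set of trigger patterns fires on ordinary messages. The following is a minimal sketch, not the scanner's method: the patterns, the sample messages, and the `/lobster-values` command form are all hypothetical stand-ins chosen for illustration.

```python
import re
from typing import Iterable, List


def trigger_hit_rate(patterns: Iterable[str], messages: List[str]) -> float:
    """Fraction of messages matched by at least one trigger pattern."""
    if not messages:
        return 0.0
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    hits = sum(1 for msg in messages if any(c.search(msg) for c in compiled))
    return hits / len(messages)


# Hypothetical sample of everyday requests.
ordinary_messages = [
    "What is the capital of France?",
    "Rename this file please",
    "Thanks!",
    "/lobster-values review my plan",
]

# Hypothetical triggers: common words versus an explicit command prefix.
broad_triggers = [r"\bfile\b", r"\bis\b"]
narrow_triggers = [r"^/lobster-values\b"]

broad_rate = trigger_hit_rate(broad_triggers, ordinary_messages)
narrow_rate = trigger_hit_rate(narrow_triggers, ordinary_messages)
```

A high hit rate on messages unrelated to the skill suggests the triggers are too broad; an explicit command prefix fires only when the user asks for it.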

VirusTotal

58/58 vendors flagged this skill as clean.
