Robust Agent Design

Security checks across malware telemetry and agentic risk

Overview

This is a coherent instructional skill for robust agent design. Its example code has hardening issues, but there is no evidence of hidden install behavior or data exfiltration.

Install this only if you want design guidance and sample code for fault-tolerant agents. Before reusing the Python templates in real systems, move state files out of shared /tmp paths, use restrictive permissions, avoid persisting secrets or raw task data, add retention/redaction rules, require explicit approval for real email/SMS/payment actions, and fix the rollback examples and cover them with tests.
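One way to read the "explicit approval" recommendation is a small gate in front of any action with real-world side effects. The sketch below is illustrative only: the action names, the `ApprovalRequired` exception, and the `run_action` helper are hypothetical and do not come from the skill.

```python
from typing import Callable

# Hypothetical names for actions that have real-world side effects.
SIDE_EFFECT_ACTIONS = {"send_email", "send_sms", "charge_payment"}


class ApprovalRequired(Exception):
    """Raised when a side-effecting action is attempted without approval."""


def run_action(name: str, action: Callable[[], str], approved: bool = False) -> str:
    """Run `action`, refusing side-effecting ones unless explicitly approved."""
    if name in SIDE_EFFECT_ACTIONS and not approved:
        raise ApprovalRequired(f"{name} needs explicit approval")
    return action()
```

In this sketch the default is to refuse, so a caller has to pass `approved=True` deliberately for each email, SMS, or payment call.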

SkillSpector

By NVIDIA

Vulnerability Patterns

Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (5)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 95%
Finding: The skill includes explicit file persistence behavior such as `save_state(state_data, self.state_persistence)` and defaults state storage to `file`, which implies file read/write capability despite no declared permissions. In an agent ecosystem, undeclared filesystem access weakens trust boundaries and can lead to unexpected data exposure, tampering, or persistence of sensitive operational state.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The rollback mechanism passes the original forward-action parameters into rollback callbacks rather than rollback-specific identifiers or the action result. This breaks compensation semantics: rollback functions may receive the wrong arguments, fail unexpectedly, or be unable to undo side effects, leaving the system in a partially completed state after an error.
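A minimal sketch of the compensation semantics this finding expects, assuming each step is a pair of a forward action and a compensation: the compensation receives the forward action's result, not its original parameters. The `run_with_compensation` helper and the `Step` alias are hypothetical names for illustration, not the skill's API.

```python
from typing import Any, Callable, List, Tuple

# A step pairs a forward action with a compensation that takes the action's result.
Step = Tuple[Callable[[], Any], Callable[[Any], None]]


def run_with_compensation(steps: List[Step]) -> List[Any]:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Tuple[Callable[[Any], None], Any]] = []
    results: List[Any] = []
    try:
        for action, compensate in steps:
            result = action()
            # Remember the result so rollback gets the real identifier.
            completed.append((compensate, result))
            results.append(result)
    except Exception:
        for compensate, result in reversed(completed):
            compensate(result)
        raise
    return results
```

A step that fails is not compensated here, because it never produced a result; only steps that completed are undone, most recent first.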

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The order-processing example does not propagate outputs between steps, so later actions and compensations lack the real order ID and other execution artifacts required for correct recovery. As written, payment uses a placeholder order ID and rollback depends on variables that are never assigned, making compensation unreliable and potentially leaving orphaned orders or inconsistent inventory state.
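Output propagation between steps could look roughly like the following, assuming each step reads a shared context and returns the values it produced. The step functions, the `Context` alias, and the ID formats are hypothetical placeholders; a real system would obtain these identifiers from its order and payment services.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


def create_order(ctx: Context) -> Context:
    # Hypothetical: a real order service would return this ID.
    return {"order_id": f"order-{ctx['customer']}"}


def charge_payment(ctx: Context) -> Context:
    # Uses the order ID produced by the previous step, not a placeholder.
    return {"payment_ref": f"pay-{ctx['order_id']}"}


def run_pipeline(steps: List[Callable[[Context], Context]], initial: Context) -> Context:
    """Run steps in order, merging each step's outputs into the shared context."""
    ctx = dict(initial)  # copy so the caller's dict is not mutated
    for step in steps:
        ctx.update(step(ctx))
    return ctx
```

Because every output lands in the same context, a compensation for a later failure can read the actual `order_id` instead of relying on a variable that was never assigned.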

Vague Triggers

Medium

Confidence: 87%
Finding: The trigger text is very broad, covering general 'Agent architecture', 'state management', 'retry mechanisms', and 'system robustness improvements', which could cause the skill to activate on many unrelated design requests. Over-broad invocation increases the chance that users receive this skill's patterns in contexts where they are unnecessary, potentially introducing unintended file persistence or recovery logic into systems that did not require them.

Missing User Warnings

Low

Confidence: 88%
Finding: The agent persists state to a file in /tmp and includes task-derived metadata such as agent name, timing, retry details, and a checksum of input data without any disclosure, access control, or secure file handling. Even though only a checksum is stored rather than raw input, writing operational metadata to a world-accessible temporary location can leak workflow details, enable correlation of sensitive tasks, or expose state files to other local users on shared systems.
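As a hedged illustration of safer file handling, the sketch below writes state into a caller-chosen private directory with owner-only permissions and replaces the target atomically. The `save_state_private` function and its parameters are assumptions for this example and are not the skill's `save_state`; the permission behavior described applies to POSIX systems.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state_private(state: dict, state_dir: Path, name: str = "state.json") -> Path:
    """Write `state` as JSON inside an owner-only directory, atomically."""
    state_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    os.chmod(state_dir, 0o700)  # enforce even if the directory already existed
    # mkstemp creates the file readable and writable by the owner only.
    fd, tmp_path = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    target = state_dir / name
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh)
        os.replace(tmp_path, target)  # atomic rename within the same directory
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target
```

The directory would typically live under a per-user application data path, not a shared temporary location, and the state passed in should already be stripped of secrets and raw task data.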

VirusTotal

59/59 vendors flagged this skill as clean.
