Guardrails

Security checks across malware telemetry and agentic risk

Overview

This skill is useful and purpose-aligned, but it needs review because it can collect sensitive workspace context and send it to third-party LLM providers without a clear consent or redaction gate.

Install only if you are comfortable with workspace metadata, selected file contents, memory-log matches, and interview answers being stored locally and potentially sent to OpenAI or Anthropic. Avoid entering secrets, review generated config before keeping it, and use limited API keys or a controlled environment for sensitive workspaces.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (15)

Missing User Warnings

Medium

Confidence: 90%
Finding: The README states that monitor mode checks memory logs and that setup/review generate or update workspace files, but it does not clearly warn users in the usage sections that private conversation history may be inspected and that files in the workspace will be created or modified. This is a real safety/documentation issue because users may invoke the skill without informed consent about privacy-sensitive inspection and persistent changes.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script reads the full contents of GUARDRAILS.md and guardrails-config.json and includes them in its JSON output, which can expose sensitive workspace policy text, secrets, tokens, internal instructions, or other confidential configuration data to downstream consumers. In a discovery/enumeration script, returning complete file contents is broader than necessary and increases the risk of unintended data disclosure, especially because the script also scans a shared workspace and emits structured output that may be logged or forwarded.
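
One way to narrow this output is to report file metadata and a content digest instead of the file text. The following is a minimal sketch, not the skill's actual code: the helper name `summarize_file` and the report layout are assumptions made for illustration. Only the two file names come from the finding above.

```python
import hashlib
import json
from pathlib import Path


def summarize_file(path):
    """Describe a file without returning its contents (hypothetical helper)."""
    p = Path(path)
    if not p.is_file():
        return {"path": str(p), "exists": False}
    data = p.read_bytes()
    return {
        "path": str(p),
        "exists": True,
        "size_bytes": len(data),
        # A digest lets callers detect changes without seeing the text.
        "sha256": hashlib.sha256(data).hexdigest(),
    }


if __name__ == "__main__":
    # File names taken from the finding; the report shape is an assumption.
    report = {
        name: summarize_file(name)
        for name in ("GUARDRAILS.md", "guardrails-config.json")
    }
    print(json.dumps(report, indent=2))
```

A caller that genuinely needs the policy text could then read the file itself, which keeps the full contents out of logs and forwarded output by default.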

Missing User Warnings

Medium

Confidence: 96%
Finding: The script packages discovery data, risk classification, and user interview answers into a prompt and sends them to a third-party LLM service without any explicit consent gate, redaction step, or clear user-facing disclosure. Because the answers may contain sensitive personal, credential, workspace, or policy information, this creates a real confidentiality risk through external data transfer and third-party retention/logging.
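
A redaction step of the kind this finding calls for could look roughly like the sketch below. It is illustrative only: the patterns, the `redact` and `redact_answers` names, and the assumption that interview answers arrive as a question-to-answer mapping are not taken from the skill's code. Pattern matching of this sort catches only obvious secret shapes and does not replace a consent prompt.

```python
import re

# Illustrative patterns only; real secrets take many more forms.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),             # API-key-like strings
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._-]{16,}"),   # bearer tokens
    re.compile(r"(?i)(password|secret|token)\s*[:=]\s*\S+"),
]


def redact(text, placeholder="[REDACTED]"):
    """Replace anything matching a known secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def redact_answers(answers):
    """Redact every answer in a {question: answer} mapping (assumed shape)."""
    return {question: redact(answer) for question, answer in answers.items()}
```

Applying `redact_answers` before the prompt is assembled would keep the most recognizable credential formats out of the request body, while free-form sensitive text would still pass through.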

Missing User Warnings

Medium

Confidence: 92%
Finding: The script sends workspace discovery and risk-classification content to a third-party LLM to generate questions, but the code contains no explicit consent gate, redaction step, or warning before transmission. Because discovery data may include sensitive skill descriptions, integrations, channels, and risk metadata, this creates a real privacy and policy-enforcement gap rather than a mere implementation detail.
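
An explicit consent gate before transmission might be sketched as follows. The function name `confirm_transmission`, its parameters, and the prompt wording are hypothetical, and the skill's scripts may be structured differently. The idea is simply that nothing is sent unless the user answers yes.

```python
import sys


def confirm_transmission(provider, fields, ask=input, out=sys.stderr):
    """Return True only if the user explicitly approves the transfer.

    `ask` is injectable so the gate can be exercised without a terminal.
    """
    print(f"The following will be sent to {provider}:", file=out)
    for name in fields:
        print(f"  - {name}", file=out)
    reply = ask("Send this data? [y/N] ").strip().lower()
    # Anything other than an explicit yes is treated as a refusal.
    return reply in ("y", "yes")
```

A caller would check the return value before invoking the provider call and exit or fall back to an offline path when it is False. Defaulting to "no" means an empty reply never results in a transfer.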

Missing User Warnings

Medium

Confidence: 95%
Finding: This script reads matching memory log lines into the `context` field and then includes the full `violations` array in the JSON report emitted to stdout. Because memory files may contain sensitive prompts, policy decisions, or other internal notes, this creates an unintended data disclosure channel to any caller or downstream logger consuming the report.
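
The disclosure channel described here could be narrowed by stripping or truncating `context` before the report is printed. The sketch below assumes each violation is a dict with a `context` string; the helper name `sanitize_violations` and the `context_preview` and `context_length` fields are invented for illustration.

```python
def sanitize_violations(violations, max_context=0):
    """Return copies of violations with 'context' removed or truncated.

    max_context=0 drops the matched text entirely; a positive value keeps
    only that many leading characters as 'context_preview'.
    """
    cleaned = []
    for violation in violations:
        item = dict(violation)  # copy so the caller's data is untouched
        context = item.pop("context", "")
        if max_context > 0 and context:
            item["context_preview"] = context[:max_context]
        item["context_length"] = len(context)
        cleaned.append(item)
    return cleaned
```

With the default setting, downstream consumers still learn that a match occurred and how long it was, but not what the memory log said.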

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_openai(api_key, model, system_prompt, user_message, timeout=90): resp = requests.post( "https://api.openai.com/v1/chat/completions", headers={ "Authorization": f"Bearer {api_key}",
Confidence: 92%
Finding: requests.post( "https://

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_anthropic(api_key, model, system_prompt, user_message, timeout=90): resp = requests.post( "https://api.anthropic.com/v1/messages", headers={ "x-api-key": api_key,
Confidence: 92%
Finding: requests.post( "https://

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_openai(api_key, model, system_prompt, user_message, timeout=90): resp = requests.post( "https://api.openai.com/v1/chat/completions", headers={ "Authorization": f"Bearer {api_key}",
Confidence: 92%
Finding: requests.post( "https://api.openai.com/v1/chat/completions", headers={ "Authorization": f"Bearer {api_key}", "Content-Type": "application/json", },

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_anthropic(api_key, model, system_prompt, user_message, timeout=90): resp = requests.post( "https://api.anthropic.com/v1/messages", headers={ "x-api-key": api_key,
Confidence: 92%
Finding: requests.post( "https://api.anthropic.com/v1/messages", headers={ "x-api-key": api_key, "Content-Type": "application/json", "anthropic-version": "20

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_openai(api_key, model, system_prompt, user_message, timeout=60): resp = requests.post( "https://api.openai.com/v1/chat/completions", headers={ "Authorization": f"Bearer {api_key}",
Confidence: 84%
Finding: requests.post( "https://

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_anthropic(api_key, model, system_prompt, user_message, timeout=60): resp = requests.post( "https://api.anthropic.com/v1/messages", headers={ "x-api-key": api_key,
Confidence: 84%
Finding: requests.post( "https://

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_openai(api_key, model, system_prompt, user_message, timeout=60): resp = requests.post( "https://api.openai.com/v1/chat/completions", headers={ "Authorization": f"Bearer {api_key}",
Confidence: 84%
Finding: requests.post( "https://api.openai.com/v1/chat/completions", headers={ "Authorization": f"Bearer {api_key}", "Content-Type": "application/json", },

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_anthropic(api_key, model, system_prompt, user_message, timeout=60): resp = requests.post( "https://api.anthropic.com/v1/messages", headers={ "x-api-key": api_key,
Confidence: 84%
Finding: requests.post( "https://api.anthropic.com/v1/messages", headers={ "x-api-key": api_key, "Content-Type": "application/json", "anthropic-version": "20

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_openai(api_key, model, system_prompt, user_message, timeout=60): resp = requests.post( "https://api.openai.com/v1/chat/completions", headers={ "Authorization": f"Bearer {api_key}", "Content-Type": "application/json",
Confidence: 82%
Finding: https://api.openai.com/

External Transmission

Medium

Category: Data Exfiltration
Content: def _call_anthropic(api_key, model, system_prompt, user_message, timeout=60): resp = requests.post( "https://api.anthropic.com/v1/messages", headers={ "x-api-key": api_key, "Content-Type": "application/json",
Confidence: 82%
Finding: https://api.anthropic.com/

VirusTotal

66 of 66 vendors reported this skill as clean.
