Security audit

AutoSpec

Security checks across malware telemetry and agentic risk

Overview

This is a markdown-only spec-writing skill with no install-time execution or hidden runtime behavior, though generated specs and code still need normal security review.

Installers should understand that this skill guides the agent to inspect relevant code and produce specs in chat. Review any generated spec or implementation before adopting it, especially for features involving conversation history, external URLs, network validation, persistence, or sensitive data.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (9)

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The function performs live outbound HEAD requests to user-supplied URLs as part of 'validation', which creates network side effects unrelated to a spec-writing or reverse-spec skill. This can enable SSRF-style access to internal services, leak network metadata, and cause unintended external traffic from the host environment.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The implementation is unrelated to the declared skill purpose and instead processes image URLs and performs remote validation, indicating capability drift from the stated spec-assistant behavior. In this context, hidden or unjustified functionality is more dangerous because users and reviewers would not expect network-touching behavior from a specification helper, increasing the chance of covert misuse or accidental deployment of risky logic.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The function launches concurrent outbound HTTP requests to attacker-controlled URLs for validation, creating network side effects that are unrelated to the stated role of a spec-writing/reverse-spec assistant. This expands the skill's capabilities into live external probing and can enable SSRF-style access to internal services, metadata endpoints, or other network resources if user-supplied URLs are processed.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: This code performs actual HTTP HEAD requests to arbitrary external URLs and treats successful responses as validation, which is an unjustified capability for a spec-authoring/code-understanding skill. In the context of this skill, the mismatch makes the behavior more dangerous because it can be abused to probe internal or sensitive endpoints while appearing as ordinary URL checking logic.

Vague Triggers

Medium

Confidence: 82%
Finding: The trigger phrases are fairly broad, including generic prompts like wanting to think through a feature or asking what code does. This can cause unintended activation, which may steer an agent into the wrong workflow or make it read more repository context than necessary, but it does not by itself create code execution or data exfiltration risk.

Missing User Warnings

Medium

Confidence: 85%
Finding: The spec introduces automatic server-side LLM summarization of prior conversation history, which can cause users' earlier messages—including potentially sensitive content—to be reprocessed by another model path without explicit notice, consent, or documented data-handling controls. In an assistant context, history often contains secrets, personal data, or tool outputs, so silent secondary processing increases privacy and compliance risk even if the feature is intended for optimization.

Missing User Warnings

Medium

Confidence: 88%
Finding: The spec explicitly introduces automatic summarization of prior conversation history once a threshold is exceeded, but it does not mention any user notice, consent, or disclosure. Because chat history can contain sensitive or regulated data, silently transforming and re-injecting it via a summarization model can create privacy, transparency, and compliance risks, especially if users assume their exact prior messages are being preserved.
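One way a downstream implementation could address this finding is to gate summarization on an explicit opt-in. The sketch below is illustrative only: the threshold value, the `Conversation` type, the `summarization_consent` flag, and the `plan_history` function are all hypothetical names, not taken from the audited spec.

```python
from dataclasses import dataclass, field

# Hypothetical threshold; the audited spec's actual value is not shown here.
SUMMARIZE_AFTER_MESSAGES = 20


@dataclass
class Conversation:
    messages: list = field(default_factory=list)
    # Hypothetical opt-in flag, set only after the user has been told that
    # older messages may be condensed by a summarization model.
    summarization_consent: bool = False


def plan_history(conv: Conversation) -> str:
    """Decide how history is handled: 'verbatim', 'ask_consent', or 'summarize'."""
    if len(conv.messages) <= SUMMARIZE_AFTER_MESSAGES:
        return "verbatim"          # under the threshold, nothing is transformed
    if not conv.summarization_consent:
        return "ask_consent"       # surface a notice instead of summarizing silently
    return "summarize"
```

A caller would show the notice when `plan_history` returns `"ask_consent"` and keep sending the untouched history until the user responds.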

Missing User Warnings

Medium

Confidence: 89%
Finding: The spec explicitly validates user-supplied image URLs by issuing outbound HTTP HEAD requests, which can disclose server/network metadata to third parties and may enable server-side request forgery style access to internal or sensitive network locations if implemented naively. Even though this is only a specification, the skill context increases risk because it encourages downstream code generation from the spec, propagating the unsafe behavior without requiring user consent, allowlisting, or privacy/security guardrails.
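The allowlisting guardrail mentioned in this finding could look roughly like the following. This is a minimal sketch under stated assumptions: the host names in `ALLOWED_IMAGE_HOSTS` and the function name `is_allowlisted` are hypothetical, and a real deployment would choose its own list and policy.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts the application actually trusts.
ALLOWED_IMAGE_HOSTS = {"images.example.com", "cdn.example.com"}


def is_allowlisted(url: str) -> bool:
    """Return True only for https URLs whose exact host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL (for example, a broken IPv6 literal)
    if parts.scheme != "https":
        return False
    # .hostname is lowercased and excludes any port or userinfo, so
    # "https://images.example.com@evil.test/" is judged by "evil.test".
    host = parts.hostname
    return host is not None and host in ALLOWED_IMAGE_HOSTS
```

Matching the exact host, rather than a substring or suffix, avoids accepting look-alike names such as `images.example.com.evil.test`.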

Missing User Warnings

Medium

Confidence: 95%
Finding: The spec explicitly instructs the implementation to issue HTTP HEAD requests to user-provided URLs, which creates a server-side request capability over attacker-controlled destinations. Even though this is framed as accessibility validation, it can enable SSRF-style probing of internal services, cloud metadata endpoints, or other restricted network locations if implemented without strict outbound request controls. In the context of an agent skill that may generate code from this spec, the danger is increased because the behavior is likely to be implemented verbatim.
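As an illustration of the outbound request controls this finding refers to, an implementation could resolve the host first and refuse any URL that maps to a non-public address before sending the HEAD request. The function names `is_public_address` and `resolve_and_check` are hypothetical, and this sketch covers only the address check, not the request itself.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(addr: str) -> bool:
    """True if addr is a globally routable IP (not loopback, private, link-local, ...)."""
    return ipaddress.ip_address(addr).is_global


def resolve_and_check(url: str) -> list:
    """Resolve the URL's host; return its addresses, or raise ValueError if any is non-public."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("only http(s) URLs with a host are accepted")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    addrs = sorted({info[4][0] for info in infos})
    blocked = [a for a in addrs if not is_public_address(a)]
    if blocked:
        # Covers loopback, RFC 1918 ranges, and 169.254.169.254-style metadata endpoints.
        raise ValueError(f"refusing non-public address(es): {blocked}")
    return addrs
```

A check like this is only part of the control: the subsequent request should connect to one of the returned addresses rather than resolving the name again, and redirects should be disabled or re-checked, otherwise DNS rebinding or a redirect can still reach an internal target.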

VirusTotal

63/63 vendors flagged this skill as clean.

Static analysis

Detected: suspicious.prompt_injection_instructions

Prompt-injection style instruction pattern detected.

Warn

Code: suspicious.prompt_injection_instructions
Location: evals/iteration-1/reverse-spec-module-level/with_skill/outputs/spec_output.md:94