LLMs.txt Generator

Security checks across malware telemetry and agentic risk

Overview

This skill appears intended to generate llms.txt files, but its URL-to-shell workflow creates a real command-injection risk and its crawling behavior needs tighter disclosure and limits.

Install only if you are comfortable with an agent crawling websites you name and saving generated output locally. Until the command examples are fixed to prevent shell injection, do not pass untrusted or unsanitized URLs. Avoid internal or private URLs, review extracted emails and page text before reuse, and prefer an explicit per-project output path over a shared /tmp file.
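
One way to act on the advice above is to validate the URL and hand it to the crawler as a single argument-vector element, so no shell ever parses it. The following is a minimal sketch, not the skill's actual code: the function names, the `crawl.py` script name, and the `--url` flag are all hypothetical.

```python
import ipaddress
import socket
import sys
from urllib.parse import urlsplit


def validate_public_url(url: str) -> str:
    """Return url unchanged if it is http(s) and resolves only to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror as exc:
        raise ValueError(f"cannot resolve host: {host!r}") from exc
    for info in infos:
        # info[4][0] is the resolved address string for this candidate.
        addr = ipaddress.ip_address(info[4][0])
        if not addr.is_global:
            # Covers loopback, RFC 1918 ranges, link-local metadata endpoints, etc.
            raise ValueError(f"refusing non-public address: {addr}")
    return url


def build_crawl_argv(url: str, script: str = "crawl.py") -> list[str]:
    """Build an argument list; 'crawl.py' and '--url' are assumed names."""
    # The URL stays one list element, so characters such as ';' or '$(...)'
    # are never interpreted by a shell.
    return [sys.executable, script, "--url", validate_public_url(url)]
```

The resulting list would be passed to `subprocess.run(argv, check=True)` without `shell=True`. Note that this check resolves the host once before the crawl; a crawler that resolves the name again later could still be redirected, so the check reduces the risk rather than removing it.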

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (5)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 95%
Finding: The skill clearly performs outbound crawling of arbitrary user-supplied websites, but it does not declare permissions or otherwise surface that network access is required. This creates a transparency and policy-enforcement gap: users and the platform may not realize the skill can fetch external content, increasing the chance of unintended data access or misuse of network capabilities.

Missing User Warnings

Low

Confidence: 82%
Finding: The skill writes generated output to a shared temporary path (`/tmp/llms_final.txt`) without clearly warning the user in the description or addressing lifecycle and access expectations. Temporary files can persist longer than expected or be accessed by other processes in some environments, which is a mild confidentiality/integrity concern even if the content is usually not highly sensitive.
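
A possible alternative to the shared `/tmp/llms_final.txt` path is to write into a directory the user names, using a uniquely named owner-only temporary file that is then moved into place. This is a sketch under assumptions: `write_private_output` and the default file name `llms.txt` are illustrative, not part of the skill.

```python
import os
import tempfile
from pathlib import Path


def write_private_output(text: str, project_dir: str, name: str = "llms.txt") -> Path:
    """Write text to project_dir/name via an owner-only temporary file."""
    out_dir = Path(project_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # mkstemp creates a uniquely named file readable and writable only by
    # the creating user (mode 0o600 on POSIX).
    fd, tmp_name = tempfile.mkstemp(dir=out_dir, prefix=".llms-", suffix=".tmp")
    final = out_dir / name
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
        # os.replace swaps the finished file into place in one step.
        os.replace(tmp_name, final)
    except BaseException:
        # Clean up the partial file if writing or renaming failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return final
```

Because the file lives in a per-project directory, its lifecycle is tied to the project rather than to whatever cleanup policy the system applies to `/tmp`.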

Missing User Warnings

Medium

Confidence: 92%
Finding: The spec explicitly instructs authors to include founder or team email addresses in a publicly served llms.txt file, but does not warn about consent, role-account preference, scraping risk, or privacy implications. Publishing personal contact details in a machine-targeted index makes automated harvesting, spam, phishing, and social-engineering against staff easier, especially because agents are encouraged to use the address for outreach.

Missing User Warnings

Medium

Confidence: 92%
Finding: The crawler extracts email addresses from arbitrary webpages and includes them in the JSON output without any notice, minimization, or consent check. Even if the emails are publicly visible, aggregating and returning them for downstream LLM or agent processing increases privacy exposure and can enable contact harvesting or unintended retention of personal data.
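
A minimization step along the lines this finding suggests could drop or redact addresses before the crawl result reaches the agent. The sketch below is illustrative only: the `emails` and `text` field names are assumptions about the JSON shape, and the regular expression is a deliberately simple approximation rather than a full address parser.

```python
import re

# Simple pattern for common address shapes; not a complete RFC 5322 parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def redact_emails(text: str, placeholder: str = "[email redacted]") -> tuple[str, int]:
    """Replace each address in text; return the new text and the match count."""
    return EMAIL_RE.subn(placeholder, text)


def minimize_page(page: dict, keep_emails: bool = False) -> dict:
    """Return a copy of a page record with addresses removed unless opted in.

    The 'emails' and 'text' keys are hypothetical field names.
    """
    if keep_emails:
        return dict(page)
    cleaned = {k: v for k, v in page.items() if k != "emails"}
    if isinstance(cleaned.get("text"), str):
        cleaned["text"], _ = redact_emails(cleaned["text"])
    return cleaned
```

Making retention an explicit opt-in (`keep_emails=True`) gives the user a point at which to confirm that collecting the addresses is intended.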

Missing User Warnings

Medium

Confidence: 94%
Finding: In deep mode, the script includes substantial raw text from each crawled page in the output, which can capture personal data, confidential business content, legal text, or prompt-injection content from the target site. Returning this bulk content to the agent/LLM broadens data exposure and increases the attack surface for downstream prompt-injection or over-collection risks.
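
One mitigation for the over-collection described here is to cap how much page text is returned and to label it as untrusted data. The following is a rough sketch; the helper names, the 2000-character default, and the `trust` field are assumptions, and labelling alone does not prevent prompt injection.

```python
def cap_page_text(text: str, max_chars: int = 2000) -> str:
    """Collapse whitespace and truncate text to at most max_chars characters."""
    collapsed = " ".join(text.split())
    if len(collapsed) <= max_chars:
        return collapsed
    return collapsed[:max_chars].rstrip() + " [truncated]"


def wrap_untrusted(url: str, text: str, max_chars: int = 2000) -> dict:
    """Package capped page text with an explicit marker that it is site content."""
    return {
        "url": url,
        # Hypothetical field: tells downstream consumers to treat 'text'
        # as data to summarize, never as instructions to follow.
        "trust": "untrusted",
        "text": cap_page_text(text, max_chars),
    }
```

A smaller cap limits both how much personal or confidential content is captured and how much room an injected instruction has to work with.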

VirusTotal

65/65 vendors flagged this skill as clean.
