Crawl From X

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a legitimate X/Twitter crawler, but users should understand it uses a logged-in browser session and saves downloaded posts and media locally.

Install only if you are comfortable letting the tool use an already logged-in X/Twitter browser session to read timelines, send post identifiers to X/Twitter-related APIs, and save Markdown plus media files locally. Treat all crawled posts as untrusted content before feeding them to another AI agent, especially because social posts can contain prompt-injection text.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Excessive AgencyUnrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least PrivilegeUnderdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection

Findings (24)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92% confidence
Finding: The skill documentation shows capabilities to read and write local files, access the network, and invoke shell commands, but it declares no permissions. That creates a transparency and consent problem: users and any permission-gating system cannot accurately assess what the skill will do before installation or execution.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 90% confidence
Finding: The stated purpose is simple X/Twitter post crawling and Markdown export, but the actual behavior extends to downloading media, rewriting content to local paths, and contacting additional third-party domains such as api.fxtwitter.com and Twitter CDN/media hosts. This mismatch is dangerous because it hides broader data collection and outbound network behavior that may expose user data, authenticated browsing context, or downloaded copyrighted/private content beyond what a user expects.

Description-Behavior Mismatch

Medium

Confidence: 90% confidence
Finding: The skill description says it exports Markdown, but the implementation also imports a media downloader, fetches external media, and rewrites the generated Markdown to local file references. This hidden expansion of scope changes local filesystem state and network behavior beyond what a user would reasonably expect from the stated capability.

Context-Inappropriate Capability

Medium

Confidence: 86% confidence
Finding: The code can stop, start, and verify an external browser service through subprocesses as part of routine crawling and error recovery. That gives the skill control over local system services/processes beyond simple data retrieval, increasing operational risk and making the skill more dangerous than its crawler description suggests.

Missing User Warnings

Medium

Confidence: 91% confidence
Finding: The README instructs users to run the crawler through a browser session already authenticated to X, but it does not clearly disclose the security and privacy implications of reusing that session. This can expose the user's account actions, browsing context, and potentially trigger account restrictions or unintended access to private/limited content reachable through the logged-in session.

Natural-Language Policy Violations

High

Confidence: 96% confidence
Finding: The file contains a prompt that explicitly instructs a model to dump all stored memories, prior conversation context, preferences, and personal information in a copyable block. Even though it appears as captured social-media content rather than executable skill logic, this is still dangerous in an agent context because untrusted text may later be surfaced to or reused by an LLM, creating a prompt-injection path to privacy leakage.

Missing User Warnings

Medium

Confidence: 90% confidence
Finding: The file republishes a prompt that explicitly asks an AI system to dump all stored memories, personal details, behavioral instructions, and prior conversational context for transfer to another service. Even though this appears as quoted social-media content rather than executable code, embedding such data-exfiltration instructions in an agent-accessible corpus is risky because downstream agents or users may copy, reuse, or act on it without understanding the privacy consequences.

Missing User Warnings

Medium

Confidence: 95% confidence
Finding: The markdown embeds a prompt that instructs a model to enumerate and export all stored memories, personal information, preferences, and prior conversational context verbatim. Even though this appears to be captured social-media content rather than an executable system instruction, including such text in a skill artifact creates prompt-injection and privacy-leak risk if downstream agents summarize, quote, or act on the file without strong isolation.

Missing User Warnings

Medium

Confidence: 93% confidence
Finding: Media is downloaded and the Markdown content is rewritten without any user-facing confirmation or opt-in. Silent local file creation and content mutation can surprise users, consume storage/bandwidth, and violate least-surprise expectations for a Markdown export tool.

Missing User Warnings

Medium

Confidence: 91% confidence
Finding: The tool sends crawled X/Twitter post URLs and identifiers to third-party services such as api.fxtwitter.com and cdn.syndication.twimg.com to resolve content, but this data sharing is not clearly disclosed. Even if the posts are public, undisclosed transmission of user-targeted crawl data to external parties creates privacy, trust, and compliance concerns.

Ssd 3

Medium

Confidence: 84% confidence
Finding: The content promotes 'implicit memory capture' where conversations are automatically stored without an explicit minimization or consent boundary. In an agent-skill ecosystem, this can normalize collecting sensitive user data, credentials, or proprietary content into long-lived memory stores, increasing privacy, retention, and secondary-use risk.

Ssd 3

Medium

Confidence: 90% confidence
Finding: This section promotes ingesting and persistently remembering everything a user provides across text, documents, images, video, and audio. In the context of an agent skill, broad multimodal persistence materially increases the chance of over-collection of personal, confidential, or regulated data and can create durable exposure if memory is later queried, synced, or compromised.

Ssd 3

High

Confidence: 93% confidence
Finding: The embedded prompt explicitly instructs an AI system to disclose all stored memories, personal data, behavioral preferences, prior instructions, projects, tools, and other retained context in a copyable block. In an agent/skill ecosystem, reproducing or operationalizing such prompts can enable mass exfiltration of sensitive user data and system-level memory contents, especially if downstream components ingest social content as actionable prompts rather than inert text.

Ssd 3

High

Confidence: 98% confidence
Finding: This text directly asks a model to reveal all stored memories and personal context, including user profile details and behavioral preferences, which is a classic data-exfiltration prompt. In this skill's context—a crawler that stores external posts in Markdown—the danger is elevated because third-party content is being ingested wholesale and could be forwarded into later LLM workflows, where the embedded instruction may be followed unless strong isolation exists.

Ssd 3

High

Confidence: 99% confidence
Finding: The embedded prompt explicitly asks a model to dump all stored memories, prior conversation context, personal information, and user instructions into a copyable block. If reused by an agent or surfaced to a model with memory access, it can trigger unauthorized disclosure of sensitive data and cross-session context that should remain private.

Ssd 3

High

Confidence: 94% confidence
Finding: The embedded prompt explicitly instructs an LLM to disclose all stored memories, personal information, behavioral preferences, and prior conversational context. In an agent workflow that ingests scraped markdown, this can act as prompt-injection content and cause unauthorized disclosure of sensitive memory or cross-session data if the model treats the post text as instructions rather than untrusted content.

Ssd 3

High

Confidence: 97% confidence
Finding: The embedded prompt explicitly instructs a model to reveal all stored memories, prior conversation context, personal details, and exact behavioral instructions. If copied into a memory-enabled assistant, it can induce disclosure of sensitive user data and system/user preference history, creating a privacy and prompt-exfiltration risk; in this skill context, harvested social content may be surfaced to downstream agents or users, making accidental reuse more plausible.

Ssd 3

High

Confidence: 94% confidence
Finding: The embedded prompt explicitly instructs an AI system to disclose all stored memories, personal data, prior conversation context, and behavioral preferences. In a skill that crawls and exports untrusted social-media content, preserving such prompt text creates a realistic prompt-injection hazard if downstream agents later ingest the markdown and follow instructions found inside it.

Ssd 3

High

Confidence: 97% confidence
Finding: The file embeds a prompt that explicitly asks a model to dump all stored user memories, personal details, behavioral preferences, and prior conversation context. In a skill that aggregates untrusted social-media content, this is dangerous because downstream agents may ingest and follow the embedded prompt as instructions, causing prompt injection and privacy-sensitive data exfiltration.

Ssd 3

High

Confidence: 96% confidence
Finding: This embedded prompt directly instructs an AI to reveal all stored memories, personal information, prior instructions, and full conversation-derived context, which is effectively a data exfiltration template. In a skill that crawls and stores arbitrary social content, preserving such prompts without isolation increases the chance that another agent, automation layer, or user will treat the text as actionable and disclose sensitive information beyond intended boundaries.

Ssd 3

High

Confidence: 95% confidence
Finding: The embedded text contains a ready-to-use prompt that instructs an AI system to disclose all stored memories, personal information, prior context, preferences, and behavioral instructions in full. In a skill that crawls and republishes social posts, preserving such prompt text verbatim can become dangerous if downstream agents ingest the markdown and follow embedded instructions, potentially causing privacy leakage or cross-context prompt injection.

Ssd 3

High

Confidence: 98% confidence
Finding: This content explicitly asks an AI system to dump all remembered user data, including personal details, goals, tools, and behavioral preferences, with minimal filtering and in copyable form. In a skill that crawls and stores external posts, such embedded instructions are especially dangerous because they may be ingested by another agent or workflow and could trigger unauthorized disclosure of sensitive conversation memory.

Ssd 3

High

Confidence: 95% confidence
Finding: The embedded prompt explicitly instructs a model to reveal all stored memories, user preferences, personal information, and prior conversational context verbatim for export. If reused by an agent or copied into another model session, it can induce disclosure of sensitive data far beyond the user's immediate request, creating a prompt-injection-style data exfiltration risk. The skill context makes this more dangerous because the tool ingests untrusted social-media content, so hostile or privacy-invasive prompts can be captured and later surfaced to downstream users or models.

Ssd 3

High

Confidence: 93% confidence
Finding: The embedded text contains a prompt explicitly asking a model to dump all stored memories, user context, personal data, behavioral preferences, and prior instructions in full. In a skill that crawls and republishes third-party posts, preserving and surfacing such prompt text creates a prompt-injection and sensitive-data-exfiltration risk if any downstream agent treats scraped content as instructions rather than untrusted data.

VirusTotal

66/66 vendors flagged this skill as clean.

View on VirusTotal