X Article Extract

Security checks across malware telemetry and agentic risk

Overview

This skill is an X/Twitter content extractor whose sensitive behaviors are disclosed. Its use of X session cookies, Firecrawl, and optional content-library ingestion is expected for its purpose but still sensitive.

Install this only if you are comfortable letting it use your xreach/X login session to retrieve X content, send external t.co targets to Firecrawl when an API key is configured, and save extracted content into your content factory when you explicitly use --ingest. Use a dedicated or low-privilege X account for sensitive workflows.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (13)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: } try: proc = subprocess.run( ["python3", str(bitable_script), "create", "--table", "MaterialInbox", "--fields", json.dumps(fields, ensure_ascii=False)],
Confidence: 82%
Finding: proc = subprocess.run( ["python3", str(bitable_script), "create", "--table", "MaterialInbox", "--fields", json.dumps(fields, ensure_ascii=False)],
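
The flagged call can be made more defensive. The following is a minimal sketch, not the skill's actual code: the names `bitable_script` and `fields` and the `create --table MaterialInbox --fields` arguments come from the excerpt above, while `build_ingest_command`, `run_ingest_script`, `allowed_root`, and the 30-second timeout are hypothetical choices made for illustration.

```python
import json
import subprocess
import sys
from pathlib import Path


def build_ingest_command(bitable_script: Path, fields: dict, allowed_root: Path) -> list[str]:
    """Build the argv list for the ingest script after validating its path."""
    script = bitable_script.resolve()
    root = allowed_root.resolve()
    # Refuse scripts that resolve outside the expected directory (symlinks, "..").
    if not script.is_relative_to(root):
        raise ValueError(f"{script} is outside {root}")
    if not script.is_file():
        raise FileNotFoundError(script)
    # A list argv (no shell=True) keeps the JSON payload away from shell parsing.
    return [
        sys.executable, str(script), "create",
        "--table", "MaterialInbox",
        "--fields", json.dumps(fields, ensure_ascii=False),
    ]


def run_ingest_script(cmd: list[str], timeout_s: float = 30.0) -> subprocess.CompletedProcess:
    """Run the command; raise on a non-zero exit code or on timeout."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=timeout_s)
```

Using `sys.executable` instead of a bare `python3` avoids depending on whatever interpreter happens to be first on `PATH`. The path check and the timeout do not remove the risk the scanner flags; they only narrow it.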

Lp3

Medium

Category: MCP Least Privilege
Confidence: 96%
Finding: The skill declares no permissions while its documented behavior clearly requires shell execution, network access, and environment-variable use. This matters because users and calling systems cannot accurately assess what the skill will access, increasing the risk of unexpected outbound requests, secret use, or command execution without informed approval.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 94%
Finding: The documented purpose is content extraction, but the skill also performs materially different actions: reading a local authenticated session file and optionally writing extracted data into a content-management system. These extra behaviors expand the trust boundary from passive retrieval to credential use and persistent data modification, which can surprise users and enable unintended data disclosure or unauthorized writes.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The skill advertises extraction, but it also contains optional logic to write extracted data into an external material repository. That hidden expansion of scope increases privacy and integrity risk because users may invoke a content-extraction tool without expecting persistence into another system.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The code reaches into another workspace skill and executes its script directly, creating an unjustified trust bridge between components. In a multi-skill environment, this can be abused for unauthorized data propagation, surprising side effects, or execution of a replaced/tampered script at that filesystem path.
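
One common mitigation for the replaced or tampered script scenario is to pin the expected SHA-256 digest of the external script and verify it before execution. The sketch below illustrates that idea only; `sha256_of`, `verify_pinned_script`, and the notion of a pinned digest stored alongside the skill are assumptions, not something the scanned skill is known to do.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned_script(path: Path, expected_sha256: str) -> None:
    """Raise RuntimeError if the script's digest differs from the pinned value."""
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        raise RuntimeError(
            f"{path} does not match the pinned digest; refusing to execute it"
        )
```

A check like this still leaves a window between verification and execution in which the file could change, so it reduces the risk rather than eliminating it.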

Missing User Warnings

Medium

Confidence: 91%
Finding: The README instructs users to authenticate via `xreach auth extract --cookie-source chrome` and later states that X Article extraction uses Playwright plus the xreach auth cookie, but it does not clearly warn that the skill will use authenticated X session cookies to access and scrape content. This matters because users may unknowingly grant the skill access under their logged-in identity, increasing privacy, account misuse, and policy/compliance risk if the cookie is mishandled or the automation behaves unexpectedly.

Missing User Warnings

Medium

Confidence: 94%
Finding: The README says external links shared via `t.co` are processed using the Firecrawl API, but it does not clearly warn that resolved third-party URLs and their contents may be sent to an external service for extraction. That omission can cause unintentional disclosure of sensitive URLs, internal resources, or user browsing targets to a third party, especially when agents process links automatically.

Vague Triggers

Medium

Confidence: 85%
Finding: The trigger conditions are broad enough to match ordinary requests about X links, which increases the chance the skill runs automatically when the user did not intend network retrieval, browser automation, or use of stored authenticated sessions. In this context, over-triggering is more dangerous because the skill may access third-party content, use cookies, and potentially perform downstream ingestion.

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill advertises an --ingest mode that writes extracted content into a content repository, but the documentation does not prominently warn that this is a persistent data-modifying action. Without clear warning and confirmation, a user may invoke it expecting read-only extraction and instead cause unintended storage, duplication, or disclosure of content into shared systems.
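
A confirmation gate in front of the write path is one way to address this. The following sketch assumes a command-line entry point built with `argparse`; only the `--ingest` flag comes from the report, while the `--yes` flag, the prompt wording, and the function names are hypothetical.

```python
import argparse
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch: extraction with opt-in ingestion")
    parser.add_argument("url", help="URL of the post or article to extract")
    parser.add_argument(
        "--ingest", action="store_true",
        help="persist extracted content to the external store (this writes data)",
    )
    parser.add_argument(
        "--yes", action="store_true",
        help="skip the interactive confirmation for --ingest (hypothetical flag)",
    )
    return parser


def ingest_confirmed(args: argparse.Namespace, ask: Callable[[str], str] = input) -> bool:
    """Return True only when the user asked for ingestion and confirmed it."""
    if not args.ingest:
        return False  # read-only extraction stays the default
    if args.yes:
        return True   # explicit non-interactive opt-in
    answer = ask("This will write the extracted content to an external store. Continue? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

Passing the prompt function as a parameter keeps the gate testable and lets an agent runtime substitute its own approval step for `input`.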

Missing User Warnings

Low

Confidence: 84%
Finding: The skill fetches external webpages via resolved t.co links and uses the Firecrawl API, but the description does not clearly warn that URLs and page content may be transmitted to external services. This is a transparency and privacy issue: users may not realize third-party requests and data transfer occur beyond X itself.

Missing User Warnings

Medium

Confidence: 97%
Finding: The script silently reads local X session credentials from disk and uses them for authenticated browser scraping. In a skill context, that is dangerous because it leverages preexisting user secrets without a clear prompt, enabling access to content the caller may not realize is being fetched under their account.

Missing User Warnings

Medium

Confidence: 95%
Finding: When resolving a t.co link to an external page, the tool sends the target URL to Firecrawl for scraping without prominently warning the user. In practice, this discloses browsing targets and may transmit private, internal, or sensitive links to a third-party service outside the local environment.
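
A pre-flight check can keep obviously internal targets from leaving the local environment. The sketch below is an illustration under stated assumptions: the function name `looks_internal` and the list of internal-looking suffixes are invented for this example, and no Firecrawl call is shown because its API is outside what this report describes.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical suffixes that often indicate non-public hosts; adjust per environment.
INTERNAL_SUFFIXES = (".local", ".internal", ".lan", ".corp")


def looks_internal(url: str) -> bool:
    """Heuristically decide whether a URL points at a private or internal host."""
    host = urlparse(url).hostname
    if host is None:
        return True  # unparseable target: do not forward it
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: fall back to name-based heuristics (no DNS lookup here).
        return host == "localhost" or "." not in host or host.endswith(INTERNAL_SUFFIXES)
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved
```

A caller would skip the third-party scrape, or ask the user first, whenever `looks_internal(resolved_url)` returns True. Because the function does not resolve DNS, a public-looking name that maps to a private address would still pass, so this is a first filter rather than a complete control.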

Missing User Warnings

Medium

Confidence: 95%
Finding: Ingest mode persists extracted content into an external material store, but the code provides no strong user-facing warning about retention, downstream access, or permanence. This is risky because tweets, article text, and scraped external content may include sensitive or copyrighted material that users expected only to inspect locally.

VirusTotal

66/66 vendors flagged this skill as clean.
