Security audit

Visible Text Extractor

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real text/OCR extraction skill, but it can fetch arbitrary web content and optionally send generated documents to Feishu with limited safeguards.

Review this skill before installing if you work with private pages or sensitive documents. Use it only on URLs and files you are authorized to process, avoid internal or credentialed pages unless isolated, review generated outputs before sharing, and use the Feishu send option only when you explicitly intend to transfer the DOCX outside the local environment.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (12)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
        return {'ok': False, 'text': '', 'error': 'high ocr unavailable'}
    out = Path(tempfile.mktemp(prefix='two-pass-high-', suffix='.json'))
    try:
        proc = subprocess.run(
            [str(HIGH_OCR_PY), str(HIGH_OCR), str(image_path), '--output', str(out)],
            capture_output=True, text=True, timeout=90
        )
Confidence: 78%
Finding: proc = subprocess.run( [str(HIGH_OCR_PY), str(HIGH_OCR), str(image_path), '--output', str(out)], capture_output=True, text=True, timeout=90 )

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
        return ''
    out = Path(tempfile.mktemp(prefix='ordered-clean-high-', suffix='.json'))
    try:
        proc = subprocess.run(
            [str(HIGH_OCR_PY), str(HIGH_OCR), str(image_path), '--output', str(out)],
            capture_output=True, text=True, timeout=90
        )
Confidence: 84%
Finding: proc = subprocess.run( [str(HIGH_OCR_PY), str(HIGH_OCR), str(image_path), '--output', str(out)], capture_output=True, text=True, timeout=90 )
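
Both quoted snippets build their output path with tempfile.mktemp, which only returns a name and leaves a window in which another process can create that file first. The following is a minimal sketch of the same call pattern using tempfile.mkstemp instead. The names run_ocr_helper and cmd_prefix are hypothetical stand-ins for the skill's HIGH_OCR_PY / HIGH_OCR pair, and the return shape is modelled on the dictionary visible in the first snippet; the skill's actual error handling is not shown in the findings.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def run_ocr_helper(cmd_prefix, image_path, timeout=90):
    """Run a helper that writes its result to --output, via a safely created temp file.

    cmd_prefix is a hypothetical list such as [interpreter, script].
    """
    # mkstemp creates the file atomically with owner-only permissions,
    # whereas tempfile.mktemp merely returns an unused-looking name.
    fd, out_name = tempfile.mkstemp(prefix="ocr-", suffix=".json")
    os.close(fd)
    out = Path(out_name)
    try:
        # Argument list, no shell: image_path is never interpreted by a shell.
        proc = subprocess.run(
            [*cmd_prefix, str(image_path), "--output", str(out)],
            capture_output=True, text=True, timeout=timeout,
        )
        if proc.returncode != 0:
            return {"ok": False, "text": "", "error": proc.stderr.strip()}
        return {"ok": True, "text": out.read_text(encoding="utf-8"), "error": ""}
    except subprocess.TimeoutExpired:
        return {"ok": False, "text": "", "error": "timeout"}
    finally:
        # Always remove the temp file, whether the helper succeeded or not.
        out.unlink(missing_ok=True)
```

Passing the command as a list, as the original code already does, keeps a hostile file name from being parsed as shell syntax; the sketch changes only how the temporary file is created and cleaned up.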

Lp3

Medium

Category: MCP Least Privilege
Confidence: 94%
Finding: The skill documentation advertises and operationalizes network access, shell execution, file reads/writes, browser automation, and environment-dependent behavior, but it does not declare permissions or boundaries for those capabilities. That creates a trust and governance gap: an agent may invoke broad side effects such as fetching arbitrary URLs, writing deliverables, invoking ffmpeg/playwright/python scripts, or using external tooling without explicit user-visible authorization controls.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 88%
Finding: The skill's stated purpose sounds like passive text extraction, but the documented behavior expands into browser-driven capture, remote image downloading, DOCX generation, and possible transmission of generated files to Feishu. This mismatch is security-relevant because users or orchestration systems may approve the skill under a narrower trust model than its real capabilities warrant, enabling unexpected data exfiltration, broader content collection, and side effects on local or remote systems.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The script includes a Feishu file-sending integration that is outside the core stated purpose of visible text extraction and document reconstruction. This creates an outbound exfiltration path for captured webpage content, screenshots, and reconstructed documents, which is especially sensitive because the skill handles user-supplied URLs and generated artifacts.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Exposing --send-feishu-receive-id lets the caller direct generated documents to an arbitrary recipient, extending the skill from local extraction into recipient-targeted external delivery. That increases the risk of intentional or accidental data exfiltration because the content may include scraped page text, screenshots, and OCR output from potentially sensitive sources.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: At this point the script transitions from content extraction and document generation into external delivery by automatically invoking the Feishu sender when a receive ID is provided. In context, this makes the skill more dangerous because it processes and packages harvested content, then immediately provides a built-in channel to export it.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The script can optionally send the generated DOCX to a Feishu recipient, introducing outbound data transfer capability beyond local text extraction. Because the document contains extracted page text and OCR from downloaded images, misuse or accidental invocation could exfiltrate potentially sensitive content to an arbitrary receive-id.

Missing User Warnings

Medium

Confidence: 90%
Finding: The specification explicitly instructs the skill to fetch/render pages and download every discovered image, but it does not require user consent, scope restrictions, or warnings about external network access. In an agent setting, this can cause unexpected requests to third-party hosts, leak sensitive target URLs or internal resources, and expand data collection beyond what the user reasonably expects.

Missing User Warnings

Medium

Confidence: 91%
Finding: The DOCX can be transmitted to Feishu with no user-facing warning in this script that an outbound transfer is about to occur. Because the generated file may contain captured webpage text, OCR from images, and screenshots, silent transfer can cause privacy, confidentiality, or compliance violations.
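
One way to address the missing warning is an explicit confirmation step before any outbound transfer. The sketch below is illustrative only: confirm_feishu_send and its parameters are hypothetical names, not part of the audited skill, and it assumes the caller performs the upload only when the function returns True.

```python
def confirm_feishu_send(docx_path, receive_id, prompt=input):
    """Return True only if a recipient is set and the user explicitly agrees."""
    if not receive_id:
        # No recipient given: nothing leaves the local environment.
        return False
    message = (
        f"About to send {docx_path} to Feishu recipient {receive_id}. "
        "The file may contain extracted page text, OCR output, and screenshots. "
        "Type 'yes' to continue: "
    )
    # Anything other than an explicit 'yes' is treated as a refusal.
    return prompt(message).strip().lower() == "yes"
```

The prompt parameter defaults to input for interactive use and can be replaced with a callback in an agent setting, so the decision to transfer is always surfaced to someone who can refuse it.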

Missing User Warnings

Medium

Confidence: 93%
Finding: The script fetches arbitrary user-supplied URLs and then downloads image URLs parsed from the retrieved HTML, which can expose the host running the skill to server-side request forgery and unintended network access. In this skill context, that is more dangerous because the tool is explicitly designed to retrieve remote webpages and embedded media, so attacker-controlled content can cause follow-on requests to internal services or sensitive endpoints without clear restrictions.
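
A common mitigation for this class of issue is to resolve the target host and refuse anything that is not a globally routable address. The following is a rough sketch under that assumption; is_public_ip and is_safe_fetch_target are hypothetical helpers, not functions from the audited skill, and the resolve parameter exists only so the check can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_ip(address):
    """True only for globally routable addresses (not loopback, private, link-local)."""
    try:
        return ipaddress.ip_address(address).is_global
    except ValueError:
        # Unparseable address strings are rejected rather than trusted.
        return False


def is_safe_fetch_target(url, resolve=socket.getaddrinfo):
    """Allow http(s) URLs whose host resolves exclusively to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        port = parts.port or (443 if parts.scheme == "https" else 80)
        infos = resolve(parts.hostname, port, type=socket.SOCK_STREAM)
    except (OSError, ValueError):
        # Resolution failures and malformed ports both count as unsafe.
        return False
    # Every resolved address must be public; one private record is enough to refuse.
    return bool(infos) and all(is_public_ip(info[4][0]) for info in infos)
```

A check like this would need to run on the page URL and again on every image URL parsed from the HTML. It does not by itself prevent DNS rebinding, since the host may resolve differently at fetch time; connecting to the address that was validated, or isolating the skill at the network level, would be needed for that.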

Missing User Warnings

Medium

Confidence: 85%
Finding: The script fetches arbitrary remote image URLs from input JSON and writes the downloaded content to disk without any allowlist, size checks, or network-target restrictions. This creates an SSRF-style risk in agent environments: an attacker who controls the input can cause the agent to make outbound requests to internal services or download large/unexpected content, potentially exposing internal network reachability or consuming resources. In this skill context, the risk is elevated because processing untrusted webpage/article/image inputs is the core feature, so attacker-controlled URLs are expected rather than exceptional.
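
The allowlist and size check that this finding notes as absent could look roughly like the sketch below. host_allowed and read_capped are hypothetical helpers written for illustration; the host set and the byte limit are placeholders that a deployment would choose for itself.

```python
from urllib.parse import urlsplit


def host_allowed(url, allowed_hosts):
    """True if the URL is http(s) and its host is in the explicit allowlist."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme in ("http", "https") and host in allowed_hosts


def read_capped(stream, max_bytes, chunk_size=64 * 1024):
    """Read a binary stream fully, raising ValueError once it exceeds max_bytes."""
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(buf)
        buf.extend(chunk)
        if len(buf) > max_bytes:
            # Stop early instead of buffering an arbitrarily large response.
            raise ValueError(f"response exceeds {max_bytes} bytes")
```

read_capped accepts any object with a read method, so it can wrap an HTTP response body before anything is written to disk; the cap is enforced on bytes actually received rather than on a Content-Length header, which a server can omit or misreport.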

VirusTotal

61/61 vendors flagged this skill as clean.
