Baidu Document Parsing (pipeline-parser)

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a straightforward Baidu document-parsing integration, with expected third-party document upload risks but no hidden or destructive behavior found.

Install only if you are comfortable sending the documents you choose to Baidu’s document parsing service. Avoid using it for confidential, regulated, or secret material unless your policy allows Baidu processing; use a dedicated limited-quota API key; keep credentials out of source control; and treat returned markdown_url and parse_result_url links as private for their 30-day lifetime.
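One way to keep the credential out of source control, as recommended above, is to read it from the environment at run time. The following is a minimal sketch, not the skill's actual code: the variable name `BAIDU_DOC_PARSER_API_KEY` and the helper `load_api_key` are hypothetical names chosen for illustration.

```python
import os


def load_api_key(env_var="BAIDU_DOC_PARSER_API_KEY", environ=None):
    """Read the API key from the environment instead of hard-coding it.

    `env_var` is a hypothetical variable name; use whatever name your
    deployment defines. `environ` can be overridden for testing.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        # Fail early so a missing key is never replaced by a literal in code.
        raise RuntimeError(
            f"Set {env_var} in the environment; do not hard-code the key."
        )
    return key
```

Pairing this with a dedicated, limited-quota key means a leaked value can be revoked without affecting other services.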

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Tainted flow: 'parse_result_url' from requests.post (line 183, network input) → requests.get (network output)

Medium

Category: Data Flow
Content:
    if download_result:
        parse_result_url = result.get('result', {}).get('parse_result_url')
        if parse_result_url:
            parse_response = requests.get(parse_result_url)
            parse_response.encoding = 'utf-8'
            result['parse_result'] = parse_response.json()
    return result
Confidence: 95%
Finding: parse_response = requests.get(parse_result_url)
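A common mitigation for this kind of flow is to validate the URL returned by the API before fetching it. The sketch below assumes an allowlist of result hosts; `ALLOWED_RESULT_HOSTS`, its placeholder entry, and `is_trusted_result_url` are hypothetical and not part of the skill. The real hosts would have to be taken from the service's own documentation.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: replace with the hosts the parsing service
# actually uses to serve result files.
ALLOWED_RESULT_HOSTS = {"results.example.com"}


def is_trusted_result_url(url, allowed_hosts=ALLOWED_RESULT_HOSTS):
    """Return True only for https URLs whose host is on the allowlist."""
    if not isinstance(url, str):
        return False
    try:
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    return parsed.scheme == "https" and host in allowed_hosts
```

In the flagged code, the `requests.get` call would then run only when `is_trusted_result_url(parse_result_url)` returns `True`, ideally with a `timeout` argument so a slow or hostile server cannot stall the caller.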

Vague Triggers

Medium

Confidence: 83%
Finding: The trigger words and usage descriptions are broad enough that the skill may activate for generic requests about documents, OCR, text extraction, or analysis. Over-broad triggering can cause unintended invocation and accidental transmission of sensitive files or document contents to the external service.

Missing User Warnings

High

Confidence: 98%
Finding: The skill documentation describes document parsing features but does not clearly warn users that uploaded document contents, extracted text, tables, images, and metadata will be sent to a third-party Baidu API and may be stored behind result URLs for up to 30 days. This creates a substantial privacy and data-governance risk, especially for sensitive, regulated, or confidential documents.

Missing User Warnings

Medium

Confidence: 92%
Finding: The documentation explicitly instructs users to send full document contents to Baidu's external parsing API via either base64 upload or remote URL, but it does not warn about privacy, confidentiality, or data-handling implications. In a document-processing skill, users may upload sensitive contracts, IDs, internal reports, or regulated data, so omitting a clear disclosure meaningfully increases the risk of unintended third-party data exposure.
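The missing disclosure described in these findings could be addressed in code as well as in documentation, by asking for explicit consent before any upload. This is a minimal sketch under that assumption; `confirm_upload` and its prompt wording are hypothetical and do not come from the skill.

```python
def confirm_upload(path, ask=input):
    """Ask for explicit consent before a document leaves the machine.

    `ask` defaults to the built-in input() and can be swapped out in tests
    or replaced by an agent-side confirmation hook.
    """
    prompt = (
        f"'{path}' will be sent to Baidu's document parsing service, and "
        "the results may remain reachable via result URLs for up to 30 days. "
        "Continue? [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    return ask(prompt).strip().lower() in {"y", "yes"}
```

Defaulting to "no" keeps an accidental Enter keypress, or an over-broad trigger, from transmitting a sensitive file.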

VirusTotal

66/66 vendors flagged this skill as clean.