Baidu Document Parsing pipeline-parser

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a straightforward Baidu document-parsing integration, with expected third-party document upload risks but no hidden or destructive behavior found.

Install only if you are comfortable sending the documents you choose to Baidu’s document parsing service. Avoid using it for confidential, regulated, or secret material unless your policy allows Baidu processing; use a dedicated limited-quota API key; keep credentials out of source control; and treat returned markdown_url and parse_result_url links as private for their 30-day lifetime.
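One way to keep the credential out of source control is to read it from the environment at run time. The sketch below is illustrative only: the variable name `BAIDU_DOC_PARSER_API_KEY` and the helper `load_api_key` are hypothetical and do not come from the skill itself.

```python
import os


def load_api_key(env_var="BAIDU_DOC_PARSER_API_KEY", environ=None):
    """Read the API key from the environment instead of hard-coding it.

    `env_var` is a hypothetical name chosen for this sketch; use whatever
    name your deployment defines. `environ` can be injected for testing.
    """
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        # Fail early rather than sending an unauthenticated request.
        raise RuntimeError(f"{env_var} is not set; refusing to continue")
    return key
```

Pairing this with a dedicated, limited-quota key means a leaked value has a bounded blast radius.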

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (4)

Tainted flow: 'parse_result_url' from requests.post (line 183, network input) → requests.get (network output)

Medium
Category
Data Flow
Content
if download_result:
    parse_result_url = result.get('result', {}).get('parse_result_url')
    if parse_result_url:
        parse_response = requests.get(parse_result_url)
        parse_response.encoding = 'utf-8'
        result['parse_result'] = parse_response.json()
return result
Confidence
95% confidence
Finding
parse_response = requests.get(parse_result_url)
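A possible mitigation for this flow is to check the server-supplied URL before fetching it. The following is a minimal sketch, not the skill's actual code: the helper `is_trusted_result_url` is hypothetical, and the host suffixes in `ALLOWED_HOST_SUFFIXES` are assumptions that should be replaced with the hosts your Baidu account really returns.

```python
from urllib.parse import urlparse

# Assumption: host suffixes that result URLs are expected to use.
# These values are placeholders for illustration, not confirmed by the skill.
ALLOWED_HOST_SUFFIXES = (".bcebos.com", ".baidubce.com")


def is_trusted_result_url(url, allowed_suffixes=ALLOWED_HOST_SUFFIXES):
    """Return True only for HTTPS URLs whose host ends with an allowed suffix."""
    if not isinstance(url, str):
        return False
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse raises ValueError on malformed netlocs (e.g. bad IPv6).
        return False
    if parsed.scheme != "https":
        return False
    host = (parsed.hostname or "").lower()
    return any(host.endswith(suffix) for suffix in allowed_suffixes)
```

The caller would then fetch only when the check passes, for example `if is_trusted_result_url(parse_result_url): requests.get(parse_result_url, timeout=30)`, so that a tampered response cannot redirect the request to an arbitrary host and a slow server cannot hang the call indefinitely.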

Vague Triggers

Medium
Confidence
83% confidence
Finding
The trigger words and usage descriptions are broad enough that the skill may activate for generic requests about documents, OCR, text extraction, or analysis. Over-broad triggering can cause unintended invocation and accidental transmission of sensitive files or document contents to the external service.

Missing User Warnings

High
Confidence
98% confidence
Finding
The skill documentation describes document parsing features but does not clearly warn users that uploaded document contents, extracted text, tables, images, and metadata will be sent to a third-party Baidu API and may be stored behind result URLs for up to 30 days. This creates a substantial privacy and data-governance risk, especially for sensitive, regulated, or confidential documents.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The documentation explicitly instructs users to send full document contents to Baidu's external parsing API via either base64 upload or remote URL, but it does not warn about privacy, confidentiality, or data-handling implications. In a document-processing skill, users may upload sensitive contracts, IDs, internal reports, or regulated data, so omitting a clear disclosure meaningfully increases the risk of unintended third-party data exposure.
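Until the documentation carries such a warning, a wrapper can surface it at the point of upload. This is a rough sketch under stated assumptions: `confirm_upload` and the `WARNING` text are hypothetical and not part of the skill; the 30-day figure is taken from the findings above.

```python
# Hypothetical warning text for this sketch; adapt the wording to your policy.
WARNING = (
    "This will send the full contents of {path} to Baidu's document parsing "
    "service. Result links may remain retrievable for up to 30 days."
)


def confirm_upload(path, ask=input):
    """Return True only if the user explicitly answers 'yes'.

    `ask` defaults to the built-in input() and can be replaced in tests
    or in non-interactive front ends.
    """
    answer = ask(WARNING.format(path=path) + " Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"
```

Requiring the literal word "yes" rather than accepting any non-empty reply makes an accidental confirmation less likely.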

VirusTotal

66/66 vendors flagged this skill as clean.
