Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

xmind-doc-parser

v1.0.1

Parse documents in 18+ formats using Baidu API to extract text, tables, layout, OCR scanned images, and produce document chunks for RAG.

Security Scan
VirusTotal: Pending
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The code and SKILL.md implement a Baidu Document Parser client and this matches the skill description. However the registry metadata claims no required environment variables or config paths, while SKILL.md and references clearly require BAIDU_DOC_AI_API_KEY and BAIDU_DOC_AI_SECRET_KEY and suggest editing ~/.openclaw/openclaw.json. That mismatch is incoherent and should be corrected.
Instruction Scope
Runtime instructions are focused on document parsing and polling Baidu's APIs (expected). But ancillary documentation instructs editing the global OpenClaw config file (~/.openclaw/openclaw.json) and restarting the gateway to inject credentials — this references a system path outside the skill's declared scope and effectively centralizes credentials for other skills, increasing blast radius.
Install Mechanism
There is no install spec or remote download; the skill is instruction-only with an included Python script. No external archives or unknown URLs are fetched by the installer. The client uses standard requests to call Baidu endpoints (expected).
Credentials
The SKILL.md (and the Python client) require BAIDU_DOC_AI_API_KEY and BAIDU_DOC_AI_SECRET_KEY which are proportionate to calling Baidu's API. However the registry metadata declares 'required env vars: none' and 'required config paths: none' — a clear inconsistency. Also references encourage placing these secrets into a global OpenClaw config, which would expose them to other skills.
Persistence & Privilege
The skill does not request always:true and does not modify other skills. However references instruct the operator to place API keys into a shared ~/.openclaw/openclaw.json and restart the gateway; that is a form of persistent credential placement (administrative action) that could increase exposure if other skills or users can read that file.
What to consider before installing
  • Metadata mismatch: The registry says no env vars required, but SKILL.md and the included script require BAIDU_DOC_AI_API_KEY and BAIDU_DOC_AI_SECRET_KEY. Ask the author to fix the metadata or update the registry entry before trusting the skill.
  • Credentials: The skill needs your Baidu API key/secret (reasonable for this purpose). Avoid placing these secrets in a global config if you care about limiting access: do not paste keys into ~/.openclaw/openclaw.json unless you accept that other skills or users with access to that file might use them. If you must store keys, restrict file permissions (e.g., chmod 600) and consider a per-skill or per-agent secret store.
  • Data exfil/privacy: The skill sends documents (base64 or public URLs) to Baidu's cloud endpoints. Do not send sensitive, confidential, or internal-only documents or internal URLs. If you pass file_url, be aware the remote service will fetch that URL (potentially exposing internal endpoints to Baidu).
  • Operational limits: The skill documents file-size, QPS and polling limits. Confirm these match your expected usage and billing/quotas in your Baidu account.
  • Verify behavior: Review the included script (it calls only Baidu endpoints and has no obfuscated code). Test with non-sensitive sample files and a limited/ephemeral API key to confirm behavior before using real data.
  • Remediation suggestions: Ask the maintainer to update registry metadata to declare required env vars and any config path usage; prefer guidance for using per-skill secrets rather than editing a global openclaw.json; add explicit warnings about sending sensitive data to a third-party cloud.

Given the clear metadata/instruction mismatch and the recommendation to store credentials globally, treat this skill cautiously (suspicious) rather than outright malicious, but require fixes or mitigations before trusting it with sensitive data.
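
The permission advice above (chmod 600) can be checked from Python. This is a minimal sketch, assuming a POSIX system; the helper names `is_owner_only` and `restrict_to_owner` are hypothetical and not part of the skill.

```python
import os
import stat

def is_owner_only(mode):
    """True if no group/other permission bits are set in a file mode."""
    return stat.S_IMODE(mode) & 0o077 == 0

def restrict_to_owner(path):
    """Apply the equivalent of `chmod 600` and return the resulting mode bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)
```

A caller could run `is_owner_only(os.stat(path).st_mode)` against the file that holds the keys and call `restrict_to_owner(path)` when it returns False.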


latestvk977e7phmp70hb5wq2f1d89zms83jv6s
104 downloads
1 star
2 versions
Updated 3w ago
v1.0.1
MIT-0

Baidu Document Parser Skill

Parse documents using Baidu Intelligent Document Analysis Platform API.

Overview

This skill provides document parsing capabilities through Baidu's Document Parser API, supporting:

  • 18+ document formats (PDF, Word, Excel, PowerPoint, images, etc.)
  • Text extraction
  • Table recognition and extraction
  • Layout analysis (titles, paragraphs, headers/footers, etc.)
  • OCR for scanned documents
  • Document chunking for RAG applications
  • Multi-language support (Chinese, English, Japanese, Korean, French, German, etc.)

When to Use

Use this skill when users need to:

  • Parse PDF, Word, Excel, or other document formats
  • Extract text content from documents
  • Recognize and extract tables
  • Analyze document structure (titles, sections, layout)
  • Process scanned documents with OCR
  • Chunk documents for RAG applications

API Configuration

Environment Variables (Required)

Set these before using the skill:

export BAIDU_DOC_AI_API_KEY="your_api_key"
export BAIDU_DOC_AI_SECRET_KEY="your_secret_key"
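
A caller can fail fast when either variable is missing. The sketch below is illustrative only; `load_credentials` is a hypothetical helper, not a function from scripts/baidu_doc_parser.py.

```python
import os

REQUIRED_VARS = ("BAIDU_DOC_AI_API_KEY", "BAIDU_DOC_AI_SECRET_KEY")

def load_credentials(env=None):
    """Return (api_key, secret_key), or raise if either variable is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["BAIDU_DOC_AI_API_KEY"], env["BAIDU_DOC_AI_SECRET_KEY"]
```

Passing a plain dict as `env` keeps the check easy to test without touching the real environment.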

Authentication

The skill uses OAuth 2.0 to obtain an access token automatically. The token is valid for 30 days.
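
Because the token lasts 30 days, it is worth caching rather than requesting one per call. The sketch below shows one way to do that. It assumes the token endpoint and the `access_token` / `expires_in` response fields commonly used by Baidu AI Cloud; confirm both against references/api_reference.md. `TokenCache` and `token_request_params` are hypothetical names, not the skill's own code.

```python
import time

# Assumption: the usual Baidu AI Cloud OAuth endpoint; verify before relying on it.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"

def token_request_params(api_key, secret_key):
    """Query parameters for a client-credentials token request."""
    return {
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    }

class TokenCache:
    """Reuse a token until shortly before it expires."""

    def __init__(self, fetch, now=time.time, safety_margin=60):
        # fetch: callable returning {"access_token": str, "expires_in": seconds}
        self._fetch = fetch
        self._now = now
        self._margin = safety_margin
        self._token = None
        self._expires_at = 0.0

    def get(self):
        if self._token is None or self._now() >= self._expires_at:
            data = self._fetch()
            self._token = data["access_token"]
            self._expires_at = self._now() + data["expires_in"] - self._margin
        return self._token
```

With the `requests` library, `fetch` could be a function that posts `token_request_params(...)` to `TOKEN_URL` and returns the parsed JSON. Error codes 110/111 are the signal to discard the cached token and fetch a new one.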

Supported Formats

Documents: pdf, doc, docx, xls, xlsx, ppt, pptx, wps, et, dps, csv, txt, html, mhtml, ofd

Images: jpg, jpeg, png, bmp, tiff, tif

Total: 18+ formats

Supported Languages

Chinese, English, Japanese, Korean, French, German, Italian, Portuguese, Spanish, Russian, Dutch, Swedish, Finnish, Danish, Norwegian, Hungarian, Turkish, Polish, Czech, Greek, and more (20+ languages)

Usage

Basic Usage

python3 scripts/baidu_doc_parser.py --file_data <base64-encoded file content>
python3 scripts/baidu_doc_parser.py --file_url <file URL>

API Parameters

File Parameters (file_name is required; supply exactly one of file_url or file_data)

  • file_url (string): Document URL (publicly accessible)
  • file_data (string): Base64-encoded file data
  • file_name (string, required): File name with extension
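
The "exactly one of file_url or file_data" rule can be enforced before a request is sent. This is a minimal sketch; `build_file_params` is a hypothetical helper, and how the resulting dict is encoded into the HTTP request is left to references/api_reference.md.

```python
import base64
import os

def build_file_params(file_name, file_bytes=None, file_url=None):
    """Build the file-related request parameters, enforcing the either/or rule."""
    if (file_bytes is None) == (file_url is None):
        raise ValueError("Provide exactly one of file_bytes or file_url")
    if not os.path.splitext(file_name)[1]:
        raise ValueError("file_name must include an extension")
    params = {"file_name": file_name}
    if file_url is not None:
        params["file_url"] = file_url
    else:
        # file_data is the Base64 text of the raw file bytes.
        params["file_data"] = base64.b64encode(file_bytes).decode("ascii")
    return params
```

For a local file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes as `file_bytes`.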

Core Function Parameters

  • recognize_formula (bool): Recognize formulas in documents (default: false)
  • analysis_chart (bool): Parse statistical charts (default: false)
  • angle_adjust (bool): Auto-rotate images (default: false)
  • parse_image_layout (bool): Return image position info (default: false)

Language and Format Parameters

  • language_type (string): Recognition language (default: "CHN_ENG")
    • Options: CHN_ENG, JAP, KOR, FRE, SPA, POR, GER, ITA, RUS, DAN, DUT, MAL, SWE, IND, POL, ROM, TUR, GRE, HUN, THA, VIE, ARA, HIN
  • switch_digital_width (string): Convert number width (default: "auto")
    • Options: "auto" (no conversion), "half" (half-width), "full" (full-width)
  • html_table_format (bool): Return tables in HTML format (default: true)

Advanced Parameters

  • version (string): API version (default: "v2")
  • need_inner_image_data (bool): Include internal image data
  • merge_tables (bool): Merge related tables
  • relevel_titles (bool): Restructure title hierarchy
  • recognize_seal (bool): Recognize document seals/stamps
  • return_span_boxes (bool): Return span bounding boxes

Document Chunking Parameters

  • return_doc_chunks (dict): Document chunking configuration
    • switch (bool): Enable chunking (default: false)
    • split_type (string): Chunking method - "chunk" (by size) or "mark" (by punctuation)
    • separators (list): Punctuation marks for splitting (default: ['。', ';', '!', '?', ';', '!', '?'])
    • chunk_size (int): Chunk size in characters (default: -1 for auto)
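
A chunking configuration built from the fields above might look like the following sketch. `build_chunk_config` is a hypothetical helper; whether the API expects this dict as a JSON string or a nested form field is not stated here, so check references/api_reference.md before sending it.

```python
def build_chunk_config(split_type="chunk", chunk_size=-1, separators=None):
    """Return a return_doc_chunks dict with chunking switched on."""
    if split_type not in ("chunk", "mark"):
        raise ValueError("split_type must be 'chunk' or 'mark'")
    config = {"switch": True, "split_type": split_type, "chunk_size": chunk_size}
    if separators is not None:
        # Omit the key entirely to keep the service-side default separators.
        config["separators"] = list(separators)
    return config
```

For example, `build_chunk_config("mark", separators=["。", "."])` asks for punctuation-based splitting on two marks only.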

Return Structure

Page Object

Each page contains:

  • page_id: Page identifier
  • page_num: Page number
  • text: All text content on the page
  • layouts: Layout elements (titles, paragraphs, tables, images, etc.)
  • tables: Extracted tables
  • images: Extracted images

Layout Types

  • title: Title (with sub_type: title_1, title_2, title_3, etc.)
  • para: Paragraph
  • table: Table
  • image: Image
  • head_tail: Header/footer
  • contents: Table of contents
  • seal: Seal/stamp
  • formula: Mathematical formula

Table Object

  • layout_id: Table identifier
  • markdown: Table content in Markdown format
  • position: Bounding box [x, y, width, height]
  • cells: Cell information
  • matrix: Cell index matrix (for merged cells)

Chunk Object

  • chunk_id: Chunk identifier
  • content: Chunk content
  • type: Chunk type ("text" or "table")
  • meta: Metadata (titles, position, page number)
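
To show how these objects fit together, the sketch below collects page text and table Markdown from a list of page objects. It uses only the documented fields (page_num, text, tables, markdown). How the page list is nested inside the full API response is not covered here, so the function takes the list directly; `collect_text_and_tables` is a hypothetical name.

```python
def collect_text_and_tables(pages):
    """Return (full_text, [(page_num, table_markdown), ...]) in page order."""
    texts, tables = [], []
    for page in sorted(pages, key=lambda p: p["page_num"]):
        texts.append(page.get("text", ""))
        for table in page.get("tables", []):
            tables.append((page["page_num"], table["markdown"]))
    return "\n".join(texts), tables

# Hand-written sample data shaped like the Page and Table objects above.
sample_pages = [
    {"page_id": "p2", "page_num": 2, "text": "Second page", "tables": []},
    {"page_id": "p1", "page_num": 1, "text": "First page",
     "tables": [{"layout_id": "t1", "markdown": "| a | b |"}]},
]
full_text, found_tables = collect_text_and_tables(sample_pages)
```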

API Characteristics

Asynchronous Processing

Document parsing is asynchronous:

  1. Submit request → Get task_id
  2. Poll for results using task_id

Polling Recommendations

  • Start polling 5-10 seconds after submission
  • Polling interval: 5 seconds
  • Maximum polling time: 300 seconds
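
The submit-then-poll flow with these timings can be sketched as below. `query` is a hypothetical callable that returns None while the task is still running and the result once it is ready; the real query call and its status fields are described in references/api_reference.md.

```python
import time

def poll_for_result(query, task_id, initial_delay=5.0, interval=5.0,
                    timeout=300.0, sleep=time.sleep, now=time.monotonic):
    """Poll query(task_id) until it returns a result or the timeout elapses."""
    deadline = now() + timeout
    sleep(initial_delay)  # give the service time before the first query
    while True:
        result = query(task_id)
        if result is not None:
            return result
        if now() + interval > deadline:
            raise TimeoutError("Task %s not finished after %ss" % (task_id, timeout))
        sleep(interval)
```

Injecting `sleep` and `now` lets the loop be tested with a fake clock instead of real waiting.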

QPS Limits

  • Submit request API: 2 QPS
  • Query result API: 10 QPS
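
One simple way to stay under these limits on the client side is to enforce a minimum gap between calls. The sketch below is an assumption about how a caller might throttle, not the skill's own behavior; `MinIntervalLimiter` is a hypothetical class.

```python
import time

class MinIntervalLimiter:
    """Space calls at least 1/qps seconds apart (single-threaded use)."""

    def __init__(self, qps, sleep=time.sleep, now=time.monotonic):
        self._min_interval = 1.0 / qps
        self._sleep = sleep
        self._now = now
        self._next_allowed = None

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        t = self._now()
        if self._next_allowed is not None and t < self._next_allowed:
            self._sleep(self._next_allowed - t)
            t = max(self._now(), self._next_allowed)
        self._next_allowed = t + self._min_interval
```

A caller would keep one limiter per API, for example `MinIntervalLimiter(2)` for submissions and `MinIntervalLimiter(10)` for result queries, and call `wait()` before each request.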

File Limits

  • File size:
    • URL mode: PDF up to 300MB, others up to 50MB
    • Base64 mode: Up to 50MB
  • Page limit: Up to 2000 pages for PDF, 200 for others
  • Formats: 18+ supported formats
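
These limits can be checked locally before uploading. The sketch below assumes 1 MB = 1024 × 1024 bytes and that the Base64 limit applies to the original file size; neither is stated above, so treat both as assumptions. The function names are hypothetical.

```python
import os

MB = 1024 * 1024  # assumption: binary megabytes

def _extension(file_name):
    return os.path.splitext(file_name)[1].lower().lstrip(".")

def max_size_bytes(file_name, mode):
    """Size limit for mode 'url' or 'base64', per the File Limits list."""
    if mode not in ("url", "base64"):
        raise ValueError("mode must be 'url' or 'base64'")
    if mode == "url" and _extension(file_name) == "pdf":
        return 300 * MB
    return 50 * MB

def max_pages(file_name):
    """Page limit: 2000 for PDF, 200 for other formats."""
    return 2000 if _extension(file_name) == "pdf" else 200

def check_size(file_name, size_bytes, mode):
    """True if the file is within the documented size limit."""
    return size_bytes <= max_size_bytes(file_name, mode)
```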

Error Handling

Common error codes:

Code      Message                        Solution
110/111   Access token invalid/expired   Re-obtain access token
216200    Empty file or URL              Provide file_data or file_url
216201    File format error              Check file format
216202    File size error                Reduce file size
282000    Internal error                 Retry or contact support
282003    Missing parameters             Check required parameters
282007    Task not exist                 Check task_id
282018    Service busy                   Reduce request frequency

For complete error codes, see references/error_codes.md
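
The suggested solutions map onto three client-side reactions: refresh the token, retry later, or give up and fix the request. A minimal sketch of that mapping, covering only the common codes listed here (`classify_error` is a hypothetical helper):

```python
TOKEN_ERRORS = {110, 111}          # access token invalid or expired
RETRYABLE_ERRORS = {282000, 282018}  # internal error, service busy

def classify_error(code):
    """Return 'refresh_token', 'retry', or 'fail' for an API error code."""
    if code in TOKEN_ERRORS:
        return "refresh_token"
    if code in RETRYABLE_ERRORS:
        return "retry"
    # 2162xx, 282003, 282007 and unknown codes need a corrected request.
    return "fail"
```

For "retry", waiting before the next attempt matters most for 282018, since the listed solution is to reduce request frequency.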

Scripts

The skill includes Python scripts for document parsing:

  • scripts/baidu_doc_parser.py: Main client library
  • Command-line interface for quick testing

References

  • references/api_reference.md: Complete API documentation
  • references/error_codes.md: Full error code reference
