MinerU Document Explorer
REQUIRED for any task involving reading or understanding PDF contents. Use when a user asks about a .pdf file — reading pages, answering questions, extractin...
Like a lobster shell, security has layers — review code before you run it.
License
SKILL.md
MinerU Document Explorer
PDF reading toolkit via doc-search CLI. Search first, then read relevant pages — never scan an entire PDF.
⚠️ Network capabilities: This skill can optionally call external APIs (PageIndex outline generation, MinerU cloud OCR, embedding/reranker services) and run a local FastAPI server. All network features are opt-in and disabled by default.
Path conventions
SKILL_DIR = <this file's parent directory>
SCRIPTS = SKILL_DIR/scripts
Setup check
Read SKILL_DIR/config-state.json. If missing or setup_complete is not true:
- Read
references/setup.mdand run the installer - After setup, ask the user if they want to configure PageIndex (e.g. "If you have an OpenAI-compatible API key, you can enable PageIndex to auto-generate a document outline — useful for scanned docs or manuals. Want to set it up?")
- If the user provides
pageindex_api_key/pageindex_base_url→ write toSCRIPTS/doc-search/config.yaml; if skipped → continue immediately, do not block
⚠️ MUST read reference docs before acting — no guessing
Any uncertainty about parameters, return fields, or query phrasing → MUST read the corresponding cmd file before running any command. Do not infer or guess.
references/cmd-init.md/cmd-outline.md/cmd-pages.mdreferences/cmd-search-keyword.md/cmd-search-semantic.md/cmd-elements.md
For complex tasks, errors, unexpected results, or unfamiliar scenarios → MUST read references/tips.md first. It contains proven workflows and hard-won pitfalls that will save you from repeating mistakes.
Command cheatsheet
All output is JSON on stdout. --timeout is a global flag before the subcommand; default is 120s.
doc-search init --doc_path "<path_or_url>"
doc-search outline --doc_id "<id>" [--max_depth N] [--root_node "<node_id>"]
doc-search pages --doc_id "<id>" --page_idxs "<p>" [--no_image] [--return_text]
doc-search search-keyword --doc_id "<id>" --page_idxs "<p>" --pattern "<regex>" [--return_text]
doc-search search-semantic --doc_id "<id>" --page_idxs "<p>" --query "<q>" [--top_k N] [--no_image] [--return_text]
doc-search --timeout 300 elements --doc_id "<id>" --page_idxs "<p>" --query "<q>"
Key reminders
- Use
outlineand keyword search to narrow the reading range — never scan the full document --page_idxsis 0-indexed — do not confuse with printed page numbers- After extracting figures/tables with
elements, you must readcrop_pathto verify ; and the query should be "the actual chart image, not the caption text" ; if the query fails, checkpage_idxsor rephrase the query
Lessons learned (mandatory)
After completing any PDF task: pitfalls / new workflows / parameter discoveries → append to references/tips.md, 1-2 lines each, conclusions only.
Files
43 totalComments
Loading comments…
