Install
openclaw skills install office-doc-extractor

Convert Microsoft Office documents (DOCX, XLSX, PPTX) to Markdown without any external dependencies. Use when the user needs to extract text from Word documents, Excel spreadsheets, or PowerPoint presentations for analysis, indexing, or LLM processing. Pure Python implementation: no pip install, no subprocess calls, and no network downloads required. Works offline.

Zero-dependency converter for Microsoft Office documents. Extracts text and structure from DOCX, XLSX, and PPTX files into clean Markdown.
# Single file
python3 scripts/main.py report.docx -o report.md
# Batch convert a directory
python3 scripts/main.py ./documents --batch -o ./markdown
| Format | Extension | Output |
|---|---|---|
| Word | .docx | Headings, paragraphs |
| Excel | .xlsx | Tables (one per sheet) |
| PowerPoint | .pptx | Slides as sections |
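
To make the "Tables (one per sheet)" output concrete, here is a minimal sketch of how one sheet's rows could be rendered as a Markdown table. The function name `rows_to_markdown` and the choice to treat the first row as the header are assumptions for illustration; the skill's actual rendering may differ.

```python
def rows_to_markdown(rows):
    """Render a list of rows as a Markdown table (first row = header)."""
    if not rows:
        return ""
    # Pad every row to the widest row so the table stays rectangular.
    width = max(len(row) for row in rows)

    def cell(value):
        # Empty cells become blank; pipes and newlines would break the table.
        text = "" if value is None else str(value)
        return text.replace("|", "\\|").replace("\n", " ")

    def line(row):
        padded = list(row) + [None] * (width - len(row))
        return "| " + " | ".join(cell(v) for v in padded) + " |"

    header, *body = rows
    out = [line(header), "|" + "---|" * width]
    out.extend(line(row) for row in body)
    return "\n".join(out)
```

Calling it once per sheet and placing each result under a heading with the sheet name would give one table per sheet.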
- zipfile and xml.etree
- openpyxl (pure Python, no C extensions)

No external commands, no network calls, no pip install required.
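
As an illustration of why `zipfile` and `xml.etree` are sufficient, the sketch below reads a DOCX file (a ZIP archive whose main text lives in `word/document.xml`) and emits headings and paragraphs as Markdown. It is not taken from `scripts/main.py`; the function name `docx_to_markdown` is hypothetical, and it assumes headings use the default English style IDs (`Heading1`, `Heading2`, and so on).

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_markdown(source):
    """Extract headings and paragraphs from a .docx path or file-like object."""
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    lines = []
    for para in root.iter(W + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(t.text or "" for t in para.iter(W + "t"))
        if not text.strip():
            continue
        # Assumption: heading paragraphs carry a style ID like "Heading2".
        style = para.find(f"{W}pPr/{W}pStyle")
        name = style.get(W + "val", "") if style is not None else ""
        if name.startswith("Heading") and name[7:].isdigit():
            level = min(int(name[7:]), 6)
            lines.append("#" * level + " " + text)
        else:
            lines.append(text)
    return "\n\n".join(lines)
```

Documents that use localized or custom heading styles would fall through to plain paragraphs in this sketch.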
python3 scripts/main.py <input_file> [-o <output.md>]
Auto-detects format from file extension. If -o is omitted, outputs to <input>.md.
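
The extension-based detection and default output name described above could look roughly like this. The names `detect_format` and `default_output` are hypothetical, and the sketch assumes that `<input>.md` means the original extension is replaced (`report.docx` becomes `report.md`), which the source does not state explicitly.

```python
from pathlib import Path

# Supported extensions mapped to a format label (labels are illustrative).
SUPPORTED = {".docx": "word", ".xlsx": "excel", ".pptx": "powerpoint"}


def detect_format(path):
    """Return the format label for a file, based only on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return SUPPORTED[suffix]


def default_output(path):
    """Assumed default when -o is omitted: same name with a .md extension."""
    return Path(path).with_suffix(".md")
```

Lower-casing the suffix means `REPORT.DOCX` is treated the same as `report.docx`.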
python3 scripts/main.py <input_directory> --batch [-o <output_directory>]
Converts all .docx, .xlsx, and .pptx files in the directory. Results are saved to markdown_output/ by default.
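
A minimal sketch of how batch mode might pair input files with output paths follows. The helper `plan_batch` is hypothetical. It assumes a non-recursive scan, and it skips Office lock files (names starting with `~$`), which is an addition not mentioned in the source.

```python
from pathlib import Path

OFFICE_SUFFIXES = {".docx", ".xlsx", ".pptx"}


def plan_batch(input_dir, output_dir="markdown_output"):
    """Return (source, destination) pairs for every Office file in input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for src in sorted(input_dir.iterdir()):  # top level only (assumption)
        if not src.is_file() or src.suffix.lower() not in OFFICE_SUFFIXES:
            continue
        if src.name.startswith("~$"):  # temporary lock file left by Office
            continue
        jobs.append((src, output_dir / (src.stem + ".md")))
    return jobs
```

Each pair can then be handed to the single-file converter after creating the output directory.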
pdf tool for that)

Existing markitdown-based skills require pip install or external CLI tools, which trigger ClawHub security warnings. This skill is 100% self-contained: install it and use it immediately, even offline.