docx-to-md
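v1.0.0
Convert Word documents (.docx) to Markdown and extract their images. Use this skill when the user needs to: (1) convert a Word document to Markdown, (2) extract images from a Word document, or (3) do both in one pass.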
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
The SKILL.md and the included scripts/docx_to_md.py implement the described functionality: extracting media from the .docx ZIP (word/media/) and converting document content to Markdown using python-docx. There are no unrelated credentials, binaries, or external services requested.
Instruction Scope
Runtime instructions only reference the input .docx path and an output directory. The code reads the provided .docx, extracts files under 'word/media/' into the output directory and writes a Markdown file — behavior consistent with the documented scope. The SKILL.md does not instruct reading other system files or transmitting data externally.
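The extraction step described above can be sketched with the standard-library `zipfile` module. This is a minimal illustration, not the code in scripts/docx_to_md.py: the function name `extract_media` is hypothetical, and keeping the original file names (rather than renaming to image_*) is an assumption made for brevity.

```python
import zipfile
from pathlib import Path


def extract_media(docx_path, output_dir):
    """Copy every file under word/media/ in the .docx archive into output_dir.

    Hypothetical helper for illustration; the real script may name and
    filter files differently.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Skip everything outside word/media/ and any directory entries.
            if not name.startswith("word/media/") or name.endswith("/"):
                continue
            # Keep only the final path component so a crafted entry name
            # cannot write outside output_dir.
            target = out / Path(name).name
            target.write_bytes(archive.read(name))
            written.append(target)
    return written
```

Reading each entry with `archive.read` and writing it under a flattened name avoids `extractall`, which would recreate the archive's internal directory layout inside the output directory.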
Install Mechanism
There is no install spec; this is instruction/code-only. The only dependency is python-docx (documented in SKILL.md with a pip install command). No remote download URLs or archive extraction from arbitrary hosts are used.
Credentials
No environment variables, credentials, or config paths are required. The skill operates on user-supplied file paths only, which is proportionate to its purpose.
Persistence & Privilege
always is false and the skill does not request persistent global privileges or modify other skills/configs. It writes output files only into the specified output directory.
Assessment
This skill appears coherent and safe for its stated purpose. Before installing or using it: (1) ensure you have Python 3.7+ and install python-docx (pip install python-docx); (2) run it on .docx files you trust, or in a controlled output directory (it creates a <basename>_output folder, or uses the output path you provide, and writes image_*.png/jpg files and a .md file there); (3) be aware that it writes files to disk and may overwrite files in the chosen output directory, so pick a safe location. If you want extra assurance, inspect scripts/docx_to_md.py yourself; it contains no network calls or credential handling.
Like a lobster shell, security has layers: review code before you run it.
docx-to-md
Converts Word documents (.docx) to Markdown and extracts the images in the document to a specified directory.
Usage
Run the script to perform the conversion:
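import sys
# Make the skill's scripts directory importable.
sys.path.insert(0, '<skill directory>/scripts')
from docx_to_md import docx_to_md
# Arguments: the input .docx path, then the output directory.
docx_to_md('input.docx', 'output_dir')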
Or run it from the command line (you must handle argument escaping yourself):
python <skill path>/scripts/docx_to_md.py "file.docx"
Parameters
- input_file: path to the Word document (.docx)
- output_dir: output directory (optional; by default a folder named <basename>_output is created)
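The default for output_dir could be resolved roughly as follows. This is a sketch under assumptions: `resolve_output_dir` is a hypothetical helper, and placing the folder next to the input file is a guess, since the documentation only gives the folder name.

```python
from pathlib import Path


def resolve_output_dir(input_file, output_dir=None):
    """Return the explicit output_dir, or <basename>_output by default.

    Assumption: the default folder sits beside the input file; the skill's
    documentation states only the folder name, not its location.
    """
    if output_dir:
        return Path(output_dir)
    source = Path(input_file)
    # report.docx -> report_output, in the same parent directory.
    return source.with_name(source.stem + "_output")
```

Passing an explicit output_dir is the safer choice when the input sits in a directory that already holds files you do not want overwritten.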
Output
- *.md: the converted Markdown file
- image_*.png/jpg/gif: the extracted image files
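Conversion rules
| Word format | Markdown |
|---|---|
| Heading 1 | # Heading |
| Heading 2 | ## Heading |
| Heading 3 | ### Heading |
| Heading 4 | #### Heading |
| Unordered list | - Item |
| Ordered list | 1. Item |
| Table | Markdown table |
| Image | Markdown image reference |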
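The mapping in the table can be illustrated with two small pure functions. This is a sketch, not the logic of scripts/docx_to_md.py: the function names are hypothetical, and the style names ("Heading 1", "List Bullet", "List Number") are assumed to be the ones python-docx reports for its default English template, so documents with custom or localized styles may need different names.

```python
import re

# Assumed style names, taken from python-docx's default template.
BULLET_STYLE = "List Bullet"
NUMBER_STYLE = "List Number"


def paragraph_to_markdown(style_name, text):
    """Map one paragraph's style name and text to a Markdown line."""
    heading = re.fullmatch(r"Heading ([1-4])", style_name)
    if heading:
        # Heading level N becomes N leading '#' characters.
        return "#" * int(heading.group(1)) + " " + text
    if style_name == BULLET_STYLE:
        return "- " + text
    if style_name == NUMBER_STYLE:
        return "1. " + text
    # Any other style is passed through as plain text.
    return text


def table_to_markdown(rows):
    """Render rows (lists of cell strings) as a Markdown table.

    The first row is treated as the header.
    """
    if not rows:
        return ""

    def fmt(cells):
        # Escape pipes so cell text cannot break the column layout.
        return "| " + " | ".join(c.replace("|", "\\|").strip() for c in cells) + " |"

    header, *body = rows
    lines = [fmt(header), "|" + "---|" * len(header)]
    lines.extend(fmt(row) for row in body)
    return "\n".join(lines)
```

With python-docx, the inputs would typically come from `paragraph.style.name`, `paragraph.text`, and the cell text of each table row.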
Dependencies
- Python 3.7+
- python-docx
pip install python-docx
