word-document-organizer

v1.0.1

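Intelligently organizes Word documents: automatic formatting, table-of-contents generation, and unified styles, with academic, business, and minimal templates.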

2 stars · 872 downloads · 8 current · 10 all-time
by Jasper.W @okgptai

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for okgptai/word-document-organizer.

Install the skill "word-document-organizer" (okgptai/word-document-organizer) from ClawHub.
Skill page: https://clawhub.ai/okgptai/word-document-organizer
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install word-document-organizer

ClawHub CLI

npx clawhub@latest install word-document-organizer

Security Scan

VirusTotal: Benign
OpenClaw: Benign (medium confidence)

Purpose & Capability
The name/description (organize Word documents, format, generate TOC, apply templates) matches the instructions: the SKILL.md describes checking a document path, backing up the file, installing python-docx, and running a Python script that edits docx contents. Required capabilities (filesystem access to the provided document path and ability to run python/pip) are consistent with the stated purpose.
Instruction Scope
Instructions stay within the documented purpose (open the provided file path, create a backup, modify the document, save). No steps reference unrelated files, secrets, or external endpoints. Two implementation issues: SKILL.md allows both .docx and .doc in checks, but the code uses python-docx which only supports .docx — .doc files are likely unsupported and will fail. Also the script will create a backup near the original file and, by default, overwrite the original unless an output_path is provided; users should be warned to provide an output_path to avoid data loss.
Install Mechanism
There is no packaged installer; the script runs pip3 install python-docx if needed. This is a common, proportionate install for the stated task. Note: pip installs execute package install-time code and may modify the Python environment; running inside a virtualenv is safer but not required by the skill.
Credentials
The skill requests no environment variables, credentials, or config paths. It only needs read/write access to the document path you supply and to be able to run python/pip — this is proportional to a document-organizing tool.
Persistence & Privilege
The skill does not request always: true and has no install artifacts beyond running pip. It does not modify other skills or system-wide agent configuration. Its runtime actions are limited to local file operations and installing a Python package if absent.
Assessment
This skill is instruction-only and edits local files: only provide paths to documents you trust and keep a separate copy. It will run a pip3 install for python-docx (which can execute code at install time) and will create a timestamped backup next to the original file, but it will otherwise overwrite the original unless you specify output_path, so specify an output_path or test on a copy first. Note the .doc/.docx inconsistency: .doc (the old binary Word format) is not supported by python-docx, so use .docx files. If you want extra safety, run the commands in a disposable environment (virtualenv or container) and inspect the SKILL.md before use.


Latest: vk971gjh2fj3shwxnf36b85by0n82nebx
872 downloads · 2 stars · 2 versions · Updated 2w ago · v1.0.1 · License: MIT-0

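Word Document Organizer

Detects the document's structure, applies standardized formatting, and generates a table of contents, so a document can be tidied up in a single pass.

Trigger conditions

Activate this skill when the user makes a request such as:

  • "整理word文档" (organize a Word document)
  • "格式化文档" (format a document)
  • "生成文档目录" (generate a table of contents)
  • "统一文档样式" (unify document styles)
  • "排版优化" (improve the layout)
  • "修复word格式" (fix Word formatting)
  • "规范文档格式" (standardize document formatting)
  • "word排版" (Word typesetting)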

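Parameter definitions

document_path (required)

  • Type: string
  • Description: full path to the Word document (.docx only)
  • Examples: "C:/Users/Desktop/报告.docx", "/home/user/docs/论文.docx"

operations (optional, default: all)

  • Type: array[string]
  • Description: list of operations to run
  • Allowed values:
    • format: format paragraphs (uniform line spacing, paragraph spacing, and fonts)
    • toc: generate a table of contents (a static list of the detected headings, inserted at the top of the document)
    • styles: apply the standard style template
    • cleanup: clean up empty paragraphs (runs of consecutive empty paragraphs are collapsed to one)
    • all: run every operation (default)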
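To make the cleanup operation concrete, here is a minimal sketch of the "collapse runs of empty paragraphs" idea on plain strings. The helper name `collapse_empty_runs` is hypothetical and not part of the skill; the real script works on python-docx paragraph objects and walks the document backwards, so it keeps the last empty paragraph of each run rather than the first.

```python
def collapse_empty_runs(paragraphs):
    """Keep at most one empty paragraph in a row (hypothetical helper)."""
    result = []
    prev_empty = False
    for text in paragraphs:
        is_empty = not text.strip()
        if is_empty and prev_empty:
            # Second or later empty paragraph in a run: drop it
            continue
        result.append(text)
        prev_empty = is_empty
    return result


print(collapse_empty_runs(["Intro", "", "  ", "", "Body", ""]))
# → ['Intro', '', 'Body', '']
```

A single empty paragraph between two blocks of text is left alone; only the extra ones are removed.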

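style_template (optional, default: academic)

  • Type: string
  • Description: style template to apply
  • Allowed values:
    • academic: academic template (宋体/SimSun body text with 黑体/SimHei headings and a clear hierarchy; suited to theses and reports)
    • business: business template (微软雅黑/Microsoft YaHei, modern and clean; suited to business documents)
    • minimal: minimal template (Arial, compact layout; suited to notes and drafts)
    • default: default template (general-purpose settings)

output_path (optional)

  • Type: string
  • Description: output file path. Defaults to overwriting the original file; specifying a new path is recommended so the original is preserved.
  • Example: "C:/Users/Desktop/报告_整理版.docx"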
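The parameter rules above can be sketched as a small validation step. This is only an illustration under stated assumptions: `OrganizerParams` and `normalize_params` are hypothetical names, and rejecting an unknown template is a design choice made here, whereas the skill's script silently falls back to the academic template.

```python
from dataclasses import dataclass

# Order in which the skill's script runs the operations
ALL_OPERATIONS = ("format", "styles", "toc", "cleanup")
VALID_OPERATIONS = set(ALL_OPERATIONS) | {"all"}
VALID_TEMPLATES = {"academic", "business", "minimal", "default"}


@dataclass
class OrganizerParams:
    """Hypothetical container for the four documented parameters."""
    document_path: str
    operations: list
    style_template: str
    output_path: str


def normalize_params(document_path, operations=None, style_template=None, output_path=None):
    """Apply the documented defaults and reject values outside the documented sets."""
    if not document_path.lower().endswith(".docx"):
        raise ValueError("document_path must point to a .docx file")

    requested = [op.strip() for op in (operations or ["all"]) if op.strip()]
    unknown = sorted(set(requested) - VALID_OPERATIONS)
    if unknown:
        raise ValueError(f"unknown operations: {unknown}")
    if not requested or "all" in requested:
        ops = list(ALL_OPERATIONS)
    else:
        # Keep the script's execution order regardless of the order requested
        ops = [op for op in ALL_OPERATIONS if op in requested]

    template = style_template or "academic"
    if template not in VALID_TEMPLATES:
        raise ValueError(f"unknown style_template: {template}")

    # Without output_path the original file is overwritten, as documented
    return OrganizerParams(document_path, ops, template, output_path or document_path)


print(normalize_params("/home/user/docs/论文.docx", ["format", "toc"], "business"))
```

Failing early on a misspelled operation or template avoids running a partial pass over a document that is about to be overwritten.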

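Execution flow

Step 1: Environment check and backup

#!/bin/bash
# Check that the file exists
if [ ! -f "${document_path}" ]; then
    echo "Error: file not found: ${document_path}"
    echo "Check that the path is correct and that the file has not been moved or deleted"
    exit 1
fi

# Check the file extension (python-docx cannot read legacy .doc files)
if [[ ! "${document_path}" =~ \.docx$ ]]; then
    echo "Error: only .docx files are supported"
    exit 1
fi

# Create a timestamped backup next to the original; stop if the copy fails
backup_path="${document_path}.backup.$(date +%Y%m%d_%H%M%S)"
cp "${document_path}" "${backup_path}" || { echo "Error: could not create backup"; exit 1; }
echo "Backup created: ${backup_path}"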
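For environments without bash (for example, Windows paths like the ones in the parameter examples), the same check-and-backup step could be sketched in Python with only the standard library. The function name `backup_document` is hypothetical; the backup naming follows the `<file>.backup.<YYYYmmdd_HHMMSS>` pattern used in this step.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_document(document_path):
    """Validate the path and copy the file to a timestamped backup beside it."""
    src = Path(document_path)
    if not src.is_file():
        raise FileNotFoundError(f"file not found: {src}")
    if src.suffix.lower() != ".docx":
        raise ValueError("only .docx files are supported")

    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup = src.with_name(f"{src.name}.backup.{stamp}")
    # copy2 also preserves timestamps and permission bits where possible
    shutil.copy2(src, backup)
    return backup
```

Calling `backup_document("/home/user/docs/论文.docx")` would return the path of the new backup, or raise before anything is modified if the input is missing or not a .docx file.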

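Step 2: Install the Python dependency

#!/bin/bash
# Install python-docx only if it cannot be imported; using "python3 -m pip"
# installs into the same interpreter that will run the script
python3 -c "import docx" 2>/dev/null || python3 -m pip install -q python-docx \
    || { echo "Error: could not install python-docx"; exit 1; }
echo "Dependency check complete"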
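If you would rather see what is missing before anything is installed, the import check can be done from Python without importing the package. This is a sketch using the standard `importlib.util.find_spec`; the helper name `missing_modules` is hypothetical. Note that python-docx is imported under the module name `docx`.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


missing = missing_modules(["docx"])
if missing:
    print("Missing modules:", ", ".join(missing))
    print("Install with: python3 -m pip install python-docx")
else:
    print("python-docx is available")
```

Running this inside a virtual environment shows whether that environment, rather than the system Python, already has the dependency.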

Step 3: Organize the document (core Python script)

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from docx import Document
from docx.shared import Pt, Inches, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.oxml.ns import qn
import sys
import re
import os

# Read the parameters (the ${...} placeholders are expected to be substituted
# before the script runs; operations arrives as a comma-separated string)
doc_path = "${document_path}"
output_path = "${output_path}" if "${output_path}" else doc_path
template = "${style_template}"
operations = [op.strip() for op in "${operations}".split(",")] if "${operations}" else ["all"]

print(f"Processing file: {doc_path}")
print(f"Operations: {operations}")
print(f"Template: {template}")

# Load the document
try:
    doc = Document(doc_path)
except Exception as e:
    print(f"Could not open the document: {e}")
    print("Hint: make sure the document is not open in Microsoft Word (close Word and retry)")
    sys.exit(1)

# Template configuration (font sizes in points; unknown templates fall back to "academic")
TEMPLATES = {
    "academic": {
        "h1_size": 18, "h2_size": 16, "h3_size": 14, "body_size": 12,
        "h1_font": "黑体", "h2_font": "黑体", "body_font": "宋体",
        "line_spacing": 1.5, "space_after": 6
    },
    "business": {
        "h1_size": 16, "h2_size": 14, "h3_size": 12, "body_size": 11,
        "h1_font": "微软雅黑", "h2_font": "微软雅黑", "body_font": "微软雅黑",
        "line_spacing": 1.5, "space_after": 6
    },
    "minimal": {
        "h1_size": 14, "h2_size": 12, "h3_size": 11, "body_size": 10.5,
        "h1_font": "Arial", "h2_font": "Arial", "body_font": "Arial",
        "line_spacing": 1.15, "space_after": 3
    },
    "default": {
        "h1_size": 16, "h2_size": 14, "h3_size": 12, "body_size": 12,
        "h1_font": "宋体", "h2_font": "宋体", "body_font": "宋体",
        "line_spacing": 1.5, "space_after": 6
    }
}

config = TEMPLATES.get(template, TEMPLATES["academic"])
changes_log = []

# Operation 1: format
if "format" in operations or "all" in operations:
    print("Formatting the document...")
    count = 0
    for para in doc.paragraphs:
        # Uniform spacing for every paragraph, headings included
        para.paragraph_format.line_spacing = config["line_spacing"]
        para.paragraph_format.space_after = Pt(config["space_after"])
        para.paragraph_format.space_before = Pt(0)

        # Body font and size for everything that is not already a heading
        if not para.style.name.startswith('Heading'):
            for run in para.runs:
                run.font.name = config["body_font"]
                # CJK text uses the East Asian font slot, which must be set separately
                run._element.rPr.rFonts.set(qn('w:eastAsia'), config["body_font"])
                run.font.size = Pt(config["body_size"])
        count += 1

    changes_log.append(f"Formatted {count} paragraphs")

# Operation 2: apply styles
if "styles" in operations or "all" in operations:
    print("Applying the style template...")
    title_count = 0

    # Numbering patterns, checked from most to least specific so that "1.1.1"
    # and "1.1" are not swallowed by the level-1 pattern for "1."
    # Both ASCII and full-width punctuation are accepted.
    h3_pattern = r'^\d+\.\d+\.\d+'
    h2_pattern = r'^\d+\.\d+[\s.、]|^[((]\d+[))]'
    h1_pattern = r'^第[一二三四五六七八九十\d]+[章\s]|^\d+\s*[、..\s]|^[((][一二三四五六七八九十]+[))]'

    for para in doc.paragraphs:
        text = para.text.strip()
        if not text:
            continue

        if re.match(h3_pattern, text):
            para.style = doc.styles['Heading 3']
            for run in para.runs:
                run.font.size = Pt(config["h3_size"])
                run.font.bold = True
            title_count += 1

        elif re.match(h2_pattern, text):
            para.style = doc.styles['Heading 2']
            for run in para.runs:
                run.font.name = config["h2_font"]
                run._element.rPr.rFonts.set(qn('w:eastAsia'), config["h2_font"])
                run.font.size = Pt(config["h2_size"])
                run.font.bold = True
            title_count += 1

        elif re.match(h1_pattern, text):
            para.style = doc.styles['Heading 1']
            for run in para.runs:
                run.font.name = config["h1_font"]
                run._element.rPr.rFonts.set(qn('w:eastAsia'), config["h1_font"])
                run.font.size = Pt(config["h1_size"])
                run.font.bold = True
                run.font.color.rgb = RGBColor(0, 0, 0)
            title_count += 1

    changes_log.append(f"Detected and formatted {title_count} headings")

# Operation 3: generate the table of contents
if "toc" in operations or "all" in operations:
    print("Generating the table of contents...")

    # Collect (level, text) for every paragraph that uses a Heading style
    toc_entries = []
    for para in doc.paragraphs:
        if para.style.name.startswith('Heading'):
            level = int(para.style.name[-1]) if para.style.name[-1].isdigit() else 1
            toc_entries.append((level, para.text.strip()))

    if toc_entries:
        # Everything is inserted before the first paragraph, in reading order
        first_para = doc.paragraphs[0]
        toc_title = first_para.insert_paragraph_before("目录")  # "Table of Contents"
        toc_title.alignment = WD_ALIGN_PARAGRAPH.CENTER
        for run in toc_title.runs:
            run.font.size = Pt(config["h1_size"])
            run.font.bold = True
            run.font.name = config["h1_font"]
            run._element.rPr.rFonts.set(qn('w:eastAsia'), config["h1_font"])

        # The list is static text (no page numbers) and is capped at 50 entries
        toc_shown = toc_entries[:50]
        for level, text in toc_shown:
            indent = "    " * (level - 1)
            entry_para = first_para.insert_paragraph_before(f"{indent}{text}")
            entry_para.paragraph_format.left_indent = Inches(0.3 * (level - 1))
            for run in entry_para.runs:
                run.font.size = Pt(config["body_size"])

        separator = first_para.insert_paragraph_before("—" * 30)
        separator.alignment = WD_ALIGN_PARAGRAPH.CENTER

        changes_log.append(f"Generated a table of contents with {len(toc_shown)} entries")
    else:
        changes_log.append("No heading structure detected; skipped the table of contents")

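# Operation 4: cleanup
if "cleanup" in operations or "all" in operations:
    print("Cleaning up redundant content...")
    removed = 0
    prev_empty = False

    # Walk backwards over a snapshot so removals do not disturb the iteration
    for para in reversed(doc.paragraphs):
        # Paragraphs that hold images, breaks, or a section break have no text
        # but are not redundant, so they never count as empty
        has_content = bool(para._element.xpath(
            './/w:drawing | .//w:pict | .//w:br | .//w:sectPr'))
        is_empty = not para.text.strip() and not has_content

        # In a run of empty paragraphs, keep one and remove the rest
        if is_empty and prev_empty:
            p_element = para._element
            p_element.getparent().remove(p_element)
            removed += 1
        prev_empty = is_empty

    changes_log.append(f"Removed {removed} redundant empty paragraphs")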

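# Save the document: write to a temporary file first, then swap it into place,
# so a failed write cannot leave a half-written file at output_path
tmp_path = output_path + ".tmp"
try:
    doc.save(tmp_path)
    os.replace(tmp_path, output_path)
    print(f"Document saved: {output_path}")
except Exception as e:
    if os.path.exists(tmp_path):
        os.remove(tmp_path)
    print(f"Save failed: {e}")
    sys.exit(1)

print("\n" + "=" * 50)
print("Summary report")
print("=" * 50)
for log in changes_log:
    print(log)
print("=" * 50)
print("Document organization complete!")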
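The heading detection in the script can be tried out on its own, without a Word document. The sketch below is an illustration, not the skill's code: `heading_level` is a hypothetical helper, the patterns are checked from most to least specific (an assumption about the intended behavior, since a level-1 pattern such as "digits followed by a dot" would otherwise also match "1.1"), and the full-width punctuation variants are assumed to be what the numbering patterns were meant to accept.

```python
import re

# (level, pattern) pairs, most specific first
HEADING_PATTERNS = [
    (3, re.compile(r"^\d+\.\d+\.\d+")),
    (2, re.compile(r"^\d+\.\d+[\s.、]|^[((]\d+[))]")),
    (1, re.compile(
        r"^第[一二三四五六七八九十\d]+[章\s]"
        r"|^\d+\s*[、..\s]"
        r"|^[((][一二三四五六七八九十]+[))]"
    )),
]


def heading_level(text):
    """Return 1, 2, or 3 for numbered headings, or 0 for body text."""
    text = text.strip()
    for level, pattern in HEADING_PATTERNS:
        if pattern.match(text):
            return level
    return 0


for sample in ["第一章 绪论", "1. Introduction", "1.2 Background",
               "1.2.3 Details", "(1) Scope", "Plain body text"]:
    print(heading_level(sample), sample)
```

These are heuristics: a body paragraph that happens to start with a number followed by a space (for example a year) is also classified as a level-1 heading, so it is worth reviewing the result on a copy of the document.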

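Step 4: Verify the output

#!/bin/bash
# Without an output_path the script overwrites the original, so check that file instead
target="${output_path}"
[ -z "${target}" ] && target="${document_path}"

if [ -f "${target}" ]; then
    file_size=$(ls -lh "${target}" | awk '{print $5}')
    echo "Output file: ${target} (${file_size})"
    echo "Note: the original file was backed up and can be restored if needed"
else
    echo "Error: the output file was not created"
    exit 1
fi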
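The check in this step only confirms that a file exists. A slightly stronger sanity check, sketched here with the standard library, relies on the fact that a .docx file is a ZIP package whose main part is conventionally stored as `word/document.xml`. The function name `looks_like_docx` is hypothetical, and passing this check does not guarantee that Word will open the file.

```python
import zipfile


def looks_like_docx(path):
    """Return True if path is a ZIP archive that contains word/document.xml."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return "word/document.xml" in archive.namelist()
```

For example, `looks_like_docx("C:/Users/Desktop/报告_整理版.docx")` returning False would be a signal to restore the backup rather than keep the output.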

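Usage examples

Example 1: Full cleanup (recommended)

Organize the document C:/Users/Desktop/毕业论文.docx using the academic template, run all operations, and write the output to C:/Users/Desktop/毕业论文_整理版.docx

Example 2: Format and generate a table of contents only

Format C:/Docs/报告.docx with the operations format,toc and the business template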

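Error handling

Error scenario | Handling
File not found | Prompts to check the path; exit code 1
Unsupported file format | Reports that only .docx is supported; exit code 1
File locked by Word | Prompts to close Word and retry; exit code 1
Insufficient permissions | Reports the open or save error; rerun with sufficient (administrator) permissions; exit code 1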

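Safety notes

  • Local execution: all processing happens locally; no files are uploaded to the cloud
  • Automatic backup: every run creates a timestamped backup
  • Save at the end: the document is modified in memory and written only at the end, so a failure during processing leaves the original file untouched
  • Least privilege: only read/write access to the file is needed, with no system administration rights; network access is needed only if python-docx has to be installed with pip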
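Since every run leaves a timestamped backup next to the original, recovering from a bad result amounts to copying the newest backup back. The following is a minimal sketch, assuming the `<file>.backup.<YYYYmmdd_HHMMSS>` naming from step 1; `restore_latest_backup` is a hypothetical helper and not part of the skill.

```python
import glob
import shutil
from pathlib import Path


def restore_latest_backup(document_path):
    """Copy the newest timestamped backup over document_path and return its path."""
    doc = Path(document_path)
    # Escape the file name so characters like [ or ] are matched literally
    pattern = glob.escape(doc.name) + ".backup.*"
    # The timestamp format sorts chronologically when sorted as text
    backups = sorted(doc.parent.glob(pattern))
    if not backups:
        raise FileNotFoundError(f"no backups found for {doc}")
    latest = backups[-1]
    shutil.copy2(latest, doc)
    return latest
```

The backup itself is left in place, so the restore can be repeated or an older backup chosen by hand.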
