Install

    openclaw skills install @marjoriebroad/mar-document-pro

Gives the AI the ability to read and write common document formats:

| Format | Read | Write | Tools |
|---|---|---|---|
| PDF | ✅ | ✅ | pdfplumber, PyPDF2 |
| DOCX | ✅ | ✅ | python-docx |
| PPTX | ✅ | ❌ | python-pptx |
| XLSX | ✅ | ✅ | openpyxl |
| TXT | ✅ | ✅ | built-in |
| Markdown | ✅ | ✅ | built-in |
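
The table above can also be expressed as a lookup keyed by file extension. The following is a minimal sketch, not part of the skill itself: the `CAPABILITIES` dict and the `capabilities` helper are hypothetical names, and mapping Markdown to the `.md` extension is an assumption.

```python
from pathlib import Path

# Mirrors the capability table: extension -> (readable, writable, tools).
# An empty tool list means built-in file I/O is enough.
CAPABILITIES = {
    ".pdf": (True, True, ["pdfplumber", "PyPDF2"]),
    ".docx": (True, True, ["python-docx"]),
    ".pptx": (True, False, ["python-pptx"]),
    ".xlsx": (True, True, ["openpyxl"]),
    ".txt": (True, True, []),
    ".md": (True, True, []),  # assumption: Markdown files use the .md extension
}


def capabilities(path: str) -> tuple:
    """Return (readable, writable, tools) for a file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in CAPABILITIES:
        raise ValueError(f"Unsupported document format: {suffix or path!r}")
    return CAPABILITIES[suffix]
```

For example, `capabilities("slides.pptx")` reports that PPTX can be read but not written.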
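
    import pdfplumber

    # PDF: extract text from every page
    with pdfplumber.open("document.pdf") as pdf:
        for page in pdf.pages:
            text = page.extract_text() or ""  # guard against pages with no extractable text
            print(text)

    # PDF: extract tables from the first page
    with pdfplumber.open("document.pdf") as pdf:
        tables = pdf.pages[0].extract_tables()  # a list of tables, each a list of rows
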
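
Extracted tables arrive as nested lists: each table is a list of rows, and empty cells are `None`. One way to make them easier to pass on is to serialize each table as CSV. This is a rough sketch using only the standard library; `table_to_csv` is a hypothetical helper, not something pdfplumber provides.

```python
import csv
import io
from typing import List, Optional


def table_to_csv(table: List[List[Optional[str]]]) -> str:
    """Serialize one extracted table (a list of rows of cells) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table:
        # Empty cells arrive as None; write them as empty strings.
        writer.writerow(["" if cell is None else cell.strip() for cell in row])
    return buffer.getvalue()
```

Calling `table_to_csv(t)` for each `t` in the list returned by `extract_tables()` gives one CSV string per table.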

    from docx import Document

    # DOCX: print every paragraph
    doc = Document("document.docx")
    for para in doc.paragraphs:
        print(para.text)

    # DOCX: print every table row as a list of cell texts
    for table in doc.tables:
        for row in table.rows:
            print([cell.text for cell in row.cells])

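
The examples here only read DOCX files, although the skill also lists DOCX as writable. A minimal sketch of the writing side with python-docx might look like this; `write_docx` and its parameters are hypothetical names chosen for illustration.

```python
from typing import List


def write_docx(path: str, title: str, paragraphs: List[str]) -> None:
    """Create a .docx file with one heading followed by body paragraphs."""
    # python-docx is imported lazily so this module loads even without it installed.
    from docx import Document

    doc = Document()
    doc.add_heading(title, level=1)
    for text in paragraphs:
        doc.add_paragraph(text)
    doc.save(path)
```

A call such as `write_docx("summary.docx", "Summary", ["First point.", "Second point."])` would produce a document with one heading and two paragraphs.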

    from pptx import Presentation

    # PPTX: print the text of every shape that has a text frame
    prs = Presentation("presentation.pptx")
    for slide in prs.slides:
        for shape in slide.shapes:
            if shape.has_text_frame:
                print(shape.text_frame.text)

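After extracting the document content, use the SkillBoss API Hub for AI analysis (summarization, keyword extraction, and so on):

    import os

    import requests

    SKILLBOSS_API_KEY = os.environ["SKILLBOSS_API_KEY"]
    API_BASE = "https://api.heybossai.com/v1"

    def pilot(body: dict) -> dict:
        """POST a request body to the /pilot endpoint and return the decoded JSON."""
        r = requests.post(
            f"{API_BASE}/pilot",
            headers={"Authorization": f"Bearer {SKILLBOSS_API_KEY}", "Content-Type": "application/json"},
            json=body,
            timeout=60,
        )
        r.raise_for_status()  # surface HTTP errors instead of parsing an error body
        return r.json()
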
    # After extracting the document text, call the SkillBoss API Hub for a summary
    def analyze_document(doc_text: str) -> str:
        result = pilot({
            "type": "chat",
            "inputs": {
                "messages": [
                    {"role": "system", "content": "You are a document analysis assistant. Extract the key points and produce a structured summary."},
                    {"role": "user", "content": f"Please analyze the following document content:\n\n{doc_text}"}
                ]
            },
            "prefer": "balanced"
        })
        return result["result"]["choices"][0]["message"]["content"]

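
`analyze_document` indexes straight into the response, so any unexpected payload surfaces as a bare `KeyError` or `IndexError`. A defensive variant could wrap that lookup as sketched below. `extract_content` is a hypothetical helper, and the only response shape it relies on is the `result → choices → message → content` path used above.

```python
def extract_content(result: dict) -> str:
    """Return the assistant message text from a pilot() response, or raise a clear error."""
    try:
        return result["result"]["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        # Include the payload so the caller can see what the API actually returned.
        raise ValueError(f"Unexpected response shape: {result!r}") from exc
```

With this helper, the last line of `analyze_document` could become `return extract_content(result)`.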
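1. Identify the document type → choose the right tool
2. Read the content → extract text, tables, and images
3. Analyze the information → understand the structure and extract key points through the SkillBoss API Hub
4. Summarize and present → summarize for the user in Chinese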
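
The four steps above can be sketched as a small pipeline. This is an illustration under stated assumptions rather than the skill's actual implementation: `READERS`, `read_plain_text`, and `process_document` are hypothetical names, and only the built-in TXT and Markdown readers are wired up so the example runs without extra packages.

```python
from pathlib import Path
from typing import Callable, Dict


def read_plain_text(path: Path) -> str:
    """Reader for TXT and Markdown, which need only built-in file I/O."""
    return path.read_text(encoding="utf-8")


# Step 1 lookup: extension -> reader. PDF, DOCX, and PPTX readers
# would be registered here in the same way.
READERS: Dict[str, Callable[[Path], str]] = {
    ".txt": read_plain_text,
    ".md": read_plain_text,
}


def process_document(path: str, analyze: Callable[[str], str]) -> str:
    """Identify the type, read the content, analyze it, and return the summary."""
    file_path = Path(path)
    reader = READERS.get(file_path.suffix.lower())  # step 1: pick the tool
    if reader is None:
        raise ValueError(f"No reader registered for {file_path.suffix!r}")
    text = reader(file_path)    # step 2: read the content
    summary = analyze(text)     # step 3: analyze the information
    return summary              # step 4: hand the summary back for presentation
```

Passing `analyze_document` as the `analyze` argument would route step 3 through the SkillBoss API Hub; any other callable that maps text to a summary works too.
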

Environment variable:

    SKILLBOSS_API_KEY=<your_skillboss_api_key>

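
Reading the key with `os.environ["SKILLBOSS_API_KEY"]` raises a bare `KeyError` when the variable is missing. A friendlier check might look like the sketch below; `require_api_key` is a hypothetical helper, not part of the skill.

```python
import os


def require_api_key(name: str = "SKILLBOSS_API_KEY") -> str:
    """Return the API key from the environment, or fail with an actionable message."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before calling the API.")
    return key
```
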
When presenting a document to the user: