Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

智能文档处理Skill (Smart Document Processing Skill)

v1.0.1

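Smart document processing based on DeepSeek v4, supporting multi-format parsing, information extraction, content analysis, and format conversion, with a 20x efficiency gain and 99% accuracy.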


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for yezhaowang888-stack/huimai-smart-document-processing.

Install the skill "智能文档处理Skill" (yezhaowang888-stack/huimai-smart-document-processing) from ClawHub.
Skill page: https://clawhub.ai/yezhaowang888-stack/huimai-smart-document-processing
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install huimai-smart-document-processing

ClawHub CLI


npx clawhub@latest install huimai-smart-document-processing
Security Scan
Capability signals
Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name/description and SKILL.md repeatedly claim DeepSeek v4 integration, AI service providers, OCR/NLP capabilities, and 99% accuracy. However the included index.js is a simplified local stub that returns hard-coded/sample text, simulated analysis, and no calls to DeepSeek/OpenAI/Anthropic or any OCR libraries. package.json lists no dependencies while SKILL.md lists libraries (pdf-parse, mammoth, xlsx, natural) that are not present. This discrepancy indicates the implementation does not match the stated purpose.
Instruction Scope
SKILL.md instructs creating a config file containing an aiService.apiKey and shows usage that assumes real parsing and AI-backed analysis. The runtime instructions encourage cloning an external repository (gitee) and installing via npm/clawhub, but the included code will not use provided API keys or perform real parsing. The instructions could lead users to supply secrets (API keys) and expect networked processing even though the shipped code is local/stubbed.
Install Mechanism
There is no formal install spec in the skill metadata; SKILL.md suggests installing via clawhub or npm and also points to an external Gitee repo. The shipped package is self-contained and has no dependencies. Cloning the Gitee repo is an explicit step in the docs and is higher-risk than using the included files, because it may fetch different or updated code that is not present in this package.
Credentials
The skill metadata declares no required env vars or primary credential, but SKILL.md expects an aiService.apiKey in the config for providers like openai|deepseek|anthropic. Requesting API keys in config is reasonable for an AI-backed processor, but the package does not declare or use env vars and the code does not consume keys — this mismatch can mislead users into supplying secrets to a skill that currently won't use them. Also SKILL.md lists dependencies not present in package.json, suggesting missing or incomplete dependency/credential handling.
Persistence & Privilege
The skill does not request elevated platform privileges. Flags show always: false and user-invocable: true. There is no install script or behavior that modifies other skills or system-wide configuration in the included files.
What to consider before installing
This package appears to be a marketing/placeholder implementation: the code included is a local stub that returns sample text and simulated analysis, but the README claims DeepSeek v4 integration, OCR, and 99% accuracy. Before installing or providing any API keys: (1) do not paste production API keys or secrets into the config until you confirm the code actually uses them; (2) ask the publisher for the authoritative source/repo or a link to the DeepSeek integration code; (3) if you plan to clone the external gitee repo, audit that repository's code and any install scripts before running them (look for network calls, downloads, or exec); (4) prefer packages with a clear homepage, verifiable repo, and declared dependencies; and (5) if you need real document parsing, test the package on non-sensitive sample documents to confirm behavior and data flows.
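Point (3) above recommends looking for network calls, downloads, or exec in any cloned code. A first pass could be automated along these lines. This is a minimal sketch: `findRiskyLines` and the pattern list are hypothetical, the patterns are deliberately loose, and a match is only a prompt for manual review, not evidence of malicious behavior.

```javascript
// Sketch: flag lines in JavaScript source that deserve a manual look.
// The pattern list is illustrative, not exhaustive; matches are only hints.
const RISK_PATTERNS = [
  { label: 'process execution', regex: /child_process|\bexec(Sync)?\s*\(|\bspawn(Sync)?\s*\(/ },
  { label: 'dynamic code', regex: /\beval\s*\(|new\s+Function\s*\(/ },
  { label: 'network access', regex: /\bfetch\s*\(|https?:\/\/|require\(['"]https?['"]\)/ },
];

function findRiskyLines(source) {
  const findings = [];
  source.split('\n').forEach((line, index) => {
    for (const { label, regex } of RISK_PATTERNS) {
      if (regex.test(line)) {
        // Line numbers are 1-based to match what an editor shows.
        findings.push({ line: index + 1, label, text: line.trim() });
      }
    }
  });
  return findings;
}

// Usage on an in-memory sample; real use would read each file's contents first.
const sample = [
  "const { execSync } = require('child_process');",
  'const total = 1 + 2;',
  "fetch('https://example.com/upload');",
].join('\n');
console.log(findRiskyLines(sample));
```

A scan like this cannot see obfuscated or dynamically built calls, so it complements reading the code rather than replacing it.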

Like a lobster shell, security has layers — review code before you run it.

Tags: ai, documents, huimai, latest
62 downloads
0 stars
2 versions
Updated 4d ago
v1.0.1
MIT-0

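Smart Document Processing Skill

🚀 Overview

A document processing framework built on the Huimai three-layer agent architecture. It provides basic multi-format document parsing and information extraction, and developers can extend it with AI analysis features.

🌟 Key Highlights

  • Huimai document collaboration practice: applies the Huimai three-layer agent architecture to the document processing workflow
  • Broad multi-format support: full coverage of PDF, Word, Excel, PPT, and other formats
  • Extensible design: reserves an AI service interface for plugging in document understanding, intelligent analysis, and similar capabilities
  • Three-layer architecture: the parsing agent, analysis agent, and output agent work together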
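The three-layer arrangement described above could be wired together roughly as follows. This is a minimal sketch, not the skill's actual code: the names `parseAgent`, `analyzeAgent`, `outputAgent`, and `runPipeline` are hypothetical, and the parsing step only handles plain text that is already in memory.

```javascript
// Hypothetical three-stage pipeline: parse -> analyze -> output.
// None of these names come from the skill; they only illustrate the layering.

// Layer 1: turn raw input into a normalized document object.
function parseAgent(rawText) {
  const text = rawText.replace(/\r\n/g, '\n').trim();
  return { text, lines: text === '' ? [] : text.split('\n') };
}

// Layer 2: derive simple statistics from the parsed document.
function analyzeAgent(doc) {
  const words = doc.text.split(/\s+/).filter(Boolean);
  return { ...doc, stats: { lineCount: doc.lines.length, wordCount: words.length } };
}

// Layer 3: serialize the statistics as JSON, optionally indented.
function outputAgent(analysis, { prettyPrint = false } = {}) {
  return JSON.stringify(analysis.stats, null, prettyPrint ? 2 : 0);
}

function runPipeline(rawText, outputOptions) {
  return outputAgent(analyzeAgent(parseAgent(rawText)), outputOptions);
}

console.log(runPipeline('first line\r\nsecond line here'));
// prints {"lineCount":2,"wordCount":5}
```

Keeping each layer a plain function that takes the previous layer's result makes it easy to swap one stage, for example replacing the analysis step with a call to an external AI service.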

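🏆 User Value

  • 20x processing efficiency: automates complex tasks such as document parsing and information extraction
  • Broad multi-format support: full coverage of PDF, Word, Excel, PPT, and other formats
  • Flexible extension: supports connecting external AI services to enhance document analysis
  • Three-layer architecture: the parsing agent, analysis agent, and output agent work together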

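Features

  • Document parsing: supports PDF, Word, Excel, PPT, TXT, and other formats
  • Information extraction: automatic key-information extraction, entity recognition, and data extraction
  • Content analysis: text analysis, sentiment analysis, and keyword extraction
  • Format conversion: conversion between document formats and standardization
  • Intelligent processing: automatic summarization, classification, and tag generation (requires connecting an AI service)
  • Batch processing: supports processing documents in batches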

Installation

# Install via ClawHub
clawhub install huimai-smart-document-processing

# Or install manually
npm install smart-document-processing

Configuration

Create the configuration file config/smart-document-processing.json

{
  "supportedFormats": ["pdf", "docx", "xlsx", "pptx", "txt", "md"],
  "processing": {
    "extractText": true,
    "extractTables": true,
    "extractImages": true,
    "detectLanguage": true,
    "summarize": true
  },
  "output": {
    "format": "json",
    "encoding": "utf-8",
    "prettyPrint": true
  },
  "aiService": {
    "provider": "openai|deepseek|anthropic",
    "apiKey": "YOUR_API_KEY_HERE"
  }
}
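Given the scan's advice not to paste production keys into the config until the code is confirmed to use them, one option is to keep the key out of the JSON file and read it from the environment. The sketch below is an assumption-laden illustration, not part of the skill: `resolveAiService` and the `SMART_DOC_API_KEY` variable name are hypothetical, and it treats the `provider` value in the sample as a choose-one placeholder.

```javascript
// Sketch: resolve AI service settings without storing a key in the JSON file.
// SMART_DOC_API_KEY is a hypothetical variable name, not one the skill defines.
const KNOWN_PROVIDERS = ['openai', 'deepseek', 'anthropic'];

function resolveAiService(config, env) {
  const ai = config.aiService || {};
  // Reject the unedited placeholder "openai|deepseek|anthropic" and typos.
  if (!KNOWN_PROVIDERS.includes(ai.provider)) {
    throw new Error(`aiService.provider must be one of: ${KNOWN_PROVIDERS.join(', ')}`);
  }
  const apiKey = env.SMART_DOC_API_KEY || '';
  if (apiKey === '') {
    throw new Error('SMART_DOC_API_KEY is not set');
  }
  return { provider: ai.provider, apiKey };
}

// Usage with an in-memory config and a fake environment object.
// In a real script the second argument would be process.env.
const service = resolveAiService(
  { aiService: { provider: 'deepseek' } },
  { SMART_DOC_API_KEY: 'test-key' }
);
console.log(service.provider); // prints deepseek
```

Passing the environment in as an argument keeps the function easy to test and avoids touching real credentials while evaluating the skill.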

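Usage

Basic Processing

const SmartDocumentProcessing = require('smart-document-processing');

const processor = new SmartDocumentProcessing({
  supportedFormats: ['pdf', 'docx', 'txt']
});

// `await` is not allowed at the top level of a CommonJS module,
// so the call is wrapped in an async function.
async function main() {
  // Process a document
  const result = await processor.processDocument('document.pdf', {
    extractText: true,
    extractTables: true,
    summarize: true
  });
  console.log(result);
}

main().catch(console.error);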

Document Parsing

// These snippets use `await`, so run them inside an async function.
// Parse a PDF document
const pdfResult = await processor.parsePDF('document.pdf', {
  extractPages: [1, 2, 3],
  extractMetadata: true
});

// Parse a Word document
const wordResult = await processor.parseWord('document.docx', {
  extractStyles: true,
  extractComments: true
});

// Parse an Excel document
const excelResult = await processor.parseExcel('data.xlsx', {
  sheetNames: ['Sheet1', 'Sheet2'],
  includeFormulas: false
});

Information Extraction

// Extract key information
const extractedInfo = await processor.extractInformation('contract.pdf', {
  entities: ['dates', 'names', 'amounts', 'companies'],
  // Patterns: contract number, signing date, validity period
  patterns: ['合同编号', '签订日期', '有效期']
});

// Extract table data
const tables = await processor.extractTables('report.docx', {
  format: 'json',
  includeHeaders: true
});

// Extract images
const images = await processor.extractImages('presentation.pptx', {
  format: 'base64',
  quality: 80
});
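For the `dates` and `amounts` entities requested above, a rule-based pass over already-extracted text might look like the sketch below. It is an illustration under stated assumptions, not the skill's `extractInformation` implementation: `extractEntities` and both patterns are hypothetical and cover only a few common formats.

```javascript
// Sketch: rule-based extraction of dates and amounts from plain text.
// The patterns are illustrative and cover only a few common formats.
const DATE_PATTERN = /\d{4}-\d{2}-\d{2}|\d{4}年\d{1,2}月\d{1,2}日/g;
const AMOUNT_PATTERN = /[¥$]\s?\d[\d,]*(?:\.\d+)?/g;

function extractEntities(text) {
  // String.prototype.match with a global regex returns all matches or null.
  return {
    dates: text.match(DATE_PATTERN) || [],
    amounts: text.match(AMOUNT_PATTERN) || [],
  };
}

// Usage on a made-up contract line mixing Chinese and ISO date formats.
const sampleText = '签订日期: 2026年4月22日, valid until 2027-04-21, total ¥12,500.00';
console.log(extractEntities(sampleText));
```

Regular expressions like these are brittle against unusual layouts, which is where a real named-entity model or an AI service would be expected to do better.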

Content Analysis

// Text analysis
const analysis = await processor.analyzeText('document.txt', {
  language: 'auto',
  sentiment: true,
  keywords: true,
  entities: true
});

// Automatic summary
const summary = await processor.summarize('long_document.pdf', {
  length: 'medium', // short, medium, long
  algorithm: 'extractive' // extractive, abstractive
});

// Document classification
const classification = await processor.classify('document.docx', {
  categories: ['contract', 'report', 'proposal', 'manual']
});
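The `algorithm: 'extractive'` option above implies selecting existing sentences rather than generating new text. One textbook way to do that is to score sentences by word frequency, sketched below. This is not the skill's `summarize` implementation: `tokenize` and `summarizeExtractive` are hypothetical names, and the tokenizer only handles English letters.

```javascript
// Sketch: frequency-based extractive summary for English text.
// Each sentence is scored by the summed document frequency of its words.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

function summarizeExtractive(text, maxSentences = 1) {
  // Split after sentence-ending punctuation followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/).filter(Boolean);
  const frequency = new Map();
  for (const word of tokenize(text)) {
    frequency.set(word, (frequency.get(word) || 0) + 1);
  }
  const scored = sentences.map((sentence, index) => ({
    sentence,
    index,
    score: tokenize(sentence).reduce((sum, w) => sum + frequency.get(w), 0),
  }));
  return scored
    .sort((a, b) => b.score - a.score || a.index - b.index) // best first
    .slice(0, maxSentences)
    .sort((a, b) => a.index - b.index) // restore document order
    .map((item) => item.sentence)
    .join(' ');
}

const article = 'Contracts need review. The review of contracts takes time. Done.';
console.log(summarizeExtractive(article));
// prints The review of contracts takes time.
```

Summing raw frequencies favors long sentences; dividing by sentence length or ignoring stop words are common adjustments.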

Format Conversion

// PDF to Word
await processor.convertFormat('document.pdf', 'docx', {
  preserveLayout: true,
  includeImages: true
});

// Word to PDF
await processor.convertFormat('document.docx', 'pdf', {
  quality: 'high',
  security: {
    password: 'optional',
    permissions: ['print', 'copy']
  }
});

// Batch conversion
await processor.batchConvert(['doc1.pdf', 'doc2.docx'], 'txt', {
  outputDir: './converted',
  overwrite: true
});
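When a batch like the one above contains a file that cannot be converted, it is usually preferable to finish the rest and report the failure. The sketch below shows one way to do that with `Promise.allSettled`. It does not describe how `batchConvert` behaves: `processBatch` is a hypothetical helper, and the `task` callback stands in for whatever per-file call is used.

```javascript
// Sketch: run a per-file async task over a batch and separate successes
// from failures instead of aborting on the first error.
async function processBatch(files, task) {
  const settled = await Promise.allSettled(files.map((file) => task(file)));
  const succeeded = [];
  const failed = [];
  settled.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      succeeded.push({ file: files[i], value: result.value });
    } else {
      const reason = result.reason instanceof Error ? result.reason.message : String(result.reason);
      failed.push({ file: files[i], reason });
    }
  });
  return { succeeded, failed };
}

// Usage with a fake task that renames the extension and rejects one file.
async function fakeConvert(file) {
  if (file.startsWith('bad')) throw new Error('unsupported');
  return file.replace(/\.\w+$/, '.txt');
}

processBatch(['doc1.pdf', 'bad.docx'], fakeConvert).then((report) => {
  console.log(report.succeeded, report.failed);
});
```

All tasks start at once here; for large batches a concurrency limit would be a sensible addition.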

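Using in OpenClaw

@agent Parse this PDF document
@agent Extract the key information from the contract
@agent Generate a summary of this document
@agent Convert the Word document to PDF
@agent Analyze the sentiment of the document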

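API Reference

Constructor

new SmartDocumentProcessing(config)

Parameters:

  • config.supportedFormats (array): supported document formats
  • config.processing (object): processing configuration
  • config.output (object): output configuration
  • config.aiService (object, optional): AI service configuration (used for intelligent analysis)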

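Core Methods

processDocument(filePath, options)

Processes a document, running multiple processing tasks according to the options.

parsePDF(filePath, options)

Parses a PDF document.

parseWord(filePath, options)

Parses a Word document.

parseExcel(filePath, options)

Parses an Excel document.

extractInformation(filePath, options)

Extracts key information from a document.

extractTables(filePath, options)

Extracts table data.

analyzeText(filePath, options)

Analyzes text content.

summarize(filePath, options)

Generates a document summary.

classify(filePath, options)

Classifies a document.

convertFormat(inputPath, outputFormat, options)

Converts a document to another format.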

Supported Formats

Input Formats

  • PDF (.pdf)
  • Word (.docx, .doc)
  • Excel (.xlsx, .xls)
  • PowerPoint (.pptx, .ppt)
  • Plain text (.txt, .md)
  • HTML (.html, .htm)
  • Images (.png, .jpg, .jpeg)

Output Formats

  • JSON
  • XML
  • CSV
  • Markdown
  • Plain text
  • HTML

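Processing Capabilities

Text Processing

  • Character encoding detection and conversion
  • Language detection
  • Text cleaning and normalization
  • Paragraph and sentence segmentation

Information Extraction

  • Named entity recognition
  • Date and time extraction
  • Number and amount extraction
  • Contact information extraction
  • Address extraction

Content Analysis

  • Sentiment analysis
  • Keyword extraction
  • Topic modeling
  • Readability analysis
  • Plagiarism detection

Format Handling

  • Document merging
  • Page splitting
  • Watermarking
  • Encryption and decryption
  • Compression and decompression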

Dependencies

  • pdf-parse: ^1.1.1
  • mammoth: ^1.6.0
  • xlsx: ^0.18.0
  • natural: ^6.0.0
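The security scan notes that package.json declares none of the libraries listed above. A quick way to confirm that for any copy of the package is to compare the documented names against the manifest. This is a minimal sketch: `findUndeclared` is a hypothetical helper, and the manifest object shown is an example input, not the skill's actual package.json.

```javascript
// Sketch: report documented dependencies that a package.json does not declare.
function findUndeclared(documented, packageJson) {
  const declared = new Set([
    ...Object.keys(packageJson.dependencies || {}),
    ...Object.keys(packageJson.optionalDependencies || {}),
  ]);
  return documented.filter((name) => !declared.has(name));
}

const documentedDeps = ['pdf-parse', 'mammoth', 'xlsx', 'natural'];

// Example input: a manifest with no dependencies at all.
// In a real check the object would come from JSON.parse of the package.json file.
console.log(findUndeclared(documentedDeps, { name: 'example', version: '1.0.0' }));
// prints [ 'pdf-parse', 'mammoth', 'xlsx', 'natural' ]
```

An empty result means every documented library is declared; it says nothing about whether the code actually calls them.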

Development

# Clone the repository
git clone https://gitee.com/yezhaowang888/huimai-skills.git

# Install dependencies
npm install

# Run tests
npm test

# Start the development server
npm run dev

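Contributing

Issues and pull requests are welcome.

License

MIT License

Version History

  • v1.0.1 (2026-04-24): Corrected the description to emphasize the skill's positioning as a framework
  • v1.0.0 (2026-04-22): Initial release with basic document processing features

Support

If you run into problems, open an issue or contact the maintainers.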
