Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Contract Diff

v1.0.0

Compare contract templates with scanned stamped contracts, list all differences (additions, deletions, modifications). Output as Word document for easy downl...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for russell-yu/contract-diff.

Prompt preview (Install & Setup):
Install the skill "Contract Diff" (russell-yu/contract-diff) from ClawHub.
Skill page: https://clawhub.ai/russell-yu/contract-diff
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install contract-diff

ClawHub CLI


npx clawhub@latest install contract-diff
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
Name/description ask for template vs scanned-contract comparison with OCR and a Word report; the included scripts use python-docx, PyMuPDF, Pillow, pytesseract and difflib which are exactly the tools you would expect for this task.
Instruction Scope
SKILL.md stays on-purpose (text extraction, OCR, diff, highlighted images, Word report). However, there are inconsistencies. list_files.py contains a hard-coded absolute path (C:\Users\yangy\.openclaw\workspace\contract-diff\input) and calls shutil.copy with the targets 'template.docx' and 'scanned.pdf', which can overwrite existing files. SKILL.md states 'Redaction: sensitive information is replaced with ***' (original: '脱敏处理: 敏感信息用 *** 代替'), but I found no implementation of systematic redaction in the scripts, and the reports in the output folder contain full contract text. The scripts also run pip installs at runtime (see Install Mechanism), which expands the runtime scope beyond what SKILL.md describes.
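The overwrite risk described above can be avoided with an exclusive-create copy. The following is a minimal sketch, not the skill's code: the helper name `copy_no_overwrite` is hypothetical, and it assumes the caller passes the input directory instead of relying on a hard-coded path.

```python
import shutil
from pathlib import Path


def copy_no_overwrite(src: Path, dest_dir: Path, name: str) -> Path:
    """Copy src into dest_dir as name, refusing to replace an existing file."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / name
    # Mode "xb" raises FileExistsError if dest already exists,
    # whereas shutil.copy would silently overwrite it.
    with open(src, "rb") as fin, open(dest, "xb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

A second call with the same target name raises FileExistsError, so an existing template.docx or scanned.pdf is never replaced without the user noticing.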
Install Mechanism
No formal install spec is provided, but compare.py includes a try_import helper that calls os.system('pip install <pkg> -q') to install missing Python packages at runtime. That means installing packages from PyPI when the script runs (network activity, arbitrary package install side-effects). This is riskier than an instruction-only skill that expects preinstalled dependencies. The script does not download code from arbitrary URLs, but auto-installing packages without user confirmation is a notable concern.
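One alternative to installing at runtime is to detect missing packages and stop with a message. This is a hedged sketch, not the skill's try_import helper: the function names are hypothetical, and the import-name mapping is an assumption based on the dependency list in SKILL.md.

```python
import importlib.util

# Import name -> PyPI package name (assumed from the SKILL.md dependency list).
REQUIRED = {
    "docx": "python-docx",
    "fitz": "PyMuPDF",
    "PIL": "Pillow",
    "pytesseract": "pytesseract",
}


def missing_packages(required: dict[str, str]) -> list[str]:
    """Return the PyPI names of packages whose import name cannot be found."""
    return [pkg for mod, pkg in required.items()
            if importlib.util.find_spec(mod) is None]


def check_dependencies(required: dict[str, str] = REQUIRED) -> None:
    """Raise instead of installing anything on the user's behalf."""
    missing = missing_packages(required)
    if missing:
        raise RuntimeError(
            "Missing packages; install them manually: pip install "
            + " ".join(missing))
```

Calling `check_dependencies()` at the top of a script keeps network access and package installation an explicit user decision.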
Credentials
The skill declares no required environment variables or credentials, which is proportional. It does attempt to set pytesseract.pytesseract.tesseract_cmd to a Windows path if present (TESSERACT_PATH = 'C:\Program Files\Tesseract-OCR\tesseract.exe') — that is reasonable but platform-specific. It also requires a system-level Tesseract binary (documented in SKILL.md). No secrets or unrelated credentials are requested.
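A platform-neutral lookup could check PATH first and fall back to the Windows default only if it exists. The sketch below is an assumption about how that might look, with a hypothetical `find_tesseract` helper; only the Windows path comes from the scan text.

```python
import shutil
from pathlib import Path
from typing import Optional, Sequence

# The Windows default quoted in the scan; any other fallback is an assumption.
FALLBACKS = (r"C:\Program Files\Tesseract-OCR\tesseract.exe",)


def find_tesseract(fallbacks: Sequence[str] = FALLBACKS) -> Optional[str]:
    """Prefer the tesseract binary on PATH, then any fallback that exists."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for candidate in fallbacks:
        if Path(candidate).is_file():
            return candidate
    return None
```

If a path is found, it can be assigned to `pytesseract.pytesseract.tesseract_cmd`; if the function returns None, the script can report that Tesseract is not installed.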
Persistence & Privilege
The skill is not always-enabled and does not request elevated privileges. It does write/copy files in an 'input' directory (and could overwrite files via shutil.copy in list_files.py). It does not modify other skills or system-wide configurations. Running the scripts will modify local files (create report.docx, highlighted images, and the script's own copied files).
What to consider before installing
The skill appears to implement the advertised functionality (OCR + diff + Word report), but consider the following before running it:

  • Inspect and/or remove list_files.py or any hard-coded paths. list_files.py references an absolute Windows path (C:\Users\yangy\...) and copies files, so it can overwrite files if run in your environment.
  • The compare script attempts to auto-install Python packages using pip (os.system('pip install ...')). If you run it, it performs network installs from PyPI. Run it in a controlled environment (virtualenv/container) or manually install the listed dependencies instead.
  • SKILL.md claims sensitive-data redaction ("脱敏处理"), but the included scripts do not perform automated redaction, and the reports in the package include full contract text. Do not use this on real sensitive contracts until you confirm or implement redaction.
  • The scripts require the Tesseract OCR binary. Install it from an official source and verify the PATH configuration.
  • Because the skill writes files and can install packages, run it in a sandbox or isolated environment and back up any data you care about first.

To proceed safely: review and clean the code (remove or fix list_files.py), pre-install dependencies in an isolated venv, validate that reports redact sensitive fields if needed, and test on non-sensitive sample documents. If you want me to, I can point to the exact lines to change or remove, or produce a safer invocation plan (commands to run in a virtualenv).


latest: vk971wh8s3cr95a3amg2v4dbfxd849mak
84 downloads · 0 stars · 1 version
Updated 3w ago
v1.0.0 · MIT-0

contract-diff

Compare contract templates (Word/PDF) with scanned stamped contracts (PDF/images), list ALL differences, and generate a highlighted visualization showing where changes are.

When to Use

  • User uploads a contract template AND a scanned signed contract
  • User wants to know EVERY difference between template and signed version
  • User needs detailed report showing additions, deletions, and modifications
  • User needs visual highlighting of modified areas in the scanned contract

Workflow

Step 1: Extract Text from Both Files

For contract template (.docx): Use python-docx to extract all text.

For contract template (.pdf): Use PyMuPDF (fitz) to extract text.

For scanned contract (PDF or image): Use OCR with pytesseract to extract text with bounding boxes.
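The three extraction paths above might be sketched as follows. This is an illustration under assumptions, not the code in scripts/compare.py: the function names are hypothetical, third-party imports are deferred into each function, and text inside .docx tables is not covered.

```python
def extract_docx_text(path: str) -> str:
    """Join the text of all body paragraphs in a .docx file."""
    from docx import Document  # python-docx
    return "\n".join(p.text for p in Document(path).paragraphs)


def extract_pdf_text(path: str) -> str:
    """Join the embedded text of every page in a PDF."""
    import fitz  # PyMuPDF
    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def words_from_ocr_data(data: dict) -> list[dict]:
    """Convert pytesseract's column-oriented dict into per-word boxes."""
    words = []
    for i, text in enumerate(data["text"]):
        if text.strip():  # block- and line-level rows carry empty text
            words.append({
                "text": text,
                "left": data["left"][i],
                "top": data["top"][i],
                "width": data["width"][i],
                "height": data["height"][i],
            })
    return words


def ocr_words(image) -> list[dict]:
    """OCR a PIL image and return one dict per recognized word."""
    import pytesseract
    data = pytesseract.image_to_data(
        image, output_type=pytesseract.Output.DICT)
    return words_from_ocr_data(data)
```

A scanned PDF would first need each page rendered to an image before calling `ocr_words`; that step is omitted here.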

Step 2: Detailed Comparison

Split text into sentences/paragraphs and categorize:

  1. Only in template - Content that was deleted
  2. Only in scanned - Content that was added
  3. Similar but different - Modified content (with similarity ratio)

Using difflib.SequenceMatcher with threshold:

  • > 85% similarity: treated as same
  • 50-85% similarity: marked as modified
  • < 50% similarity: marked as added/deleted
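The thresholds above could be applied as in the sketch below. It is one possible reading, not the skill's implementation: the function name and the best-match pairing strategy are assumptions, and each segment is assumed to be a sentence or paragraph that has already been split out.

```python
from difflib import SequenceMatcher

SAME = 0.85      # above this ratio: treated as the same
MODIFIED = 0.50  # from this ratio up to SAME: marked as modified


def compare_segments(template: list[str], scanned: list[str]) -> dict:
    """Classify segments as deleted, added, or modified by best-match ratio."""
    result = {"deleted": [], "added": [], "modified": []}
    matched = set()  # scanned indices paired with some template segment
    for t in template:
        best_ratio, best_idx = 0.0, None
        for i, s in enumerate(scanned):
            ratio = SequenceMatcher(None, t, s).ratio()
            if ratio > best_ratio:
                best_ratio, best_idx = ratio, i
        if best_ratio > SAME:
            matched.add(best_idx)
        elif best_ratio >= MODIFIED:
            matched.add(best_idx)
            result["modified"].append(
                (t, scanned[best_idx], round(best_ratio, 2)))
        else:
            result["deleted"].append(t)
    result["added"] = [s for i, s in enumerate(scanned) if i not in matched]
    return result
```

With character-level ratios, a one-digit change in an otherwise identical sentence (for example 30 days versus 60 days) scores above 0.85 and would be treated as the same under these thresholds, so small numeric edits may need a separate check.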

Step 3: Generate Highlighted Image

For modified content:

  • Find text position in OCR results
  • Draw colored highlight box:
    • 🟡 Yellow = Modified content
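Locating a modified phrase in the OCR output and outlining it might look like the following sketch. The helper names are hypothetical, the word dictionaries are assumed to carry `text`, `left`, `top`, `width`, and `height` keys, and matching is exact on whitespace-separated tokens, which real OCR noise may defeat.

```python
def find_phrase_box(words: list[dict], phrase: str):
    """Return (x0, y0, x1, y1) around the first run of words matching phrase."""
    tokens = phrase.split()
    if not tokens:
        return None
    texts = [w["text"] for w in words]
    for start in range(len(texts) - len(tokens) + 1):
        if texts[start:start + len(tokens)] == tokens:
            run = words[start:start + len(tokens)]
            x0 = min(w["left"] for w in run)
            y0 = min(w["top"] for w in run)
            x1 = max(w["left"] + w["width"] for w in run)
            y1 = max(w["top"] + w["height"] for w in run)
            return (x0, y0, x1, y1)
    return None


def draw_highlight(image, box, color=(255, 215, 0)):
    """Outline box in yellow on an RGB PIL image (modified in place)."""
    from PIL import ImageDraw  # Pillow
    ImageDraw.Draw(image).rectangle(box, outline=color, width=3)
    return image
```

The union box assumes the phrase sits on one line; a phrase that wraps across lines would produce a box spanning both lines.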

Step 4: Generate Detailed Report

Output format:

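# Contract Comparison Detailed Report

## 📋 File Information
- **Template file**: [filename]
- **Stamped contract**: [filename]

## 📊 Comparison Summary
- **Risk level**: 🟢 Low / 🟡 Medium / 🔴 High
- 🔴 Deleted content: X items
- 🟢 Added content: X items
- 🟡 Modified content: X items

## 🔴 Deleted Content (template → stamped contract)
1. [content...]
2. [content...]

## 🟢 Added Content (template → stamped contract)
1. [content...]
2. [content...]

## 🟡 Modified Content Comparison
| Template content | Scanned content | Similarity |
|------------------|-----------------|------------|
| ... | ... | 0.xx |

---
*⚠️ Note: Comparison results are based on OCR text recognition and may contain errors.*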
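Writing this layout to a Word file with python-docx might be sketched as below. The function names, the shape of `result`, and the heading text are assumptions for illustration; the risk level is left out because the source does not state how it is derived.

```python
def summary_lines(result: dict) -> list[str]:
    """Build the count lines for the summary section."""
    return [
        f"Deleted content: {len(result['deleted'])}",
        f"Added content: {len(result['added'])}",
        f"Modified content: {len(result['modified'])}",
    ]


def write_report(result: dict, template_name: str,
                 scanned_name: str, out_path: str) -> None:
    """Write the comparison result to a .docx file."""
    from docx import Document  # python-docx
    doc = Document()
    doc.add_heading("Contract Comparison Detailed Report", level=1)
    doc.add_paragraph(f"Template file: {template_name}")
    doc.add_paragraph(f"Stamped contract: {scanned_name}")
    for line in summary_lines(result):
        doc.add_paragraph(line, style="List Bullet")
    for title, key in (("Deleted content", "deleted"),
                       ("Added content", "added")):
        doc.add_heading(title, level=2)
        for item in result[key]:
            doc.add_paragraph(item, style="List Number")
    doc.add_heading("Modified content", level=2)
    table = doc.add_table(rows=1, cols=3)
    header = ("Template", "Scanned", "Similarity")
    for cell, label in zip(table.rows[0].cells, header):
        cell.text = label
    # Each modified entry is assumed to be (template_text, scanned_text, ratio).
    for old, new, ratio in result["modified"]:
        cells = table.add_row().cells
        cells[0].text, cells[1].text, cells[2].text = old, new, f"{ratio:.2f}"
    doc.save(out_path)
```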

Usage

# Install dependencies
pip install python-docx PyMuPDF pillow pytesseract

# Run the comparison (outputs a Word document)
python scripts/compare.py contract_template.docx signed_contract.pdf

# Specify the output file
python scripts/compare.py template.pdf scan.pdf -o report.docx
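A command-line interface with this shape could be declared with argparse as in the sketch below. It is not taken from compare.py: the `--output` long form and the `report.docx` default are assumptions, since the source shows only the `-o` flag.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Two positional files plus an optional output path."""
    parser = argparse.ArgumentParser(
        description="Compare a contract template with a scanned contract.")
    parser.add_argument("template", help="template file (.docx or .pdf)")
    parser.add_argument("scanned", help="scanned contract (PDF or image)")
    # Long form and default value are assumptions, not confirmed by the source.
    parser.add_argument("-o", "--output", default="report.docx",
                        help="path of the Word report to write")
    return parser
```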

Dependencies

Required Python packages:

  • python-docx - for .docx files
  • PyMuPDF (fitz) - for PDF text extraction
  • Pillow - image processing
  • pytesseract - OCR
  • Tesseract-OCR binary (system-level installation required)

Important Notes

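  1. OCR accuracy: OCR of scanned documents may contain errors, especially for handwritten or blurry text
  2. Highlight precision: highlights depend on the coordinates returned by OCR and may be slightly offset
  3. Detailed comparison: the new algorithm lists all differences, including additions, deletions, and modifications
  4. Redaction: sensitive information is replaced with ***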
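A redaction pass of the kind described in note 4 might be sketched as follows. The patterns are illustrative assumptions about what counts as sensitive (email addresses and long digit runs); they are not rules taken from the skill, and real contracts would need patterns for names, addresses, and amounts.

```python
import re

# Illustrative patterns only; what is sensitive depends on the contract.
PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),  # email addresses
    re.compile(r"\d{6,}"),  # long digit runs such as IDs or account numbers
]


def redact(text: str, mask: str = "***") -> str:
    """Replace every match of the sensitive patterns with the mask."""
    for pattern in PATTERNS:
        text = pattern.sub(mask, text)
    return text
```

Short numbers such as "30 days" are left untouched because the digit pattern requires at least six consecutive digits.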

Output Files

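| File | Description |
|------|-------------|
| report.docx | Detailed comparison report in Word format (includes all differences; ready to download) |
| highlighted.png | Image with highlight annotations (optional) |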

Windows Setup

  1. Install Python 3.12+
  2. Install Tesseract OCR: winget install tesseract-ocr.tesseract
  3. Install Python packages:
    pip install python-docx PyMuPDF pillow pytesseract
    

Example

# Compare two contract files, output as Word document
python compare.py "合同模板.docx" "盖章合同.pdf" -o "详细比对报告.docx"

Output includes:

  • All content only in template (deletions)
  • All content only in scanned (additions)
  • All similar but modified content with similarity scores
