AutoDimension Report Skill En

Automation

Process PDF, DOCX, XLSX from supply chain document packages — conversion, image extraction, OCR, dimension verification, and review report generation. Invoked when users mention part dimension inspection, dimension report review, supplier document review, seal/signature check, or image extraction.

Install

openclaw skills install auto-dimension-report-skill-en

Part Dimension Inspection

A general-purpose Agent Skill for supply chain document review, dimension inspection report verification, and seal/signature checks. Core pipeline: Document sorting -> Image extraction -> OCR recognition -> Verification -> Review report output

TL;DR

Quickly select execution path based on user goal:

User wants	Minimum steps	Main outputs
Sort document package, extract images	Step 1	`output/`, `image/`, `_ImageIndex.xlsx`
Check seals, signatures, scanned text	Step 1 -> Step 2	`imagetomd/`, OCR keyword clues
Verify dimension correctness	Step 1 -> Step 3	Judgement consistency check results
Generate full review conclusion	Step 1, optionally Step 2/3, then Step 4	`ReviewReport.md`

Quick judgment:

Seeing PDF / DOCX / XLSX document package -> prioritize Step 1
Seeing seal / signature / scanned document / OCR -> add Step 2
Seeing OK / NG / Pass / Fail / tolerance -> add Step 3
Seeing summary report / final review conclusion -> execute Step 4 last

What Problem Does This Skill Solve

When a user provides a document package containing PDF, DOCX, XLSX files, this skill organizes the scattered files into a reviewable structure and provides directly usable intermediate results and reports for manual review.

Common suitable tasks:

Supplier document review, quality document review, supply chain file verification
Dimension inspection reports, full dimension reports, measurement reports, APQP inspection checklist review
Consistency check between actual measured values and judgement in Excel/Word/PDF
Image extraction, scanned document OCR, seal/stamp/signature keyword check
Convert raw material into a three-layer structure (output/, image/, imagetomd/) for traceability

When to Invoke

Use this skill when the user shows the following intent:

Explicitly mentions "part dimension inspection", "dimension inspection report", "full dimension report", "measurement report"
Wants to review a supplier-submitted PDF/Word/Excel document package
Wants to batch extract images, seal pages, or scanned pages from documents
Wants to run OCR on images or scanned documents
Wants to check if "OK / NG / Pass / Fail" judgements are consistent with actual measurements
Wants to generate a summary review report

Typical trigger phrases:

dimension inspection report
part dimension inspection
supply chain review
supplier document review
quality document review
table data review
judgement consistency check
seal review
stamp review
signature review
PDF to DOCX
extract images

Quick Decision

If unsure how to start, follow this order:

Does the user provide a task folder or document package directory?
Does the package contain .pdf, .docx, .xlsx, .xlsm files?
Is the user's goal more toward "document sorting", "OCR recognition", "judgement verification", or "report generation"?
Does the current environment have Python dependencies and usable OCR capability?

Decision rules:

Has document package directory but user's goal is unclear -> first ask if OCR, judgement verification, and final report are needed
Single document only -> can still process, but warn user that completeness may be limited
No directory path -> ask user for a valid path first; do not fabricate execution results
No OCR environment -> can still execute Step 1; when seal/scanned document review is involved, clearly note capability limitations

Cases Not Suitable for Direct Processing

Do not promise completion in these cases; explain boundaries first:

User wants CAD / 3D model geometry analysis — this skill does not handle CAD native structures
User only provides screenshot fragments but expects full cross-file traceability — need the original document package first
Current environment has no Python dependencies or OCR capability — can only do document sorting, not full OCR review
User wants to fully replace manual seal authenticity verification — this skill can only assist with text clues and position, not legal authenticity determination

Input & Output

Input

A task folder
Folder may contain .pdf, .docx, .xlsx, .xlsm
For OCR, the environment needs an accessible Herdsman OCR capability

Output

After task completion, the following structure is typically created:

Task Folder/
├── Original files
├── output/
│   ├── *.pdf.docx
│   ├── *.docx
│   ├── *.xlsx
│   └── _ImageIndex.xlsx
├── image/
│   └── Extracted images saved in subdirectories by source file
├── imagetomd/
│   └── OCR Markdown generated per image
└── ReviewReport.md

Core Capabilities

1. Document Conversion & Sorting

Convert PDF to reviewable DOCX
Preserve text, tables, images and their relative order
Copy DOCX / XLSX / XLSM directly to output/ for traceability

2. Dual Image Indexing

When converting PDF to DOCX, image positions retain two types of information:

Method	Purpose
Embedded image	For manual review directly in DOCX
Path reference `[Image Reference] image/...`	For agent or manual backtracking to original image

3. Image Extraction

Extract embedded images from PDF, DOCX, XLSX/XLSM and save to image/{source file name}/.

File Type	Image Naming Convention
PDF	`source.pdf-p{page}-img{seq}.{ext}`
DOCX	`source.docx-{seq}.{ext}`
XLSX/XLSM	`source.xlsx-{seq}.{ext}`

4. OCR Text Recognition

Iterate through images in image/
Default: use Herdsman HTTP API for recognition; fall back to OCR script on failure
Write results to corresponding Markdown files under imagetomd/
Supports specifying OCR skill directory via HERDSMAN_SKILL_DIR or scripts/config.json
Supports batch processing, resume capability, per-image timing & ETA output

5. Judgement Consistency Verification

For common dimension inspection tables, the script checks:

Whether measured values fall within tolerance range
Whether the judgement column uses standard terminology
High-risk cases where values are out-of-tolerance but marked as OK / Pass

6. Review Report Generation

Aggregate data from output/, image/, imagetomd/ to generate a report for manual review, with additional OCR text scanning for:

seal / stamp
signature

Standard Workflow

Execute in the following order by default. Do not skip steps unless the user explicitly requests only partial results.

Step 0: Confirm Task Scope

First confirm the following:

Task folder path
Does the user want "document sorting only" or "full review"
Is OCR needed?
Is judgement consistency check needed?
Is a final review report needed?

If the user has not been clear, ask at least once — do not make assumptions.

Step 1: Execute Document Sorting & Image Extraction

Main entry script:

uv run python "<skill-dir>/scripts/task_convert_extract.py" --dir "<task-folder>"

This step will:

Process PDF -> DOCX
Extract images to image/
Copy preservable DOCX / XLSX / XLSM to output/
Generate output/_ImageIndex.xlsx

Step 2: If OCR is needed, process image recognition

python "<skill-dir>/scripts/image_to_markdown.py" --dir "<task-folder>"

Optional parameters:

python "<skill-dir>/scripts/image_to_markdown.py" --dir "<task-folder>" --force
python "<skill-dir>/scripts/image_to_markdown.py" --dir "<task-folder>" --model "paddleocr-ppocrv5-server"
python "<skill-dir>/scripts/image_to_markdown.py" --dir "<task-folder>" --batch-size 20

This step is mandatory only in these scenarios:

Scanned document review
Seal / stamp / signature check
Need to extract text clues from images

Step 3: If table judgement verification is needed, extract and validate data

python "<skill-dir>/scripts/extract_verify_data.py" --dir "<task-folder>"

When template column positions vary, supplement with these parameters:

python "<skill-dir>/scripts/extract_verify_data.py" --dir "<task-folder>" --seq-col 2 --item-col 3 --std-col 5 --method-col 7 --data-start 9 --data-cols 5 --judge-col 14 --data-start-row 24

Enhancements:

Auto-detect column positions from headers
Auto-identify number of data columns
Support √, ○ as pass judgement
Support tolerance formats like 43°±5°, Φ6-0.05, ≥5.4MPa, 13（+0.2/0）

Applicable to:

XLSX inspection tables
DOCX measurement reports with dimension tables
Tasks requiring tolerance vs judgement consistency checks

Step 4: Generate Review Report

python "<skill-dir>/scripts/generate_report.py" --dir "<task-folder>"

Optional output formats:

python "<skill-dir>/scripts/generate_report.py" --dir "<task-folder>" --format md
python "<skill-dir>/scripts/generate_report.py" --dir "<task-folder>" --format json
python "<skill-dir>/scripts/generate_report.py" --dir "<task-folder>" --format summary

Key output includes:

File inventory overview
Table and document content summary
Image index availability
OCR hit keyword locations
Deep checks: tight-limit, precision inconsistency, bias analysis, OK-only filling
Missing field warnings: part number, responsible person, full supplier name, vehicle model
Anomaly items requiring manual re-check

Step 5: Summarize Results to User

The final response should at least cover:

Which source files were processed
Which output directories and files were generated
Whether OCR was executed
Whether high-risk judgement inconsistencies were found
Whether seal/signature keywords were hit
Suggestions for what to manually review next

Typical User Requests

The following phrases should typically trigger this skill:

"Review this supplier document package for dimension judgement issues."
"Convert all PDFs in this task folder to reviewable versions and extract images."
"Check this batch of reports for stamp or signature clues."
"Check if the OK/NG judgements in the Excel match the measured values."
"Once done, produce a review report."

Recommended response strategy:

First confirm the task directory
Clarify whether OCR is needed
Clarify whether judgement consistency check is needed
Explain which directories and files will be output

Review Rules

Rule 1: Judgement Terminology Must Be Strictly Consistent

Standard pass terms:

OK
合格 (Pass)
√
○

Standard fail terms:

NOK
不合格 (Fail)
NG

The following are considered non-standard pass terms and require user notification:

PASS
正确 (Correct)
通过 (Pass)

Rule 2: Out-of-Tolerance but Judged Pass — High Risk

If measured values are outside tolerance range but the judgement column still reads:

OK
合格 (Pass)

It must be marked as a high-risk item with explicit warning in the summary.

Rule 3: Seal/Signature — Clue Only

When OCR hits the following keywords, only indicate "relevant text clues found" — do not draw conclusions:

印章 (seal)
公章 (official stamp)
签名 (signature)

Rule 4: Keep Original Files Traceable

Do not overwrite original files. All output should be placed in:

output/
image/
imagetomd/

OCR Configuration

Configuration file: scripts/config.json

{
  "base_url": "http://127.0.0.1:8080",
  "ocr_model": "paddleocr-ppocrv5-server",
  "ocr_transport": "auto",
  "request_timeout": 120,
  "retry_count": 2,
  "retry_delay": 5,
  "ocr_script_dir": ""
}

Priority (high to low):

CLI argument --model
Environment variables
config.json
Script defaults

Relevant environment variables:

HERDSMAN_BASE_URL: Override OCR service address
HERDSMAN_SKILL_DIR: Explicitly specify OCR skill root directory
HERDSMAN_OCR_TRANSPORT: Override OCR transport method (auto / http / script)

Failure & Fallback Handling

`image/` not found

Means the user has not executed Step 1 yet
Run task_convert_extract.py first

OCR script not found

Notify user that the current environment lacks Herdsman OCR capability
List searched directories
Have user set HERDSMAN_SKILL_DIR or update scripts/config.json

Task directory does not exist

Stop immediately
Ask user to provide a valid path

Partial functionality only

Execute on demand, no forced full workflow:

Extract images only -> Step 1
OCR only -> Step 1, then Step 2
Judgement verification only -> at least Step 1, then Step 3
Review report only -> usually recommend completing Step 1 first, optionally Step 2/3

Resource Quick Reference

Path	Purpose
`scripts/task_convert_extract.py`	Document sorting, PDF to DOCX, image extraction, index generation entry point
`scripts/image_to_markdown.py`	Image OCR recognition and Markdown output
`scripts/extract_verify_data.py`	Dimension data extraction and judgement consistency verification
`scripts/generate_report.py`	Aggregate and generate review report
`scripts/config.json`	OCR model and service configuration
`references/setup-guide.md`	Environment installation and setup instructions
`references/naming-conventions.md`	Directory structure and naming conventions

Constraints

Do not modify original file content
Do not treat OCR hit results as factual conclusions
Do not treat non-standard judgement terms as standard pass
Without OCR environment, do not fabricate OCR results
When responding, prioritize giving the user "directory location + risk items + next steps"