Install
openclaw skills install merge-reimbursement-pdfs
Automatically merge reimbursement folders into one PDF. Supports recursive folder scanning, PDF merging, two invoices per A4 page, phone screenshots/images placed in A4 half-page slots, and Excel workbooks with multiple sheets converted to PDF before merging.
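This skill automatically organizes a folder of reimbursement materials and merges its PDFs, images, and Excel files into a single PDF suitable for submitting a claim or for printing and archiving.
Feature 1: Recursively scan the folder and merge the materials
Feature 2: Place two invoices on one A4 page
Feature 3: Lay out phone screenshots and images in A4 half-page slots
Feature 4: Convert multi-sheet Excel workbooks to PDF, then merge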
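The half-page layout can be pictured with a little arithmetic. The sketch below is only an illustration and is not taken from the bundled script: it assumes a portrait A4 sheet of 595 x 842 PDF points split into a top and a bottom slot, and the function names are hypothetical.

```python
import math

# Assumed page size: portrait A4 in PDF points (roughly 210 mm x 297 mm).
A4_WIDTH_PT = 595.0
A4_HEIGHT_PT = 842.0


def half_page_slots(margin=0.0):
    """Return (top, bottom) slot rectangles as (x0, y0, x1, y1), origin at top-left."""
    mid = A4_HEIGHT_PT / 2
    top = (margin, margin, A4_WIDTH_PT - margin, mid - margin)
    bottom = (margin, mid + margin, A4_WIDTH_PT - margin, A4_HEIGHT_PT - margin)
    return top, bottom


def sheets_needed(item_count):
    """Two items share one A4 sheet; an odd count leaves the last bottom slot empty."""
    return math.ceil(item_count / 2)


def assign_slots(names):
    """Map each name to (name, sheet_index, 'top' or 'bottom') in input order."""
    return [
        (name, i // 2, "top" if i % 2 == 0 else "bottom")
        for i, name in enumerate(names)
    ]
```

Under these assumptions, five invoices would occupy three sheets, with the bottom slot of the last sheet left blank.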
Use the bundled script to merge a reimbursement folder into one checked PDF:
航旅纵横 / 行程校验单 (itinerary verification sheet) PDFs are appended after A4/full-page materials and before the merged A5 invoice sheets.
Run the script from this skill. Excel conversion requires soffice/LibreOffice on the machine:
python3 scripts/merge_travel_pdfs.py "/path/to/folder" \
--output "/path/to/folder/merged-reimbursement.pdf" \
--report "/path/to/folder/merge-report.json" \
--render-check "/path/to/folder/合并检查缩略图" \
--image-rotate 90 \
--overwrite
The script is designed for macOS and Windows, with Linux best-effort support.
Missing dependencies are installed automatically:
Python packages: installed with pip when missing: PyMuPDF, Pillow, openpyxl.
LibreOffice: installed when soffice is not found.
macOS: brew install --cask libreoffice
Windows: winget install TheDocumentFoundation.LibreOffice first, then Chocolatey if available
Linux: apt-get, dnf, or yum when available
Use --no-auto-install or environment variable MERGE_TRAVEL_PDFS_NO_AUTO_INSTALL=1 to disable automatic installation.
If no supported package manager exists, install LibreOffice manually and rerun.
python3 scripts/merge_travel_pdfs.py "/path/to/folder" --no-auto-install
To create one output per immediate child folder:
python3 scripts/merge_travel_pdfs.py "/path/to/parent-folder" \
--split-subfolders \
--image-rotate 90 \
--overwrite
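List the folder with find or rg --files and confirm output files will not be included as inputs.
Run with --dry-run if classification looks risky.
Run the merge with --report and --render-check.
Check the items list in the report for each file's classification reason.
If a file is misclassified, override it with one of:
--force-invoice "relative/path/or/name-fragment"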
--force-normal "relative/path/or/name-fragment"
Both override flags can be repeated.
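The first check in the workflow, confirming that earlier outputs will not be picked up as inputs, can also be done programmatically. The following is a minimal sketch, not part of the bundled script; `list_inputs` is a hypothetical helper, and the output paths passed to it are whatever you plan to give to --output, --report, and --render-check.

```python
from pathlib import Path


def list_inputs(folder, outputs):
    """Split files under folder into (inputs, skipped).

    A file is skipped when it is one of the output paths or lives
    inside an output directory such as the render-check folder.
    Both lists hold POSIX-style paths relative to folder.
    """
    root = Path(folder).resolve()
    out_paths = [Path(o).resolve() for o in outputs]
    inputs, skipped = [], []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        is_output = any(path == out or out in path.parents for out in out_paths)
        rel = path.relative_to(root).as_posix()
        (skipped if is_output else inputs).append(rel)
    return inputs, skipped
```

If `skipped` is non-empty before a run, a previous merge left files in the folder, and it is worth confirming with --dry-run how the script treats them.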
The script treats a PDF as an invoice when the extracted text and filename pass a weighted keyword score. Strong signals include 发票号码, 发票代码, 开票日期, 价税合计, 购买方, 销售方, 纳税人识别号, 增值税, 电子发票, 全电发票, 数电票, 校验码, and 国家税务总局.
Do not rely only on the word 报销 (reimbursement); itinerary sheets and reimbursement summaries often contain it but are not invoices. Images are included as normal A5 attachments unless manually overridden. Excel files are always treated as normal supporting material and converted to PDF.
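A weighted keyword score of this kind might look like the sketch below. The keyword list comes from the text above, but the weights, the threshold, the filename rule, and the function names are all assumptions made for illustration; the real script's scoring may differ.

```python
# Strong signals listed in this document.
STRONG_KEYWORDS = (
    "发票号码", "发票代码", "开票日期", "价税合计", "购买方", "销售方",
    "纳税人识别号", "增值税", "电子发票", "全电发票", "数电票", "校验码",
    "国家税务总局",
)

# Hypothetical weights and threshold, chosen only for this example.
STRONG_WEIGHT = 2
FILENAME_WEIGHT = 1
THRESHOLD = 4


def invoice_score(text, filename):
    """Add STRONG_WEIGHT per distinct strong keyword found in the text,
    plus FILENAME_WEIGHT if the filename mentions 发票 (invoice)."""
    score = sum(STRONG_WEIGHT for kw in STRONG_KEYWORDS if kw in text)
    if "发票" in filename:
        score += FILENAME_WEIGHT
    return score


def looks_like_invoice(text, filename, threshold=THRESHOLD):
    """Classify as invoice only when the combined score reaches the threshold.

    The word 报销 deliberately contributes nothing to the score.
    """
    return invoice_score(text, filename) >= threshold
```

With these example weights, a reimbursement summary that mentions 报销 but none of the strong keywords scores 0 and stays a normal document.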
Always check the script summary:
expected_output_pages must equal actual_output_pages
normal_pdf_pages + a4_invoice_pages + spreadsheet_pages + hanglv_pages + normal_image_sheets + invoice_sheets should match the output page count
warnings should be empty or understood
If verification fails, do not deliver the PDF as complete. Fix the cause, rerun, and re-check.
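These checks are easy to automate against the JSON report. The sketch below uses the field names given above but makes assumptions about the file layout: it supposes the fields sit either at the top level of the report or under a hypothetical "summary" key, and `check_summary` and `load_summary` are illustrative helpers rather than part of the script.

```python
import json

# Page-count fields named in this document.
PAGE_FIELDS = (
    "normal_pdf_pages", "a4_invoice_pages", "spreadsheet_pages",
    "hanglv_pages", "normal_image_sheets", "invoice_sheets",
)


def load_summary(report_path):
    """Read the report; the 'summary' key is an assumption, with a
    fallback to the top-level object when it is absent."""
    with open(report_path, encoding="utf-8") as fh:
        report = json.load(fh)
    return report.get("summary", report)


def check_summary(summary):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    expected = summary.get("expected_output_pages")
    actual = summary.get("actual_output_pages")
    if expected != actual:
        problems.append(f"expected {expected} pages, got {actual}")
    total = sum(int(summary.get(field) or 0) for field in PAGE_FIELDS)
    if total != actual:
        problems.append(f"page fields add up to {total}, output has {actual}")
    if summary.get("warnings"):
        problems.append(f"warnings present: {summary['warnings']}")
    return problems
```

A non-empty result means the PDF should not be delivered until the cause is understood; a warning you have reviewed and accepted can still be treated as a pass by hand.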