Expense Categorization

Extract and categorize expenses from receipts or statements, map to GL codes, check compliance with policies, and flag anomalies for review.

samledger67-dotcom@samledger67-dotcom

Install

openclaw skills install @samledger67-dotcom/expense-categorization

Receipt OCR, GL mapping, policy compliance, and anomaly detection for business expenses.

Workflow

1. Receipt Extraction (OCR)

Use Tesseract (local) or a vision model API for image receipts, and pdfplumber for PDF receipts.

Key fields to extract:

Vendor name, date, total amount, line items
Payment method (last 4 digits if visible)
Tax amount (HST/GST/sales tax)
Tips/gratuity (separate from subtotal)

bash

# Tesseract OCR on receipt image
tesseract receipt.jpg stdout --psm 4 | python3 scripts/parse_receipt.py

# Or use Claude vision directly for complex layouts

For complex or handwritten receipts → use vision model with prompt in references/ocr-prompt.md.
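As a rough illustration of the field extraction described above, the sketch below pulls the date, tax, tip, total, and card digits out of OCR text with regular expressions. It is not the contents of scripts/parse_receipt.py, which this page does not show; the function name `parse_receipt_text`, the label patterns, the "vendor is the first line" rule, and the sample receipt are all assumptions made for the example.

```python
import re
from decimal import Decimal

# Hypothetical label patterns; real receipts vary widely and need more cases.
AMOUNT = r"\$?\s*(\d+(?:,\d{3})*\.\d{2})"
PATTERNS = {
    "tax": re.compile(r"^\s*(?:tax|hst|gst|sales tax)\b.*?" + AMOUNT, re.I | re.M),
    "tip": re.compile(r"^\s*(?:tip|gratuity)\b.*?" + AMOUNT, re.I | re.M),
    "total": re.compile(r"^\s*total\b.*?" + AMOUNT, re.I | re.M),
    "card_last4": re.compile(r"(?:\*{2,}|x{2,})\s*(\d{4})\b", re.I),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}
MONEY_FIELDS = {"tax", "tip", "total"}


def parse_receipt_text(text: str) -> dict:
    """Return the fields found in OCR text; fields that are not found map to None."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Assumption: the vendor name is the first non-empty line of the receipt.
    fields = {"vendor": lines[0] if lines else None}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        value = match.group(1) if match else None
        if value is not None and name in MONEY_FIELDS:
            value = Decimal(value.replace(",", ""))
        fields[name] = value
    return fields


# Made-up OCR output, for illustration only.
sample = """Corner Bistro
2026-03-15
Subtotal 40.00
HST 5.20
Tip 6.00
Total 51.20
VISA ****1234
"""
parsed = parse_receipt_text(sample)
```

Keeping tax and tip as separate fields, rather than folding them into the total, matches the field list above and leaves room for the policy checks later in the workflow.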

2. GL Code Mapping

Map extracted expense category to chart of accounts. See references/gl-mapping.md for:

Standard QBO GL codes for common expense types
IRS-aligned categories (meals 50%, travel, home office, etc.)
Crypto/DeFi expense categories

Matching logic:

Exact vendor name match (known vendor list)
MCC code match (credit card transactions)
Keyword match on description/line items
Fallback: prompt user to select category
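The matching order above can be sketched as a simple cascade that stops at the first rule that hits. The lookup tables here are hypothetical stand-ins for references/gl-mapping.md (only GL 6200 and 6210 appear on this page), and `map_gl_code` is an illustrative name, not part of the skill.

```python
from typing import Optional, Tuple

# Hypothetical lookup tables; real entries would come from references/gl-mapping.md.
KNOWN_VENDORS = {"delta air lines": ("6200", "Travel - Air")}
MCC_CODES = {"4511": ("6200", "Travel - Air")}  # MCC 4511: airlines and air carriers
KEYWORDS = {"mileage": ("6210", "Auto/Mileage")}


def map_gl_code(
    vendor: str, mcc: Optional[str] = None, description: str = ""
) -> Optional[Tuple[str, str, str]]:
    """Return (gl_code, category, matched_by), or None when the user must choose."""
    key = vendor.strip().lower()
    if key in KNOWN_VENDORS:                    # 1. exact vendor name match
        return KNOWN_VENDORS[key] + ("vendor",)
    if mcc in MCC_CODES:                        # 2. MCC code match
        return MCC_CODES[mcc] + ("mcc",)
    text = description.lower()
    for keyword, mapping in KEYWORDS.items():   # 3. keyword match on description
        if keyword in text:
            return mapping + ("keyword",)
    return None                                 # 4. fallback: prompt the user


by_vendor = map_gl_code("Delta Air Lines")
by_mcc = map_gl_code("Unknown Carrier", mcc="4511")
by_keyword = map_gl_code("Unknown", description="Client visit mileage")
unmatched = map_gl_code("Unknown")
```

Returning the rule that matched alongside the code makes it easier to explain a categorization during review.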

3. Policy Compliance Check

Apply policy rules before approval routing. See references/policy-rules.md for standard rules.

Core checks:

Per diem limits: Meals >$75 require itemized receipt; travel per diem by city
Receipt threshold: Receipt required for any expense ≥$25 (company policy; stricter than the IRS documentary-evidence threshold of $75 for most non-lodging expenses)
Time limit: Receipts must be submitted within 30/60/90 days (configurable)
Duplicate detection: Same vendor + amount ± $1 within 7 days = flag
Split transactions: Same vendor, sequential dates, amounts just below approval threshold = flag
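A minimal sketch of the duplicate and split-transaction checks above. The `Expense` record and both function names are illustrative. "Sequential dates" is read here as consecutive days, and "just below" as within 10% of the threshold; both readings are assumptions, as is the $250 default, which is borrowed from the auto-approve limit under Common Patterns. The vendors and amounts are made up.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from itertools import combinations


@dataclass(frozen=True)
class Expense:
    receipt_id: str
    vendor: str
    day: date
    amount: Decimal


def find_duplicates(expenses, tolerance=Decimal("1.00"), window_days=7):
    """Pairs with the same vendor, amounts within the tolerance, dates within the window."""
    pairs = []
    for a, b in combinations(expenses, 2):
        same_vendor = a.vendor.lower() == b.vendor.lower()
        close_amount = abs(a.amount - b.amount) <= tolerance
        close_date = abs((a.day - b.day).days) <= window_days
        if same_vendor and close_amount and close_date:
            pairs.append((a.receipt_id, b.receipt_id))
    return pairs


def find_possible_splits(expenses, threshold=Decimal("250"), margin=Decimal("0.10")):
    """Same-vendor expenses on consecutive days, each just below the approval threshold."""
    floor = threshold * (1 - margin)  # assumption: "just below" means within 10%
    near = sorted(
        (e for e in expenses if floor <= e.amount < threshold),
        key=lambda e: (e.vendor.lower(), e.day),
    )
    return [
        (a.receipt_id, b.receipt_id)
        for a, b in zip(near, near[1:])
        if a.vendor.lower() == b.vendor.lower() and (b.day - a.day).days == 1
    ]


expenses = [
    Expense("R1", "Paper Supply Co", date(2026, 3, 2), Decimal("120.00")),
    Expense("R2", "Paper Supply Co", date(2026, 3, 5), Decimal("120.50")),
    Expense("R3", "Acme Catering", date(2026, 3, 9), Decimal("240.00")),
    Expense("R4", "Acme Catering", date(2026, 3, 10), Decimal("245.00")),
]
duplicates = find_duplicates(expenses)   # R1/R2: same vendor, $0.50 apart, 3 days apart
splits = find_possible_splits(expenses)  # R3/R4: consecutive days, both in $225-$250
```

The pairwise comparison is quadratic in the number of expenses, which is fine for a single expense report but would need grouping by vendor first for large statement imports.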

4. Anomaly Detection

Flag for human review:

Amount > 2× historical average for that vendor/category
Weekend or holiday transactions (especially travel/entertainment)
Round-number amounts (potential personal purchase)
Vendor in restricted list (casinos, adult entertainment, competitors)
Missing required fields (date, vendor, business purpose)
Out-of-state purchases for office supply categories
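Several of the checks above reduce to simple predicates on one expense record. The sketch below covers missing fields, the 2× average rule, round amounts, weekends, and the restricted list; holidays and out-of-state purchases are left out because they need a calendar and location data. The flag names, the `business_purpose` key, the restricted vendor entry, and the reading of "round-number" as a whole-dollar amount are all assumptions.

```python
from datetime import date
from decimal import Decimal
from typing import List, Optional

RESTRICTED_VENDORS = {"lucky star casino"}  # hypothetical restricted-list entry
REQUIRED_FIELDS = ("date", "vendor", "business_purpose")


def anomaly_flags(expense: dict, historical_avg: Optional[Decimal] = None) -> List[str]:
    """Return the review flags raised by a single expense record."""
    flags = []
    missing = [f for f in REQUIRED_FIELDS if not expense.get(f)]
    if missing:
        flags.append("missing_fields:" + ",".join(missing))
    amount = expense.get("amount")
    if amount is not None:
        if historical_avg is not None and amount > 2 * historical_avg:
            flags.append("above_2x_average")
        if amount % 1 == 0:  # assumption: "round" means a whole-dollar amount
            flags.append("round_amount")
    day = expense.get("date")
    if day is not None and day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        flags.append("weekend")
    if (expense.get("vendor") or "").lower() in RESTRICTED_VENDORS:
        flags.append("restricted_vendor")
    return flags


suspicious = anomaly_flags(
    {
        "vendor": "Lucky Star Casino",
        "date": date(2026, 3, 14),  # a Saturday
        "amount": Decimal("500.00"),
        "business_purpose": "",
    },
    historical_avg=Decimal("120.00"),
)
clean = anomaly_flags(
    {
        "vendor": "Delta Air Lines",
        "date": date(2026, 3, 16),  # a Monday
        "amount": Decimal("487.50"),
        "business_purpose": "Client visit",
    },
    historical_avg=Decimal("400.00"),
)
```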

5. Output Format

json

{
  "receipt_id": "REC-20260315-001",
  "vendor": "Delta Air Lines",
  "date": "2026-03-15",
  "amount": 487.50,
  "currency": "USD",
  "gl_code": "6200",
  "category": "Travel - Air",
  "policy_status": "approved",
  "flags": [],
  "confidence": 0.94,
  "requires_review": false,
  "notes": "Business purpose required for reimbursement"
}

Batch Processing

For expense report batches:

python

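# Process a folder of receipts
from collections import defaultdict
from pathlib import Path

# glob.glob() does not expand braces such as {jpg,png,pdf}, so filter by suffix
SUFFIXES = {".jpg", ".png", ".pdf"}
receipts = sorted(p for p in Path("receipts").iterdir() if p.suffix.lower() in SUFFIXES)
# categorize() stands for workflow steps 1-4 (not defined here); it returns a record in the Output Format shape
results = [categorize(r) for r in receipts]

# Summary stats
flagged = [r for r in results if r["requires_review"]]
total = sum(r["amount"] for r in results)
by_category = defaultdict(list)
for r in results:
    by_category[r["category"]].append(r)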

Output batch summary as CSV or feed directly to QBO via qbo-automation skill.
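One possible shape for the CSV summary is a row per category with a count, a total, and the number of records flagged for review. The column names and the `batch_summary_csv` helper are assumptions; the input records follow the Output Format fields, and the sample values are invented.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def batch_summary_csv(results) -> str:
    """Return CSV text with one row per category: count, total, flagged."""
    summary = defaultdict(lambda: {"count": 0, "total": Decimal("0"), "flagged": 0})
    for r in results:
        row = summary[r["category"]]
        row["count"] += 1
        # Convert via str() so float amounts from JSON do not carry binary noise.
        row["total"] += Decimal(str(r["amount"]))
        row["flagged"] += int(r["requires_review"])
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "count", "total", "flagged"])
    for category in sorted(summary):
        row = summary[category]
        writer.writerow([category, row["count"], f"{row['total']:.2f}", row["flagged"]])
    return buffer.getvalue()


# Invented sample records using the Output Format field names.
sample_results = [
    {"category": "Travel - Air", "amount": 487.50, "requires_review": False},
    {"category": "Travel - Air", "amount": 212.25, "requires_review": True},
    {"category": "Auto/Mileage", "amount": 36.40, "requires_review": False},
]
summary_csv = batch_summary_csv(sample_results)
```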

Common Patterns

Credit card statement import:

Parse CSV/OFX from bank
Match known vendors → auto-categorize
Unknown vendors → ML classification or prompt
Export mapped transactions to QBO

Expense report approval routing:

Auto-approve: policy-compliant, under $250, no flags
Manager approval: $250–$2,500 or single flag
Finance review: >$2,500, multiple flags, or restricted category
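The three routing tiers above can be written as a single function, checked from the strictest tier down. This sketch assumes policy failures are already represented as entries in `flags`, so "policy-compliant" means an empty list; the function name and return labels are illustrative.

```python
from decimal import Decimal
from typing import Sequence

AUTO_APPROVE_LIMIT = Decimal("250")
MANAGER_LIMIT = Decimal("2500")


def route_approval(
    amount: Decimal, flags: Sequence[str], restricted_category: bool = False
) -> str:
    """Return the approval tier for one expense, strictest rule first."""
    if amount > MANAGER_LIMIT or len(flags) > 1 or restricted_category:
        return "finance_review"
    if amount >= AUTO_APPROVE_LIMIT or len(flags) == 1:
        return "manager_approval"
    return "auto_approve"


small_clean = route_approval(Decimal("120.00"), [])
at_lower_bound = route_approval(Decimal("250.00"), [])
one_flag = route_approval(Decimal("50.00"), ["weekend"])
large = route_approval(Decimal("2500.01"), [])
```

Checking the finance tier first means an expense that qualifies for more than one tier always lands in the stricter one.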

Mileage reimbursement:

Extract start/end locations + business purpose
Calculate at current IRS rate (check references/irs-rates.md)
Map to GL 6210 (Auto/Mileage)
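The reimbursement itself is miles multiplied by the per-mile rate, rounded to cents. In the sketch below the rate is a required parameter and the 0.50 used in the example is a placeholder, not the IRS rate; the current figure should come from references/irs-rates.md. Deriving the distance from start and end locations is out of scope here, so miles are passed in directly, and `mileage_reimbursement` is an illustrative name.

```python
from decimal import Decimal, ROUND_HALF_UP


def mileage_reimbursement(miles, rate_per_mile, business_purpose: str) -> dict:
    """Return a mileage expense record mapped to GL 6210, amount rounded to cents."""
    amount = (Decimal(str(miles)) * Decimal(str(rate_per_mile))).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return {
        "gl_code": "6210",
        "category": "Auto/Mileage",
        "miles": Decimal(str(miles)),
        "amount": amount,
        "business_purpose": business_purpose,
    }


# Placeholder rate for illustration only; look up the current IRS rate before use.
trip = mileage_reimbursement(42.5, "0.50", "Client site visit")
```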

Integration Points

qbo-automation: Push categorized transactions directly to QBO
crypto-tax-agent: Route DeFi/crypto expenses for cost basis tracking
kpi-alert-system: Trigger alerts when department spend exceeds budget
invoice-automation: Cross-reference receipts with vendor invoices

Negative Boundaries

Not for PTIN-backed tax work — categorization ≠ tax advice; defer to licensed preparer
Not for payroll — employee expense reimbursement ≠ payroll processing
Not a real-time feed — batch review with human sign-off before posting to GL
Not for legal contracts — use contract-review-agent for vendor agreements
Confidence <0.7 → always route to human review, never auto-post
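The confidence rule above can be enforced as a final gate on each output record. This is a sketch under assumptions: the `low_confidence` flag name is hypothetical, and a record with no confidence value is treated as needing review.

```python
REVIEW_THRESHOLD = 0.7


def apply_confidence_gate(record: dict) -> dict:
    """Return a copy of the record, forcing human review when confidence is below 0.7."""
    gated = dict(record)
    # Assumption: a missing confidence score is treated as 0.0, so it goes to review.
    if gated.get("confidence", 0.0) < REVIEW_THRESHOLD:
        gated["requires_review"] = True
        gated["flags"] = list(gated.get("flags", [])) + ["low_confidence"]
    return gated


confident = apply_confidence_gate(
    {"receipt_id": "REC-20260315-001", "confidence": 0.94, "flags": [], "requires_review": False}
)
uncertain = apply_confidence_gate(
    {"receipt_id": "REC-20260315-002", "confidence": 0.65, "flags": [], "requires_review": False}
)
```

Applying the gate last means no earlier step can clear `requires_review` for a low-confidence record.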
