ReceiptExtract - OCR, Photo/PDF to CSV

v1.0.1

Extract structured transaction data from image or PDF receipts using the ReceiptExtract API (https://www.receiptextract.com). Use when the user wants merchan...

by Yura Borunov (@yborunov)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for yborunov/receiptextract.

Install the skill "ReceiptExtract - OCR, Photo/PDF to CSV" (yborunov/receiptextract) from ClawHub.
Skill page: https://clawhub.ai/yborunov/receiptextract
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install receiptextract

ClawHub CLI

npx clawhub@latest install receiptextract

Security Scan

VirusTotal: Benign
OpenClaw: Benign (medium confidence)
Purpose & Capability
The name/description match the included helper script and the documented API. One inconsistency: the registry metadata at top lists 'Required env vars: none', but both SKILL.md and scripts/extract_receipt.py require RECEIPTEXTRACT_API_TOKEN — the token is necessary and expected for this purpose.
Instruction Scope
SKILL.md instructions are narrowly focused: identify files, read API token from env, POST each file to the documented endpoint, and present JSON/CSV/summary output. Instructions explicitly warn not to paste tokens into chat and to avoid committing secrets. The skill does not instruct reading unrelated files or other environment variables.
Install Mechanism
No install spec; included helper is a plain Python script. No downloads from arbitrary URLs, no package installs, and no extract/write-to-disk installer steps — low install risk.
Credentials
The single required secret (RECEIPTEXTRACT_API_TOKEN) is proportional to the service integration. However, the registry metadata does not declare this required env var, an inconsistency that could confuse users. Note also that providing the token and running the script transmits potentially sensitive receipt contents to the third-party API. This is expected but privacy-relevant.
Persistence & Privilege
The skill does not request persistent or elevated privileges (always:false, no config paths, no modification of other skills). It runs as a helper script invoked by the agent; autonomous invocation defaults are unchanged.
Assessment
This skill appears to do what it says: it uploads receipt files to receiptextract.com using an API token and returns parsed data. Before installing or using it:
  1. Fix the metadata mismatch: treat RECEIPTEXTRACT_API_TOKEN as required and store it in a secrets manager or environment variable (do not paste it into chat).
  2. Be aware that you are sending potentially sensitive financial/PII data to a third-party service. Review ReceiptExtract's privacy/retention policy and test with non-sensitive receipts first.
  3. Inspect scripts/extract_receipt.py locally (you already have it) and run it on sample files to confirm behavior and error handling.
  4. Verify cost/credits and handle failures (402/429/500) as described.
If you need the skill to run offline or keep data local, this implementation is not appropriate because it uploads files externally.

Like a lobster shell, security has layers — review code before you run it.

latest vk97e5sf70qx897t3m00dhfy1yn842yfm
103 downloads · 0 stars · 2 versions · Updated 3w ago
v1.0.1 · MIT-0

Receipts

Extract transaction data from receipt images or PDFs with ReceiptExtract.

Keep the workflow simple: locate the API token, upload one receipt file (or a directory for bulk mode), inspect the JSON, then present either raw JSON or a cleaned summary. Prefer the bundled helper script for repeatable usage.

Quick workflow

  1. Identify the input file

    • Accept common image formats (.jpg, .jpeg, .png, .webp) and PDFs.
    • If the file came from chat, use the attached local path.
  2. Locate the API token

    • Set RECEIPTEXTRACT_API_TOKEN in your environment before running commands.
    • Do not paste the token back into chat.
  3. Call the upload endpoint

    • Endpoint: POST https://www.receiptextract.com/api/receipt/upload
    • Auth header: Authorization: Bearer <token>
    • Multipart form field: file
  4. Parse the response

    • Success shape typically includes:
      • success
      • data.merchant
      • data.date
      • data.items[]
      • data.tax
      • data.total
      • data.correctnessCheck
      • data.taxBreakdown[]
      • creditInfo
      • savedReceiptId
  5. Present the result

    • For humans: summarize merchant, date, items, tax, total, and any anomalies.
    • For integrations: return raw JSON or convert to CSV.
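
Steps 2 and 3 can be sketched in Python using only the standard library. This is a minimal illustration and not the bundled helper's actual code: `build_upload_request` is a hypothetical name, while the endpoint, the Bearer header, and the `file` form field come from the steps above. The function only builds the request, so it can be inspected without touching the network.

```python
import mimetypes
import os
import urllib.request
import uuid

UPLOAD_URL = "https://www.receiptextract.com/api/receipt/upload"


def build_upload_request(filename: str, content: bytes, token: str) -> urllib.request.Request:
    """Build (but do not send) a multipart POST carrying one receipt file."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    # Percent-encode double quotes so the filename cannot break the header.
    safe_name = os.path.basename(filename).replace('"', "%22")
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{safe_name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        UPLOAD_URL,
        data=head + content + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending it would then be a matter of `urllib.request.urlopen(req, timeout=60)` followed by `json.load` on the response; the 60-second timeout is an arbitrary choice, not something the API documents.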

Preferred command

Use the helper script:

export RECEIPTEXTRACT_API_TOKEN="your-token"
python3 scripts/extract_receipt.py /path/to/receipt.png

Optional flags:

python3 scripts/extract_receipt.py /path/to/receipt.pdf --format summary
python3 scripts/extract_receipt.py /path/to/receipt.jpg --format csv
python3 scripts/extract_receipt.py --input-dir /path/to/receipts --format summary
python3 scripts/extract_receipt.py --input-dir /path/to/receipts --recursive --format json

Bulk processing

Use --input-dir to process multiple receipts in one run. The helper script sends one API request per file and continues even if some files fail.

  • Supported file types: .jpg, .jpeg, .png, .webp, .pdf
  • Use --recursive to include nested folders
  • Exit code is non-zero when one or more files fail
  • Each receipt consumes credits independently
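
The file selection and exit-code rules above might look roughly like this. It is a sketch under assumptions, not the helper's real implementation: `find_receipts` and `exit_code` are hypothetical names, and matching extensions case-insensitively is a guess about the script's behavior.

```python
from pathlib import Path

SUPPORTED = {".jpg", ".jpeg", ".png", ".webp", ".pdf"}


def find_receipts(input_dir, recursive=False):
    """Return supported receipt files under input_dir, sorted for stable order."""
    root = Path(input_dir)
    candidates = root.rglob("*") if recursive else root.iterdir()
    # Assumption: extensions are compared case-insensitively.
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() in SUPPORTED)


def exit_code(failed_count):
    """Non-zero when one or more files failed, as described above."""
    return 1 if failed_count > 0 else 0
```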

Fallback command

Use curl when the helper script is unnecessary:

curl -sS -X POST "https://www.receiptextract.com/api/receipt/upload" \
  -H "Authorization: Bearer $RECEIPTEXTRACT_API_TOKEN" \
  -F "file=@/path/to/receipt.png"

Output handling

JSON

Prefer JSON when the user wants the full extracted payload or when another tool will consume the result. In bulk mode, JSON includes processed, succeeded, failed, and per-file results.
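
As a rough sketch of the bulk JSON shape, the counts can be derived from the per-file entries. The `processed`, `succeeded`, and `failed` names come from the text above; the `results` key and the fields inside each entry (`source_file`, `success`) are assumptions, and `bulk_report` is a hypothetical helper.

```python
def bulk_report(results):
    """Aggregate per-file outcomes into one bulk-mode report dict."""
    succeeded = sum(1 for entry in results if entry.get("success"))
    return {
        "processed": len(results),
        "succeeded": succeeded,
        "failed": len(results) - succeeded,
        # Assumption: per-file results live under a "results" key.
        "results": results,
    }
```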

Summary

In bulk mode, summary prints one status line per file followed by total counts.

For a single receipt, use a concise format like:

Merchant: Walmart
Date: 2023-06-09
Total: 76.37
Tax: 8.18
Items:
- BEDDING — 39.97
- STEAMER — 27.97
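
A formatter producing that layout could be sketched as follows. The top-level `data` fields come from the response shape listed earlier; the per-item keys `description` and `totalPrice` are assumptions, since the item schema is not spelled out here, and `format_summary` is a hypothetical name.

```python
def format_summary(payload):
    """Render one receipt payload in the concise summary layout."""
    data = payload.get("data") or {}
    fields = (("Merchant", "merchant"), ("Date", "date"), ("Total", "total"), ("Tax", "tax"))
    lines = [f"{label}: {data.get(key, 'n/a')}" for label, key in fields]
    lines.append("Items:")
    for item in data.get("items") or []:
        # Assumption: items expose "description" and "totalPrice".
        # \u2014 is the dash used as separator in the sample layout.
        lines.append(f"- {item.get('description', '?')} \u2014 {item.get('totalPrice', '?')}")
    return "\n".join(lines)
```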

CSV

When the user asks for CSV, output line-item rows with these columns when available:

  • source_file (bulk mode)
  • merchant
  • date
  • description
  • quantity
  • total_price
  • item_tax
  • sku
  • receipt_tax
  • receipt_total
  • saved_receipt_id
  • http_status (bulk mode)
  • success (bulk mode)
  • error (bulk mode)
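
For a single receipt, flattening the payload into those line-item rows might look like the sketch below. The column names are the ones listed above (minus the bulk-only ones); the item keys read from the API response (`description`, `quantity`, `totalPrice`, `tax`, `sku`) are assumptions, and `receipt_to_csv` is a hypothetical name.

```python
import csv
import io

COLUMNS = [
    "merchant", "date", "description", "quantity", "total_price",
    "item_tax", "sku", "receipt_tax", "receipt_total", "saved_receipt_id",
]


def receipt_to_csv(payload):
    """Return CSV text with a header and one row per line item."""
    data = payload.get("data") or {}
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in data.get("items") or []:
        writer.writerow({
            "merchant": data.get("merchant", ""),
            "date": data.get("date", ""),
            # Assumed item keys; adjust to the real response schema.
            "description": item.get("description", ""),
            "quantity": item.get("quantity", ""),
            "total_price": item.get("totalPrice", ""),
            "item_tax": item.get("tax", ""),
            "sku": item.get("sku", ""),
            "receipt_tax": data.get("tax", ""),
            "receipt_total": data.get("total", ""),
            "saved_receipt_id": payload.get("savedReceiptId", ""),
        })
    return buf.getvalue()
```

Repeating the receipt-level values on every row keeps each line self-describing, which is convenient when rows from several receipts are later concatenated.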

Error handling

Interpret common failures like this:

  • 400 — invalid input, missing file, unsupported type, or file too large
  • 401 — missing/invalid token
  • 402 — insufficient credits
  • 429 — rate limited; retry with backoff
  • 500 — server error; safe to retry carefully

If the response is malformed or success is false:

  • show the error plainly
  • do not invent extracted data
  • mention likely causes if obvious (bad token, no credits, unsupported file)
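
The status table and retry guidance above can be captured in a few small functions. This is an illustrative sketch: the function names are hypothetical, the `error` key in the response body is an assumption, and the exponential backoff parameters (1 second base, 30 second cap) are arbitrary defaults rather than documented limits.

```python
STATUS_HINTS = {
    400: "invalid input, missing file, unsupported type, or file too large",
    401: "missing or invalid token",
    402: "insufficient credits",
    429: "rate limited",
    500: "server error",
}
RETRYABLE = {429, 500}


def describe_failure(status, payload=None):
    """Build a plain error message without inventing extracted data."""
    hint = STATUS_HINTS.get(status, "unexpected response")
    # Assumption: the server's message, if any, is under an "error" key.
    detail = payload.get("error") if isinstance(payload, dict) else None
    return f"HTTP {status}: {hint}" + (f" ({detail})" if detail else "")


def should_retry(status):
    """Only rate limits and server errors are worth retrying."""
    return status in RETRYABLE


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based), capped."""
    return min(cap, base * (2 ** attempt))
```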

Practical notes

  • Treat the API result as the source of truth, but sanity-check obvious issues.
  • Flag suspicious output instead of silently "fixing" it.
    • Example: Canadian receipt with tax currency labeled USD.
  • correctnessCheck: true is a useful confidence signal, not a guarantee.
  • Preserve the original file path and savedReceiptId when useful for traceability.
  • In bulk mode, keep one request per file and preserve each source file path for traceability.
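
One way to flag suspicious output without altering it is to return a list of warnings alongside the untouched payload. The checks below are examples only: `collect_warnings` is a hypothetical helper, and it assumes `tax` and `total` arrive as numbers and `correctnessCheck` as a boolean.

```python
def collect_warnings(payload):
    """Return human-readable warnings; never modifies the payload."""
    data = payload.get("data") or {}
    warnings = []
    if data.get("correctnessCheck") is not True:
        warnings.append("correctnessCheck is not true")
    if not data.get("items"):
        warnings.append("no line items extracted")
    tax, total = data.get("tax"), data.get("total")
    # Assumption: tax and total are numeric when present.
    if isinstance(tax, (int, float)) and isinstance(total, (int, float)) and tax > total:
        warnings.append("tax exceeds total")
    return warnings
```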

Security

  • Keep the token out of chat replies.
  • Prefer environment variables or secret managers over embedding tokens in scripts.
  • Do not commit tokens, raw headers, or secret-bearing examples into git.
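
A small sketch of the token-handling advice: read the token from the environment, fail early when it is missing, and scrub it from any text that might be echoed back. Both function names are hypothetical; only the `RECEIPTEXTRACT_API_TOKEN` variable name comes from this document.

```python
import os


def get_token(env=None):
    """Read the API token from the environment; raise if it is absent."""
    env = os.environ if env is None else env
    token = (env.get("RECEIPTEXTRACT_API_TOKEN") or "").strip()
    if not token:
        raise RuntimeError("RECEIPTEXTRACT_API_TOKEN is not set")
    return token


def redact(text, token):
    """Replace any occurrence of the token before text is shown or logged."""
    return text.replace(token, "[REDACTED]")
```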

Resources

  • Helper script: scripts/extract_receipt.py
  • API reference notes: references/api.md
