Ride Receipts

Build, refresh, export, and query a local SQLite ride-history database from Gmail ride receipt emails (Uber, Bolt, Yandex Go, Lyft) using LLM extraction from...

MIT-0 · Free to use, modify, and redistribute. No attribution required.

by Maxim Tuleyko (@tuleyko)

Security Scan

VirusTotal: Suspicious

OpenClaw: Benign (high confidence)

✓

Purpose & Capability

Name/description, required binaries (gog, python3), config path for a Gmail account, and included scripts all match a pipeline that fetches Gmail messages, runs LLM extraction, and writes to SQLite. No unrelated credentials, binaries, or config paths are requested.

ℹ

Instruction Scope

SKILL.md and the included prompt templates explicitly instruct using raw email text_html as the primary source and sending it to the active LLM for extraction. The README instructs the agent to confirm user consent before sending raw HTML. The scripts access only Gmail via gog and local files in ./data; they do not read unrelated system files or unknown env vars. The sensitive nature of raw HTML (financial/location data) is acknowledged; this is within the stated scope but requires explicit user caution.

✓

Install Mechanism

This is instruction-only with included Python scripts; there is no install spec that downloads or executes remote code. Scripts are plain Python and operate locally, which is low-risk from an installation perspective.

✓

Credentials

No environment variables or external API keys are required. The single required config path (skills.entries.ride-receipts-llm.config.gmailAccount) is proportional: the tool needs to know which Gmail account to use. There are no requests for unrelated secrets or cloud credentials.

✓

Persistence & Privilege

The always setting is false, and autonomous invocation is allowed (the platform default). The skill does not request permanent elevated presence or modify other skills. It writes local files in ./data and creates the SQLite DB, which is expected behavior.

Assessment

This skill is coherent for its purpose, but it processes sensitive email HTML and will send raw email HTML to whatever LLM the agent uses. Before running:

1) Confirm you are comfortable sending ride receipts (financial and location data) to the active LLM, and consider using a self-hosted/private model if you need privacy.
2) Verify the 'gog' CLI is authenticated to the correct Gmail account (skills.entries.ride-receipts-llm.config.gmailAccount) so you don't accidentally expose another account.
3) Limit date ranges / --max-per-provider when first running to inspect behavior and outputs.
4) Inspect the included scripts and prompts yourself (they're small) and consider running them locally in an isolated environment.
5) If you need stronger privacy, modify the extraction step to run a local LLM or to locally sanitize/redact emails before sending.

Otherwise the skill appears to do what it claims.

Current version: v0.1.2

latest: vk9732xn929d0f7ddad17b24pcx82qzr4

License

MIT-0 · Terms: https://spdx.org/licenses/MIT-0.html

Runtime requirements

Bins: gog, python3

Config: skills.entries.ride-receipts-llm.config.gmailAccount

SKILL.md

ride-receipts-llm

Run a reproducible pipeline: a schema initialization step followed by three stages:

0) initialize/validate SQLite schema (fixed; do not edit)
1) fetch full receipt emails into JSONL
2) extract structured rides with LLM (one-shot + repair)
3) upsert into SQLite

Prerequisites and safety

Require gog CLI installed and authenticated for the selected Gmail account.
Prefer configured account: skills.entries.ride-receipts-llm.config.gmailAccount; if missing, ask user for account explicitly.
Ask for date scope before fetch: all-time, after YYYY-MM-DD, or between dates.
Treat receipt content as sensitive financial/location data.
Before extraction, explicitly confirm user is okay sending raw email HTML to the active LLM.
Extraction uses raw text_html from emails; do not claim local-only parsing.
Never hallucinate fields; keep unknown values null.
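
The binary requirement above can be checked before any stage runs. The following is a minimal sketch, not part of the skill: the helper name missing_binaries is hypothetical, and it only confirms that gog and python3 are on PATH. It does not verify that gog is authenticated for the chosen Gmail account.

```python
import shutil

# Binaries listed under "Runtime requirements" for this skill.
REQUIRED_BINS = ("gog", "python3")


def missing_binaries(required=REQUIRED_BINS, which=shutil.which):
    """Return the required executables that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment.
    """
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Missing required binaries:", ", ".join(missing))
    else:
        print("All required binaries found on PATH.")
```

Authentication and the user's consent to send raw HTML to the LLM still have to be confirmed separately; a PATH lookup cannot establish either.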

Paths

Schema (do not modify): skills/ride-receipts-llm/references/schema_rides.sql
Emails JSONL: ./data/ride_emails.jsonl
Extracted rides JSONL: ./data/rides_extracted.jsonl
SQLite DB: ./data/rides.sqlite

0) Initialize DB

python3 skills/ride-receipts-llm/scripts/init_db.py \
  --db ./data/rides.sqlite \
  --schema skills/ride-receipts-llm/references/schema_rides.sql

1) Fetch Gmail receipts → JSONL

python3 skills/ride-receipts-llm/scripts/fetch_emails_jsonl.py \
  --account <gmail-account> \
  --after YYYY-MM-DD \
  --before YYYY-MM-DD \
  --max-per-provider 5000 \
  --out ./data/ride_emails.jsonl

Omit --after / --before when not needed.
Output rows include provider metadata, snippet, and raw text_html.
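
The three date scopes (all-time, after a date, between dates) map onto the optional --after / --before flags. A small sketch of that mapping follows, assuming a hypothetical helper build_fetch_command; the script path and flag names are the ones shown above, and everything else is illustrative.

```python
from datetime import datetime

SCRIPT = "skills/ride-receipts-llm/scripts/fetch_emails_jsonl.py"


def build_fetch_command(account, after=None, before=None,
                        max_per_provider=5000,
                        out="./data/ride_emails.jsonl"):
    """Build the argv list for the fetch step.

    after/before are optional date strings; leaving both as None
    corresponds to an all-time fetch.
    """
    cmd = ["python3", SCRIPT, "--account", account]
    for flag, value in (("--after", after), ("--before", before)):
        if value is not None:
            # Raises ValueError if the value does not parse as year-month-day.
            datetime.strptime(value, "%Y-%m-%d")
            cmd += [flag, value]
    cmd += ["--max-per-provider", str(max_per_provider), "--out", out]
    return cmd


# A cautious first run: a narrow window and a small per-provider cap.
print(build_fetch_command("me@example.com", after="2024-01-01",
                          before="2024-02-01", max_per_provider=20))
```

The returned list can be passed to subprocess.run once the user has confirmed the account and date scope.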

2) LLM extraction contract

Read ./data/ride_emails.jsonl; write one JSON object per line to ./data/rides_extracted.jsonl.

Per email:

Run one-shot extraction for all fields.
Quality-gate: amount,currency,pickup,dropoff,payment_method,distance_text,duration_text,start_time_text,end_time_text.
If any are missing, run repair pass(es) for missing fields only.
Merge additively; never replace existing non-null values with null.
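
The quality gate and additive merge can be sketched as two small functions. This is an illustration under stated assumptions, not the skill's actual code: the names missing_fields and merge_additive are hypothetical, only None counts as missing, and a repair value is used only to fill a field that is still null (a stricter reading that also never overwrites an existing non-null value).

```python
# Fields named in the quality gate above.
GATED_FIELDS = (
    "amount", "currency", "pickup", "dropoff", "payment_method",
    "distance_text", "duration_text", "start_time_text", "end_time_text",
)


def missing_fields(ride):
    """Return the gated fields that are absent or null in a ride dict."""
    return [field for field in GATED_FIELDS if ride.get(field) is None]


def merge_additive(existing, repair):
    """Fill null fields in `existing` from `repair`; never null out a value."""
    merged = dict(existing)
    for key, value in repair.items():
        if value is not None and merged.get(key) is None:
            merged[key] = value
    return merged


first_pass = {"amount": 12.34, "currency": "EUR", "pickup": None}
repair_pass = {"pickup": "Main St 1", "amount": None, "currency": None}

merged = merge_additive(first_pass, repair_pass)
print(merged)                  # amount and currency survive the null repair values
print(missing_fields(merged))  # fields a further repair pass would still target
```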

Schema (one line per ride):

{
  "provider": "Uber|Bolt|Yandex|Lyft",
  "source": {"gmail_message_id": "...", "email_date": "YYYY-MM-DD HH:MM", "subject": "..."},
  "ride": {
    "start_time_text": "...",
    "end_time_text": "...",
    "total_text": "...",
    "currency": "EUR|PLN|USD|BYN|RUB|UAH|null",
    "amount": 12.34,
    "pickup": "...",
    "dropoff": "...",
    "pickup_city": "...",
    "pickup_country": "...",
    "dropoff_city": "...",
    "dropoff_country": "...",
    "payment_method": "...",
    "driver": "...",
    "distance_text": "...",
    "duration_text": "...",
    "notes": "..."
  }
}
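
One way to honor "keep unknown values null" is to start every record from an all-null template and let extraction fill in only what the email actually states. The sketch below assumes hypothetical helpers empty_record and to_jsonl_line; the field names are taken from the schema above.

```python
import json

# Ride fields in the order given by the schema above.
RIDE_FIELDS = (
    "start_time_text", "end_time_text", "total_text", "currency", "amount",
    "pickup", "dropoff", "pickup_city", "pickup_country",
    "dropoff_city", "dropoff_country", "payment_method", "driver",
    "distance_text", "duration_text", "notes",
)


def empty_record(provider, gmail_message_id, email_date=None, subject=None):
    """Return a record in the schema's shape with every ride field null."""
    return {
        "provider": provider,
        "source": {
            "gmail_message_id": gmail_message_id,
            "email_date": email_date,
            "subject": subject,
        },
        "ride": {field: None for field in RIDE_FIELDS},
    }


def to_jsonl_line(record):
    """Serialize one record as a single JSON line (no trailing newline)."""
    return json.dumps(record, ensure_ascii=False)


record = empty_record("Bolt", "example-message-id")
record["ride"]["amount"] = 12.34
print(to_jsonl_line(record))
```

json.dumps without an indent argument never emits a raw newline, so each record stays on one line as the JSONL output requires.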

Rules:

Use text_html as primary source; fallback to snippet only if text_html is empty.
Keep addresses/time strings verbatim.
Keep amount numeric; if only textual total exists, set amount: null and preserve text in total_text.
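
Two of these rules are mechanical enough to sketch in code: the text_html-then-snippet fallback and the numeric-amount rule. The helper names below are hypothetical, and the handling of a string-valued amount (moving it into an empty total_text rather than parsing it) is an assumption about how to apply the rule without inventing a number.

```python
def choose_source_text(email_row):
    """Prefer raw text_html; fall back to snippet only if text_html is empty."""
    html = email_row.get("text_html") or ""
    if html.strip():
        return html
    return email_row.get("snippet") or ""


def enforce_numeric_amount(ride):
    """Keep `amount` only if it is a number; otherwise null it.

    A textual amount is preserved verbatim in `total_text` when that
    field is still null, so no information is dropped.
    """
    fixed = dict(ride)
    amount = fixed.get("amount")
    if isinstance(amount, (int, float)) and not isinstance(amount, bool):
        return fixed
    if isinstance(amount, str) and fixed.get("total_text") is None:
        fixed["total_text"] = amount
    fixed["amount"] = None
    return fixed


print(choose_source_text({"text_html": "   ", "snippet": "Thanks for riding"}))
print(enforce_numeric_amount({"amount": "12,34 zł", "total_text": None}))
```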

3) Insert extracted rides → SQLite

python3 skills/ride-receipts-llm/scripts/insert_rides_sqlite_jsonl.py \
  --db ./data/rides.sqlite \
  --schema skills/ride-receipts-llm/references/schema_rides.sql \
  --rides-jsonl ./data/rides_extracted.jsonl

Inserts are idempotent: the schema declares UNIQUE(provider, gmail_message_id) ON CONFLICT REPLACE, so re-running this step replaces existing rows instead of duplicating them.
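
The effect of a UNIQUE ... ON CONFLICT REPLACE constraint can be seen with an in-memory database. The table below is a deliberately reduced stand-in named rides_demo, not the contents of schema_rides.sql; only the constraint itself is taken from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical three-column table; the real schema has more columns.
conn.execute("""
    CREATE TABLE rides_demo (
        provider TEXT NOT NULL,
        gmail_message_id TEXT NOT NULL,
        amount REAL,
        UNIQUE (provider, gmail_message_id) ON CONFLICT REPLACE
    )
""")


def insert_ride(provider, message_id, amount):
    """Plain INSERT; the table constraint turns duplicates into replacements."""
    conn.execute(
        "INSERT INTO rides_demo (provider, gmail_message_id, amount) "
        "VALUES (?, ?, ?)",
        (provider, message_id, amount),
    )


insert_ride("Uber", "msg-1", 10.0)
insert_ride("Uber", "msg-1", 12.5)   # same key: replaces the first row
insert_ride("Bolt", "msg-1", 7.0)    # different provider: a separate row

rows = conn.execute(
    "SELECT provider, gmail_message_id, amount FROM rides_demo ORDER BY provider"
).fetchall()
print(rows)  # [('Bolt', 'msg-1', 7.0), ('Uber', 'msg-1', 12.5)]
```

Because the key is the pair (provider, gmail_message_id), re-extracting the same email and inserting again leaves one row per message, holding the most recently inserted values.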

Files

6 total
