Install
openclaw skills install tally-extractor

Instance A skill for TallyPrime (Extractor). Parses invoices/bills from PDF/image via Telegram/WhatsApp, extracts structured data (party, GSTIN, items, taxes, totals), and POSTs canonical JSON to the bridge service on Instance B. Does NOT post to Tally directly; that is handled by tally-skill on Instance B.

Extract invoice/bill data from PDFs and images received via Telegram or WhatsApp, convert to a canonical JSON payload, and POST to the bridge service on Instance B for Tally entry.

Goal: zero manual entry for CAs handling many clients.

Scope note: This skill is for Instance A (Extractor). It does not post to Tally; that responsibility belongs to tally-skill on Instance B.
Use when:
Do NOT use this skill for:
- … (tallyca runs on Instance B)

Extraction rules:

- GSTIN format: must match ^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$. If invalid, flag it.
- Tax math: taxable_amount × tax_rate / 100 ≈ tax_amount (within ±0.01). If mismatch, flag it.
- Place of supply: use the reference/extraction-schema.md state code table.
- Dates: normalize to YYYY-MM-DD. Common formats on invoices: DD/MM/YYYY, DD-MM-YYYY, DD.MM.YYYY, MMM DD, YYYY.
- Idempotency key: {companyShort}-{voucherType}-{invoiceNumber}-{date}.

User sends PDF or image via Telegram. The document is available for OCR/vision processing.
Before extraction, verify Instance B is reachable:
curl -s -X GET \
  -H "Authorization: Bearer $BRIDGE_BEARER" \
  "$BRIDGE_URL/v1/health"
Expected response:
{
  "tally": "ok",
  "company_default": "ABC Company",
  "version": "1.0.0"
}
If tally: "down", inform user: "Tally is not running on the client machine. Please ask them to open TallyPrime."
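The same preflight can be sketched in Python using only the standard library. The function names `fetch_health` and `health_message` are hypothetical, and treating every value other than "ok" as "not ready" is an assumption; the text above only names the "down" case.

```python
import json
import urllib.request
from typing import Optional


def fetch_health(bridge_url: str, bearer: str, timeout: float = 10.0) -> dict:
    """GET {bridge_url}/v1/health with the bearer token and return the parsed JSON."""
    req = urllib.request.Request(
        f"{bridge_url}/v1/health",
        headers={"Authorization": f"Bearer {bearer}"},
        method="GET",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def health_message(health: dict) -> Optional[str]:
    """Return a user-facing warning if Tally is not ready, else None.

    Assumption: anything other than tally == "ok" is handled like "down".
    """
    if health.get("tally") == "ok":
        return None
    return (
        "Tally is not running on the client machine. "
        "Please ask them to open TallyPrime."
    )
```

Keeping the network call separate from the interpretation makes the message logic easy to test without a live bridge.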
Parse the document and extract:
| Field | Source | Validation |
|---|---|---|
| Party name | Invoice header | Required |
| Party GSTIN | Near party name or GST section | 15-char format |
| Company GSTIN | Near company name or letterhead | 15-char format |
| Invoice number | Invoice header | Required |
| Invoice date | Invoice header | Parse to YYYY-MM-DD |
| Items | Line items table | At least one |
| Item description | Line item | Required per item |
| HSN/SAC | Line item | 4-8 digits |
| Quantity | Line item | Numeric |
| Unit | Line item | Optional |
| Rate | Line item | Numeric |
| Tax rate | Line item or GST section | Percentage |
| CGST/SGST/IGST amounts | GST section | Numeric |
| Total | Invoice footer | Required, must balance |
| Narration | Notes section | Optional |

Map the document type to a voucher type:

| Document type | Voucher type |
|---|---|
| Purchase invoice (we are buyer) | Purchase |
| Sales invoice (we are seller) | Sales |
| Payment receipt | Receipt |
| Payment voucher | Payment |
| Credit note | CreditNote |
| Debit note | DebitNote |
Clues:
If ambiguous, ask user: "Is this a purchase (you bought) or sales (you sold)?"
Run these checks before sending to bridge:
| Check | Rule | Action if fails |
|---|---|---|
| Required fields | All required fields present | List missing fields, ask user |
| GSTIN format | 15-char regex match | Flag invalid, ask user |
| HSN format | 4-8 digits | Flag invalid, ask user |
| Tax math | taxable × rate / 100 ≈ tax | Flag mismatch, ask user |
| Total balance | items + taxes = total (±1) | Flag mismatch, ask user |
| Date parseable | Valid date | Ask user for correct date |
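The checks in this table could be sketched as below. This is an illustration under stated assumptions, not the skill's actual implementation: the function names and the flat argument list are hypothetical (the real payload shape lives in reference/extraction-schema.md), and the month-abbreviation format assumes an English locale.

```python
import re
from datetime import datetime
from typing import List, Optional

# GSTIN pattern as quoted in this document; HSN/SAC is 4 to 8 digits.
GSTIN_RE = re.compile(r"^[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}Z[0-9A-Z]{1}$")
HSN_RE = re.compile(r"^[0-9]{4,8}$")

# strptime patterns for DD/MM/YYYY, DD-MM-YYYY, DD.MM.YYYY and "MMM DD, YYYY".
# "%b" is locale-dependent; an English locale is assumed here.
DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y", "%b %d, %Y")


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no listed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def tax_matches(taxable: float, rate: float, tax: float, tol: float = 0.01) -> bool:
    """taxable x rate / 100 should equal the stated tax within the tolerance."""
    return abs(taxable * rate / 100 - tax) <= tol


def total_balances(items: float, taxes: float, total: float, tol: float = 1.0) -> bool:
    """Items plus taxes should equal the invoice total within the tolerance."""
    return abs(items + taxes - total) <= tol


def run_checks(gstin: str, hsn_codes: List[str], taxable: float, rate: float,
               tax: float, total: float, raw_date: str) -> List[str]:
    """Return the names of the failed checks, in table order."""
    issues = []
    if not GSTIN_RE.fullmatch(gstin):
        issues.append("GSTIN format")
    if not all(HSN_RE.fullmatch(code) for code in hsn_codes):
        issues.append("HSN format")
    if not tax_matches(taxable, rate, tax):
        issues.append("Tax math")
    if not total_balances(taxable, tax, total):
        issues.append("Total balance")
    if normalize_date(raw_date) is None:
        issues.append("Date parseable")
    return issues
```

An empty list means the voucher can go to the bridge; otherwise each name maps to the "Action if fails" column. Floats keep the sketch short; exact decimal arithmetic would be a safer choice for amounts that sit right on a tolerance boundary.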
Construct the JSON payload per reference/extraction-schema.md:
{
  "schema_version": "1.0",
  "request_id": "uuid-v4",
  "idempotency_key": "abc-purchase-xyz-186-20260518",
  "company": "ABC Company",
  "voucher": {
    "type": "Purchase",
    "date": "2026-05-18",
    "number": "186",
    "is_invoice_mode": true,
    "voucher_class": null,
    "narration": "Against Invoice 186",
    "party": {
      "name": "XYZ Party",
      "gstin": "27AABCU9603R1ZM",
      "place_of_supply": "Maharashtra",
      "registration_type": "Regular"
    },
    "company_gstin": "27AABCU9603R1ZN",
    "items": [...],
    "taxes": {...},
    "total": 46199.83,
    "bill_allocations": [...]
  },
  "source": {
    "kind": "pdf",
    "filename": "invoice_186.pdf",
    "extracted_at": "2026-05-18T10:30:00Z"
  },
  "confidence": {
    "overall": 0.93,
    "fields": {...}
  }
}
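One way to build the `idempotency_key` is sketched below, following the `{companyShort}-{voucherType}-{invoiceNumber}-{date}` pattern stated earlier in this document. Several details are assumptions inferred from the sample key rather than confirmed rules: lowercasing, stripping non-alphanumeric characters, and writing the date as YYYYMMDD. The sample key also carries a party segment ("xyz") that the pattern does not list, so the sketch exposes it as an optional, hypothetical `party_short` argument.

```python
import re
from typing import Optional


def slug(text: str) -> str:
    """Lowercase and drop everything except a-z and 0-9 (assumed normalization)."""
    return re.sub(r"[^a-z0-9]+", "", text.lower())


def idempotency_key(company_short: str, voucher_type: str, invoice_number: str,
                    iso_date: str, party_short: Optional[str] = None) -> str:
    """Join the key segments with hyphens; iso_date is expected as YYYY-MM-DD."""
    parts = [slug(company_short), slug(voucher_type)]
    if party_short:
        # Not part of the documented pattern; present only in the sample key.
        parts.append(slug(party_short))
    parts.append(slug(invoice_number))
    parts.append(iso_date.replace("-", ""))
    return "-".join(parts)
```

Because the key is derived only from invoice fields, re-sending the same document produces the same key, which is what lets the bridge recognize a duplicate.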
Send the JSON to Instance B:
# Compute the HMAC-SHA256 signature over the exact request body
BODY='{"schema_version":"1.0",...}'
# printf '%s' writes the body without a trailing newline and is more portable than echo -n
SIGNATURE=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$BRIDGE_HMAC_SECRET" | cut -d' ' -f2)
# POST the same bytes that were signed
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $BRIDGE_BEARER" \
  -H "X-Signature: hmac-sha256=$SIGNATURE" \
  -H "Idempotency-Key: abc-purchase-xyz-186-20260518" \
  -d "$BODY" \
  "$BRIDGE_URL/v1/post-voucher"
Full HTTP contract in reference/bridge.md.
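For callers that are not shell scripts, the signing step might look like this in Python. It is a sketch, assuming the bridge verifies a hex HMAC-SHA256 over the raw request body, as the shell example suggests; reference/bridge.md remains the authority. The helper names `sign_body` and `build_request` are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple


def sign_body(body: bytes, secret: str) -> str:
    """Return the X-Signature header value for the given raw body bytes."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"hmac-sha256={digest}"


def build_request(payload: dict, bearer: str, secret: str) -> Tuple[bytes, Dict[str, str]]:
    """Serialize the payload once, then sign exactly those bytes."""
    body = json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {bearer}",
        "X-Signature": sign_body(body, secret),
        "Idempotency-Key": payload["idempotency_key"],
    }
    return body, headers
```

The body is serialized a single time and the same bytes are both signed and sent; re-serializing the payload after signing could change whitespace or key order and invalidate the signature.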
{
  "status": "posted",
  "guid": "abc-purchase-xyz-186-20260518",
  "voucher_number": "186",
  "company": "ABC Company",
  "summary": "Purchase voucher posted: XYZ Party, ₹46,199.83",
  "masters_created": ["XYZ Party"]
}
Reply to user (see reference/prompts.md):
Entry posted to Tally.
Company: ABC Company
Type: Purchase
Party: XYZ Party
Invoice No: 186
Date: 18 May 2026
Amount: ₹46,199.83 (Taxable: ₹39,152.40 + IGST: ₹7,047.43)
New ledger created: XYZ Party
{
  "status": "needs_clarification",
  "missing_fields": ["voucher.voucher_class"],
  "message": "Please confirm the voucher class name (e.g., 'Purchase @ 18 %')."
}
Forward the question to the user.
{
  "status": "error",
  "error_code": "TALLY_UNREACHABLE",
  "message": "Could not connect to Tally."
}
Reply to user: "Tally is not responding. Please check that TallyPrime is open and try again."
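The three response shapes shown in this document (posted, needs_clarification, error) can be routed with a small dispatcher. This is a sketch: `reply_for` is a hypothetical name, the short "posted" reply built from the `summary` field stands in for the full template in reference/prompts.md, and the fallback messages for unknown statuses or error codes are assumptions.

```python
def reply_for(response: dict) -> str:
    """Map a bridge response to the text sent back to the user."""
    status = response.get("status")
    if status == "posted":
        # Shortened reply; the full wording lives in reference/prompts.md.
        return f"Entry posted to Tally. {response.get('summary', '')}".strip()
    if status == "needs_clarification":
        # Forward Instance B's question unchanged.
        return response.get("message", "Instance B needs more information.")
    if status == "error":
        if response.get("error_code") == "TALLY_UNREACHABLE":
            return ("Tally is not responding. Please check that TallyPrime "
                    "is open and try again.")
        return response.get("message", "The bridge reported an error.")
    return "Unexpected response from the bridge."
```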
The company name must match exactly in TallyPrime. Strategies:
- /v1/health returns company_default

Some Tally companies use voucher classes for automatic GST splitting. This skill does NOT know which class to use; that is Instance B's job.

- Send voucher_class: null and let Instance B handle it
- If Instance B returns needs_clarification for the class, forward the question to the user

Assign confidence scores based on:
| Extraction quality | Confidence |
|---|---|
| Clear text, high contrast, exact match | 0.95 - 1.0 |
| Readable but some ambiguity | 0.7 - 0.94 |
| Blurry, low contrast, guessed | 0.4 - 0.69 |
| Highly uncertain | 0.0 - 0.39 |
Fields with confidence < 0.7 should be flagged for user confirmation before posting.
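A minimal sketch of the flagging rule follows. The field names are hypothetical because the `fields` object is elided in the sample payload, and the band labels are shortened paraphrases of the table rows. The table leaves small gaps between bands (for example between 0.94 and 0.95); the sketch closes them by using lower bounds only, which is an assumption.

```python
from typing import Dict, List

CONFIRM_BELOW = 0.7  # threshold stated in this document


def confidence_band(score: float) -> str:
    """Map a score to a band label using the lower bound of each table row."""
    if score >= 0.95:
        return "clear"
    if score >= 0.7:
        return "some ambiguity"
    if score >= 0.4:
        return "blurry or guessed"
    return "highly uncertain"


def fields_to_confirm(field_scores: Dict[str, float],
                      threshold: float = CONFIRM_BELOW) -> List[str]:
    """Return the fields the user should confirm before posting, sorted by name."""
    return sorted(name for name, score in field_scores.items() if score < threshold)
```

For example, a GSTIN read at 0.62 would be sent back to the user for confirmation, while a total read at 0.98 would pass through.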
First 2 digits of GSTIN → Place of Supply:
| Code | State |
|---|---|
| 01 | Jammu and Kashmir |
| 02 | Himachal Pradesh |
| 03 | Punjab |
| 04 | Chandigarh |
| 05 | Uttarakhand |
| 06 | Haryana |
| 07 | Delhi |
| 08 | Rajasthan |
| 09 | Uttar Pradesh |
| 10 | Bihar |
| 11 | Sikkim |
| 12 | Arunachal Pradesh |
| 13 | Nagaland |
| 14 | Manipur |
| 15 | Mizoram |
| 16 | Tripura |
| 17 | Meghalaya |
| 18 | Assam |
| 19 | West Bengal |
| 20 | Jharkhand |
| 21 | Odisha |
| 22 | Chhattisgarh |
| 23 | Madhya Pradesh |
| 24 | Gujarat |
| 25 | Daman and Diu |
| 26 | Dadra and Nagar Haveli |
| 27 | Maharashtra |
| 28 | Andhra Pradesh (Old) |
| 29 | Karnataka |
| 30 | Goa |
| 31 | Lakshadweep |
| 32 | Kerala |
| 33 | Tamil Nadu |
| 34 | Puducherry |
| 35 | Andaman and Nicobar Islands |
| 36 | Telangana |
| 37 | Andhra Pradesh |
| 38 | Ladakh |
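The lookup can be sketched as a dictionary keyed on the first two GSTIN characters. The mapping below copies the table above; the function name `place_of_supply` is hypothetical, and returning `None` for an unknown code (so the caller can ask the user) is an assumption.

```python
from typing import Optional

# State codes copied from the table in this document.
STATE_CODES = {
    "01": "Jammu and Kashmir", "02": "Himachal Pradesh", "03": "Punjab",
    "04": "Chandigarh", "05": "Uttarakhand", "06": "Haryana", "07": "Delhi",
    "08": "Rajasthan", "09": "Uttar Pradesh", "10": "Bihar", "11": "Sikkim",
    "12": "Arunachal Pradesh", "13": "Nagaland", "14": "Manipur",
    "15": "Mizoram", "16": "Tripura", "17": "Meghalaya", "18": "Assam",
    "19": "West Bengal", "20": "Jharkhand", "21": "Odisha",
    "22": "Chhattisgarh", "23": "Madhya Pradesh", "24": "Gujarat",
    "25": "Daman and Diu", "26": "Dadra and Nagar Haveli",
    "27": "Maharashtra", "28": "Andhra Pradesh (Old)", "29": "Karnataka",
    "30": "Goa", "31": "Lakshadweep", "32": "Kerala", "33": "Tamil Nadu",
    "34": "Puducherry", "35": "Andaman and Nicobar Islands",
    "36": "Telangana", "37": "Andhra Pradesh", "38": "Ladakh",
}


def place_of_supply(gstin: str) -> Optional[str]:
    """Return the state for a GSTIN's two-digit prefix, or None if unknown."""
    return STATE_CODES.get(gstin[:2])
```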
There is exactly one chat interface (Telegram on Instance A). Instance B has no Telegram/WhatsApp — only the bridge HTTP endpoint.
flowchart LR
User["CA / client on Telegram"] -->|"PDF / image"| TgBot["Telegram bot (Instance A)"]
subgraph A["Instance A on dev EC2"]
TgBot --> OpenClawA["OpenClaw\nLLM: Codex Plus session"]
OpenClawA --> ExtractorSkill["tally-extractor-skill"]
end
OpenClawA -->|"HTTPS POST /v1/post-voucher"| Bridge["bridge-service\n(client box)"]
subgraph B["Instance B on client CPU"]
Bridge --> OpenClawB["OpenClaw\nLLM: OpenAI API key"]
OpenClawB --> TallySkill["tally-skill"]
TallySkill --> Tally["TallyPrime\nlocalhost:9000"]
end
Bridge --> OpenClawA
OpenClawA --> User
| Environment | Instance A | Instance B | Tally access |
|---|---|---|---|
| Production | Your EC2 (Telegram, Codex Plus) | Client mini-PC beside Tally | B → http://localhost:9000 (no ngrok for Tally) |
| Dev / testing | Same Ubuntu EC2 | Second OpenClaw on same EC2 | B uses ngrok URL to remote Tally |
Instance A reaches B via an inbound tunnel on the client box (ngrok http 8787 or Cloudflare Tunnel). Share BRIDGE_URL, BRIDGE_BEARER, and BRIDGE_HMAC_SECRET with the team running A.
| Step | Setting | Value |
|---|---|---|
| 1 | Host | Ubuntu EC2 (or dev server) |
| 2 | Install | Node.js, OpenClaw |
| 3 | LLM | OpenAI ChatGPT Codex Plus session (subscription) |
| 4 | Skill loaded | tally-extractor-skill/ only — do not load tally-skill/ |
| 5 | Channel | Telegram bot token on A only |
| 6 | BRIDGE_URL | https://<client-tunnel>.ngrok.app (no trailing slash) |
| 7 | BRIDGE_BEARER | Shared secret from client team |
| 8 | BRIDGE_HMAC_SECRET | Shared HMAC secret from client team |
| 9 | Do not set | TALLY_URL on Instance A |
| 10 | Preflight | curl -H "Authorization: Bearer $BRIDGE_BEARER" $BRIDGE_URL/v1/health → tally: ok |
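Before running the preflight, the environment settings from this table could be sanity-checked with a sketch like the following. `check_bridge_env` is a hypothetical helper; it only encodes the rules listed above (three required variables, no trailing slash on BRIDGE_URL, TALLY_URL unset on Instance A).

```python
import os
from typing import List, Mapping, Optional


def check_bridge_env(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of configuration problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []
    for name in ("BRIDGE_URL", "BRIDGE_BEARER", "BRIDGE_HMAC_SECRET"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    if env.get("BRIDGE_URL", "").endswith("/"):
        problems.append("BRIDGE_URL must not end with a trailing slash")
    if env.get("TALLY_URL"):
        problems.append("TALLY_URL must not be set on Instance A")
    return problems
```

Passing a plain dictionary instead of the process environment makes the check easy to exercise in tests.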
- reference/extraction-schema.md: Full JSON schema, examples
- reference/bridge.md: Endpoints, auth, errors
- reference/prompts.md: User-facing messages