# Local-First LLM
Route requests to a local LLM first; fall back to cloud only when necessary. Track every decision to show real token and cost savings.
## Quick Start

### 1. Check if a local LLM is running

```bash
python3 skills/local-first-llm/scripts/check_local.py
```

Returns JSON:

```json
{ "any_available": true, "best": { "provider": "ollama", "models": [...] } }
```
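To run the same probe inline rather than shelling out, a minimal sketch (assuming the default ports used by the curl examples below; the returned shape is simplified relative to `check_local.py`):

```python
import urllib.request

# Default endpoints for the providers this skill supports; the ports match
# the curl examples later in this document. Adjust if yours differ.
PROBES = {
    "ollama": "http://localhost:11434/api/tags",    # lists installed models
    "lmstudio": "http://localhost:1234/v1/models",  # OpenAI-compatible servers
}

def check_local(timeout: float = 1.0) -> dict:
    """Return which local providers answered (simplified vs. check_local.py)."""
    up = []
    for provider, url in PROBES.items():
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                up.append(provider)
        except OSError:
            pass  # nothing listening on that port
    return {"any_available": bool(up), "providers": up}

print(check_local())
```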
### 2. Route a request

```bash
python3 skills/local-first-llm/scripts/route_request.py \
  --prompt "Summarize this meeting transcript" \
  --tokens 800 \
  --local-available \
  --local-provider ollama
```

Returns:

```json
{ "decision": "local", "reason": "...", "complexity_score": -1 }
```
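From Python, the same step can be wrapped with `subprocess`; the flags are the ones shown above, and the wrapper assumes the script prints its JSON decision to stdout:

```python
import json
import subprocess

def route(prompt: str, tokens: int, local_available: bool,
          provider: str = "ollama") -> dict:
    """Run route_request.py and parse the JSON decision it prints."""
    cmd = [
        "python3", "skills/local-first-llm/scripts/route_request.py",
        "--prompt", prompt,
        "--tokens", str(tokens),
        "--local-provider", provider,
    ]
    if local_available:
        cmd.append("--local-available")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

decision = route("Summarize this meeting transcript", 800, local_available=True)
print(decision["decision"], "-", decision["reason"])
```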
### 3. Log the outcome

After executing the request, record it:

```bash
python3 skills/local-first-llm/scripts/track_savings.py log \
  --tokens 800 \
  --model gpt-4o \
  --routed-to local
```
### 4. Show the dashboard

```bash
python3 skills/local-first-llm/scripts/dashboard.py
```
## Full Routing Workflow

```
┌─────────────────────────────────────────────────────┐
│ 1. check_local.py    → is a local provider running? │
│                                                     │
│ 2. route_request.py  → local or cloud?              │
│    - sensitivity check  (private data → local)      │
│    - complexity score   (high score → cloud)        │
│    - availability gate  (no local → cloud)          │
│                                                     │
│ 3. Execute with the chosen provider                 │
│                                                     │
│ 4. track_savings.py log → record the outcome        │
│                                                     │
│ 5. dashboard.py      → show cumulative savings      │
└─────────────────────────────────────────────────────┘
```
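Chained together, the five steps might be driven like the sketch below. The script paths and flags are the ones from Quick Start; `run_local`/`run_cloud` are hypothetical stand-ins for your own provider calls:

```python
import json
import subprocess

SCRIPTS = "skills/local-first-llm/scripts"

def run_skill(script: str, *args: str) -> dict:
    """Invoke one of the skill's scripts and parse the JSON it prints."""
    result = subprocess.run(
        ["python3", f"{SCRIPTS}/{script}", *args],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def handle(prompt: str, tokens: int, cloud_model: str = "gpt-4o") -> None:
    # 1. Is a local provider running?
    status = run_skill("check_local.py")

    # 2. Local or cloud?
    args = ["--prompt", prompt, "--tokens", str(tokens)]
    if status["any_available"]:
        args += ["--local-available", "--local-provider",
                 status["best"]["provider"]]
    decision = run_skill("route_request.py", *args)

    # 3. Execute with the chosen provider. run_local/run_cloud are
    #    hypothetical; see "Executing with a Local Provider" below.
    # text = run_local(prompt) if decision["decision"] == "local" else run_cloud(prompt)

    # 4. Record the outcome.
    subprocess.run(
        ["python3", f"{SCRIPTS}/track_savings.py", "log",
         "--tokens", str(tokens), "--model", cloud_model,
         "--routed-to", decision["decision"]],
        check=True,
    )

handle("Summarize this meeting transcript", 800)
```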
## Routing Rules (Summary)

| Condition | Route |
|---|---|
| No local provider available | ☁️ Cloud |
| Prompt contains sensitive data (password, secret, api key, ssn, etc.) | 🏠 Local |
| Complexity score ≥ 3 | ☁️ Cloud |
| Complexity score < 3 | 🏠 Local |
For full scoring details, see `references/routing-logic.md`.
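As a rough illustration of how these rules combine (the keyword list, threshold, and rule order below are assumptions for the sketch; the authoritative versions live in `route_request.py` and `references/routing-logic.md`):

```python
# Illustrative only: the real keyword list and scoring live in
# route_request.py and references/routing-logic.md.
SENSITIVE = ("password", "secret", "api key", "ssn")
CLOUD_THRESHOLD = 3  # complexity score >= 3 routes to cloud

def decide(prompt: str, complexity_score: int, local_available: bool) -> str:
    if not local_available:
        return "cloud"  # availability gate
    if any(k in prompt.lower() for k in SENSITIVE):
        return "local"  # sensitive data never leaves the machine
    return "cloud" if complexity_score >= CLOUD_THRESHOLD else "local"

assert decide("what is my api key", 5, local_available=True) == "local"
assert decide("write a formal proof", 5, local_available=True) == "cloud"
```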
## Executing with a Local Provider

Once `route_request.py` returns `"decision": "local"`, send the request:

### Ollama

```bash
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "YOUR_PROMPT", "stream": false}'
```
### LM Studio / llamafile (OpenAI-compatible)

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "local-model", "messages": [{"role": "user", "content": "YOUR_PROMPT"}]}'
```
## Dashboard

The dashboard reads from `~/.openclaw/local-first-llm/savings.json` (auto-created).
```
┌─────────────────────────────────────────┐
│ 🧠 Local-First LLM — Dashboard          │
├─────────────────────────────────────────┤
│ Local LLM: ✅ ollama (llama3.2...)      │
├─────────────────────────────────────────┤
│ Total requests:  42                     │
│ Routed locally:  31 (73.8%)             │
│ Routed to cloud: 11                     │
├─────────────────────────────────────────┤
│ Tokens saved: 84,200                    │
│ Cost saved:   $0.4210                   │
└─────────────────────────────────────────┘
```
Reset savings data:

```bash
python3 skills/local-first-llm/scripts/track_savings.py reset
```
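The ledger format is internal to `track_savings.py`, but the arithmetic behind "Cost saved" is simple: tokens that stayed local, priced at the cloud model's per-token rate. A sketch with an assumed entry shape and an assumed $5 per million tokens for gpt-4o (chosen to match the dashboard figures above; the real rate table is in `references/token-estimation.md`):

```python
# The price and the entry shape below are assumptions for illustration;
# the real rate table lives in references/token-estimation.md.
PRICE_PER_MTOK = {"gpt-4o": 5.00}  # dollars per million tokens

def savings(entries: list[dict]) -> tuple[int, float]:
    """Tokens and dollars avoided across locally-routed entries."""
    local = [e for e in entries if e["routed_to"] == "local"]
    tokens = sum(e["tokens"] for e in local)
    dollars = sum(e["tokens"] * PRICE_PER_MTOK[e["model"]] / 1_000_000
                  for e in local)
    return tokens, round(dollars, 4)

# 84,200 tokens at $5/M tokens -> $0.4210, matching the dashboard above.
print(savings([{"routed_to": "local", "tokens": 84_200, "model": "gpt-4o"}]))
```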
## Additional References

- Routing scoring details: `references/routing-logic.md`
- Local provider setup (Ollama, LM Studio, llamafile): `references/local-providers.md`
- Token estimation & cloud cost table: `references/token-estimation.md`
