Install
openclaw skills install project-nirvana-pluginLocal-first privacy-first inference. Your OpenClaw agent thinks locally and asks the cloud intelligently. Saves 85%+ tokens, protects privacy, agent learns from cloud responses—cloud doesn't learn from you.
openclaw skills install project-nirvana-pluginA new way of thinking about LLM access. Your agent thinks locally, asks the cloud intelligently, and learns from the response. The cloud never sees your private data.
Today's approach leaks your privacy and wastes 85% of your API budget.
Every time you ask your OpenClaw agent a question:
Your agent builds a "system prompt" containing:
All of this gets sent to cloud APIs (OpenAI, Anthropic, Google)
You pay for thousands of extra tokens
The cloud provider trains its next model on your private data
This is the current default. It's inefficient and it's a privacy disaster.
Local-first inference that protects privacy and slashes costs.
Nirvana flips the paradigm:
| Aspect | Today (Default) | Nirvana |
|---|---|---|
| Where thinking happens | Cloud only | Local first, cloud when needed |
| What gets sent to cloud | Your full context + system prompts | Agent's sanitized query only |
| Who learns from your data | Cloud provider | You (local agent) |
| Token cost per interaction | 2,000–5,000 tokens | 50–300 tokens |
| Savings | — | 85%+ token reduction |
| Privacy | Leaked | Protected |
@local or @cloud hints respectedUser asks your agent a question
↓
┌─────────────────────────────────────────┐
│ Nirvana Router │
│ "Can qwen2.5:7b answer this locally?" │
└─────────────────────────────────────────┘
↙ ↘
[LOCAL PATH] [CLOUD PATH]
80%+ of queries 20%- of queries
Ollama (qwen2.5:7b) OpenAI/Anthropic/Google
Free Pay for answer
Private Cloud sees sanitized query only
~1s latency ~3s latency
Result cached locally Result cached locally
↓ ↓
└─────────────┬─────────────────────┘
↓
Agent answers your question using:
- Local inference (primary)
- Cloud intelligence (if needed)
- Cached knowledge (if available)
YOUR PRIVATE DATA NEVER LEFT YOUR SYSTEM
# Install the plugin
clawhub install shivaclaw/nirvana
# Start Ollama container (pulls auto on first run)
docker run -d -p 11434:11434 ollama/ollama
# Verify
openclaw nirvana status
# Install the skill (context stripping only)
clawhub install shivaclaw/nirvana-local
# Configure endpoint
openclaw nirvana configure --local-endpoint http://your-llm:5000
# Verify
openclaw nirvana status
| Scenario | Today | With Nirvana | Savings |
|---|---|---|---|
| 10 questions/day | 20,000 tokens/day | 3,000 tokens/day | 85% |
| 100 questions/day | 200,000 tokens/day | 30,000 tokens/day | 85% |
| Monthly cost (OpenAI GPT-4) | $500–$1,000 | $75–$150 | 85% |
Local inference is free. Only pay for the 15%–20% of queries that truly need frontier models.
# View what was sent to cloud this session
openclaw nirvana audit-log
# Output:
# 2026-04-24 14:23:45 — CLOUD API CALL
# Original query: [REDACTED]
# Sanitized query sent: "Explain quantum entanglement"
# Response cached: Yes
# User data in request: None
| Platform | Status | Notes |
|---|---|---|
| Linux (Ubuntu/Debian) | ✅ Full | Ollama container + native binaries |
| macOS (Intel/ARM) | ✅ Full | Ollama via Docker or native |
| Windows (WSL2) | ✅ Full | Ollama in WSL2 container |
| VPS (Hostinger, DigitalOcean, AWS) | ✅ Full | Docker Compose ready |
| Docker container | ✅ Full | Orchestrated via docker-compose |
| Air-gapped (offline) | ✅ Full | Local-only mode (no cloud fallback) |
{
"nirvana": {
"mode": "local-first",
"local_model": {
"provider": "ollama",
"endpoint": "http://ollama:11434",
"model": "qwen2.5:7b",
"timeout_ms": 180000
},
"routing": {
"local_threshold": 0.75,
"max_local_tokens": 8000,
"cloud_fallback": true
},
"privacy": {
"strip_soul": true,
"strip_user": true,
"strip_memory": true,
"audit_logging": true
}
}
}
{
"nirvana": {
"local_model": {
"provider": "custom",
"endpoint": "http://your-llm-server:5000",
"api_format": "openai-compatible",
"model": "your-model-name",
"timeout_ms": 120000
}
}
}
Your agent should train itself. The cloud should not train on you.
Today's default paradigm:
Nirvana's paradigm:
| Component | Purpose |
|---|---|
| router.ts | Decides local vs cloud routing |
| context-stripper.ts | Removes private data before cloud API calls |
| privacy-auditor.ts | Logs all boundary crossings |
| response-integrator.ts | Caches cloud responses locally |
| ollama-manager.ts | Handles Ollama lifecycle + model management |
| metrics-collector.ts | Tracks performance + cost + privacy |
| config.schema.json | Configuration validation |
MIT-0 — Free to use, modify, and redistribute. No attribution required.
Your agent deserves privacy. Nirvana makes it real.