Install
openclaw skills install clawsaverReduce model API costs by 20–40% through intelligent message batching. Buffer related messages, send once.
openclaw skills install clawsaverReduce model API costs by 20–40% through intelligent message batching and buffering.
Most agent systems waste money on redundant API calls. When users send follow-up messages, you call the model separately for each one. ClawSaver fixes this by waiting ~800ms to collect related messages, then sending them together in a single optimized request. Same response quality. Lower cost. No user friction.
WITHOUT CLAWSAVER (Context Overhead Hidden):
User: "What is ML?"
Model: → API Call #1 [Context: system prompt, chat history] (cost: $X)
Returns: definition
User: "Give an example"
Model: → API Call #2 [Context: system prompt, chat history, Q1, A1] (cost: $X)
Returns: example
User: "Apply to finance?"
Model: → API Call #3 [Context: system prompt, chat history, Q1–A2] (cost: $X)
Returns: finance application
Total: 3 calls × full context = 3X cost, each call repeats context overhead
───────────────────────────────────────
WITH CLAWSAVER (Single Context Load):
User: "What is ML?" ← Buffer (800ms wait)
User: "Give an example" ← Buffer (800ms wait)
User: "Apply to finance?" ← Flush: Send all 3 together
Model: → API Call #1 [Context loaded ONCE: system prompt, chat history]
Processes all 3 questions together
Returns: comprehensive answer addressing all three
Total: 1 call × full context = 1X cost, context overhead paid once
Actual savings (with context): 67% reduction
Cost per token: 1/3 (fewer context re-loads + consolidation)
Why it matters: Context (system prompts, history, instructions) gets re-sent on every API call. With ClawSaver, you pay that context overhead once per batch instead of three times. This compounds the savings beyond just "fewer calls."
Example (4K token context, 200 output tokens):
User: "What is machine learning?"
(pause)
User: "Give an example"
(pause)
User: "How does that apply to healthcare?"
Without optimization: 3 API calls = 3x cost
With ClawSaver: 1 batched call = 1/3 the price
Across thousands of conversations, this compounds fast.
Why users don't notice: They're already waiting for your model response. Buffering input doesn't feel slower because the response comes right after the batch sends.
clawhub install clawsaver
import SessionDebouncer from 'clawsaver';
const debouncers = new Map();
function handleMessage(userId, text) {
if (!debouncers.has(userId)) {
debouncers.set(userId, new SessionDebouncer(
userId,
(msgs) => callModel(userId, msgs)
));
}
debouncers.get(userId).enqueue({ text });
}
| Metric | Value |
|---|---|
| Cost reduction | 20–40% typical |
| Setup time | 10 minutes |
| Code added | ~10 lines |
| Dependencies | 0 |
| File size | 4.2 KB |
| Latency added | +800ms (user-imperceptible) |
| Maintenance | None |
Choose based on your use case:
✅ Chat applications
✅ Customer support bots
✅ Multi-turn Q&A
✅ Any conversation with follow-ups
❌ Single-request workflows
❌ Sub-100ms response requirements
new SessionDebouncer(userId, handler, {
debounceMs: 800, // wait time
maxWaitMs: 3000, // absolute max
maxMessages: 5, // batch size cap
maxTokens: 2048 // reserved
})
// Methods
debouncer.enqueue(message) // add to batch
debouncer.forceFlush(reason) // send now
debouncer.getState() // buffer + metrics
debouncer.getStatusString() // human-readable
Data never leaves your machine.
MIT
Start here: Pick your path in START_HERE.md, or jump to QUICKSTART.md for 5-minute setup.