Install
openclaw skills install @nelmaz/inception-token-optimizer

Optimize Inception Labs token usage to minimize costs. Use when choosing Inception models (Mercury, etc.), crafting prompts for Inception, analyzing token consumption, or when the user wants to reduce API costs. Covers caching strategies, context pruning, prompt compression, model selection tips, and free-tier budget management.
Reduce Inception API token consumption through prompt engineering, context management, and budget enforcement.
| Metric | Cap |
|---|---|
| Requests/min | 100 |
| Input tokens/min | 100,000 |
| Output tokens/min | 10,000 |
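
The three caps in the table interact: whichever one is reached first limits throughput. The sketch below works through that arithmetic, assuming every call uses the same number of input and output tokens. The constant names and the `max_calls_per_minute` helper are illustrative and not part of the skill.

```python
# Caps copied from the table above.
REQUESTS_PER_MIN = 100
INPUT_TOKENS_PER_MIN = 100_000
OUTPUT_TOKENS_PER_MIN = 10_000


def max_calls_per_minute(in_tokens: int, out_tokens: int) -> int:
    """Largest number of identical calls per minute that stays under all three caps.

    Hypothetical helper: assumes every call consumes exactly `in_tokens`
    input tokens and `out_tokens` output tokens.
    """
    if in_tokens <= 0 or out_tokens <= 0:
        raise ValueError("token counts must be positive")
    return min(
        REQUESTS_PER_MIN,
        INPUT_TOKENS_PER_MIN // in_tokens,
        OUTPUT_TOKENS_PER_MIN // out_tokens,
    )
```

For calls of 500 input and 200 output tokens, the output cap binds first: 10,000 // 200 = 50 calls per minute, even though the request cap would allow 100.
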
- Estimate token counts with `len(text) // 4` (rough heuristic).
- See `references/pruning-strategies.md` for detailed pruning patterns.
- `scripts/lru_cache.py` provides a drop-in LRU cache (256 items by default).
- Set `max_tokens` explicitly; never leave it open-ended.
- Use `temperature=0.7` to reduce verbose wandering.

`scripts/token_bucket.py` enforces per-minute caps using a sliding window:
```python
from scripts.token_bucket import TokenBucket

bucket = TokenBucket(req_per_min=100, in_tok_per_min=100_000, out_tok_per_min=10_000)
bucket.wait_for_slot(in_tokens=500, out_tokens=200)
# proceed with API call
```
`wait_for_slot` blocks until a slot is available. Call it before every Inception API call.
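
To make the sliding-window behaviour concrete, here is a minimal sketch of how such a limiter could work. It is not the code in `scripts/token_bucket.py`: the class name `SlidingWindowLimiter`, the `window`, `clock` and `sleep` parameters, and the choice to raise `ValueError` for an oversized request are all assumptions made for illustration. Only the three cap arguments and the `wait_for_slot(in_tokens, out_tokens)` call shape follow the usage shown above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical stand-in for TokenBucket; an illustration, not the skill's script."""

    def __init__(self, req_per_min, in_tok_per_min, out_tok_per_min,
                 window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.caps = (req_per_min, in_tok_per_min, out_tok_per_min)
        self.window = window
        self.clock = clock    # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.events = deque()  # (timestamp, in_tokens, out_tokens), oldest first

    def _usage(self, now):
        # Drop events that have aged out of the window, then total what remains.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()
        in_used = sum(e[1] for e in self.events)
        out_used = sum(e[2] for e in self.events)
        return len(self.events), in_used, out_used

    def wait_for_slot(self, in_tokens, out_tokens):
        need = (1, in_tokens, out_tokens)
        if any(n > c for n, c in zip(need, self.caps)):
            # Such a request could never fit, so waiting would block forever.
            raise ValueError("request exceeds a per-minute cap on its own")
        while True:
            now = self.clock()
            used = self._usage(now)
            if all(u + n <= c for u, n, c in zip(used, need, self.caps)):
                self.events.append((now, in_tokens, out_tokens))
                return
            # Not enough headroom: sleep until the oldest event leaves the window.
            self.sleep(self.events[0][0] + self.window - now)
```

A request is admitted only if the request count, input tokens and output tokens used in the last `window` seconds all stay within their caps after adding it. Otherwise the call sleeps until the oldest recorded request ages out and then re-checks.
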