Clawd Throttle

Routes LLM requests to the cheapest capable model across 8 providers (Anthropic, Google, OpenAI, DeepSeek, xAI, Moonshot, Mistral, Ollama) and 25+ models. Scores prompts on 8 dimensions in under 1ms, supports three routing modes (eco, standard, gigachad), and logs all decisions for cost tracking.

Audits

Pass

Install

openclaw skills install clawd-throttle

Route every LLM request to the cheapest model that can handle it. Stop paying Opus prices for "hello" and "summarize this."

Supports 8 providers and 25+ models: Anthropic (Claude), Google (Gemini), OpenAI (GPT / o-series), xAI (Grok), DeepSeek, Moonshot (Kimi), Mistral, and Ollama (local).

How It Works

  1. Your prompt arrives
  2. The classifier scores it on 8 dimensions (token count, code presence, reasoning markers, simplicity indicators, multi-step patterns, question count, system prompt complexity, conversation depth) in under 1 millisecond
  3. The router maps the resulting tier (simple / standard / complex) to a model based on your active mode and configured providers
  4. The request is proxied to the correct API
  5. The routing decision and cost are logged to a local JSONL file
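The scoring step can be sketched roughly as follows. The dimension names come from the list above, but the heuristics, weights, and thresholds here are invented for illustration; the real classifier's internals are not documented.

```typescript
// Hypothetical sketch of the 8-dimension scoring step.
// Dimensions mirror the docs; the weights and cutoffs are assumptions.
type Tier = "simple" | "standard" | "complex";

function classify(prompt: string, conversationDepth = 0): Tier {
  const tokens = prompt.split(/\s+/).length;                        // token count (rough)
  const hasCode = /function |def |class |=>|;/.test(prompt);        // code presence
  const reasoning = /\b(why|prove|explain|derive)\b/i.test(prompt); // reasoning markers
  const simple = /^(hi|hello|thanks|summarize)\b/i.test(prompt);    // simplicity indicators
  const multiStep = /\b(then|first|finally|step)\b/i.test(prompt);  // multi-step patterns
  const questions = (prompt.match(/\?/g) ?? []).length;             // question count

  let score = 0;
  if (tokens > 200) score += 2;
  if (hasCode) score += 2;
  if (reasoning) score += 2;
  if (multiStep) score += 1;
  if (questions > 2) score += 1;
  score += Math.min(conversationDepth, 2); // deep threads trend harder
  if (simple) score -= 3;

  if (score <= 1) return "simple";
  if (score <= 4) return "standard";
  return "complex";
}
```

Everything here is cheap string inspection, which is how a classifier like this can stay under a millisecond: no model call is needed to pick a model.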

Routing Modes

Mode       Simple          Standard        Complex
eco        Grok 4.1 Fast   Gemini Flash    Haiku
standard   Grok 4.1 Fast   Haiku           Sonnet
gigachad   Haiku           Sonnet          Opus 4.6

Each cell shows the first-choice model. The router tries a preference list and falls through to the next available provider if the first is not configured.
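The fall-through behavior can be sketched like this. The preference lists and provider names below are illustrative; the actual internal lists are not published.

```typescript
// Illustrative fall-through: try each preferred model in order until one
// belongs to a provider you have configured a key for.
const configured = new Set(["anthropic", "google"]); // e.g. only these keys are set

interface Choice { model: string; provider: string }

function pickModel(prefs: Choice[]): Choice {
  for (const c of prefs) {
    if (configured.has(c.provider)) return c;
  }
  throw new Error("no configured provider for any preferred model");
}

// eco mode, "simple" tier: Grok 4.1 Fast first, then fallbacks (assumed order)
const ecoSimple: Choice[] = [
  { model: "grok-4.1-fast", provider: "xai" },
  { model: "gemini-flash", provider: "google" },
  { model: "claude-haiku", provider: "anthropic" },
];
const chosen = pickModel(ecoSimple); // gemini-flash here: no xAI key, Google configured
```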

Available Commands

Command                 What It Does
route_request           Send a prompt and get a response from the cheapest capable model
classify_prompt         Analyze prompt complexity without making an LLM call
get_routing_stats       View cost savings and model distribution stats
get_config              View current configuration (keys redacted)
set_mode                Change routing mode at runtime
get_recent_routing_log  Inspect recent routing decisions

Overrides

  • Heartbeats and summaries always route to the cheapest model
  • Type /opus, /sonnet, /haiku, /flash, or /grok-fast to force a specific model
  • Sub-agent calls automatically step down one tier from their parent
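The override rules above can be sketched as a small pre-routing check. The slash commands match the docs, but the parsing, model identifiers, and tier arithmetic here are assumptions.

```typescript
// Hedged sketch of the override rules: forced models via slash commands,
// and sub-agents stepping down one tier. Model identifiers are illustrative.
const TIERS = ["simple", "standard", "complex"] as const;
type Tier = (typeof TIERS)[number];

const FORCED: Record<string, string> = {
  "/opus": "claude-opus",
  "/sonnet": "claude-sonnet",
  "/haiku": "claude-haiku",
  "/flash": "gemini-flash",
  "/grok-fast": "grok-4.1-fast",
};

function resolveOverride(prompt: string, tier: Tier, isSubAgent: boolean) {
  const cmd = prompt.trim().split(/\s+/)[0];
  if (cmd in FORCED) return { forcedModel: FORCED[cmd], tier };
  // Sub-agents step down one tier from their parent (floor at "simple").
  if (isSubAgent) {
    const idx = Math.max(0, TIERS.indexOf(tier) - 1);
    return { forcedModel: null, tier: TIERS[idx] };
  }
  return { forcedModel: null, tier };
}
```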

Setup

  1. Get at least one API key (Anthropic or Google required; others optional)
  2. Run the setup script:
    npm run setup
    
  3. Choose your routing mode (eco / standard / gigachad)
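After setup, the tool keeps a local config file. A hypothetical shape might look like the following; the field names are assumptions, and the placeholder key values are not real.

```json
{
  "mode": "standard",
  "providers": {
    "anthropic": { "apiKey": "sk-ant-..." },
    "google": { "apiKey": "..." }
  },
  "logFile": "~/.config/clawd-throttle/routing.jsonl"
}
```

Switching modes later does not require re-running setup; the set_mode command changes it at runtime.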

Privacy

  • Prompt content is never stored. Only a SHA-256 hash is logged.
  • All data stays local in ~/.config/clawd-throttle/
  • API keys are stored in your local config file
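The hash-only logging works because a SHA-256 digest is one-way: the prompt cannot be recovered from the log line. A minimal sketch of writing such a record (the field names are illustrative, not the tool's actual log schema):

```typescript
import { createHash } from "node:crypto";

// Sketch of privacy-preserving logging: only a SHA-256 digest of the prompt
// is serialized, never the prompt text itself. Record fields are assumptions.
function logEntry(prompt: string, model: string, costUsd: number): string {
  const promptSha256 = createHash("sha256").update(prompt).digest("hex");
  return JSON.stringify({
    ts: new Date().toISOString(),
    promptSha256, // content is unrecoverable from the hash
    model,
    costUsd,
  });
}
```

Each returned string is one line of the JSONL file, so stats tools can stream the log without ever touching prompt content.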