Sage Router

Data & APIs

Local-first AI model routing for serious agents. One endpoint. Any provider. The router figures out the rest.

Install

openclaw skills install sage-router

Sage Router

HTTP server on :8788 that routes chat requests to the optimal provider based on intent classification.

Endpoints

  • POST /v1/chat/completions — OpenAI-compatible; routes automatically
  • POST /v1/messages — Anthropic Messages API compatible; translates to/from OpenAI format internally
  • GET /health — Provider status, model lists, routing debug

Any Anthropic-compatible tool (Cursor, Aider, Claude Code, Zed, Continue, OpenHands) can point at http://localhost:8788 as the API base URL. Both streaming and non-streaming are supported.

Active Providers

Providers are discovered from ~/.openclaw/openclaw.json at startup.

Rules:

  • skips the router's own sage-router provider entry to avoid recursion
  • resolves ${ENV_VAR} values for baseUrl and apiKey
  • includes OpenClaw gateway openai-codex as a virtual provider when the auth profile exists
  • recognizes Google Gemini providers from generativelanguage.googleapis.com
  • auto-discovers Google models when the provider exists but models is empty in openclaw.json
  • normalizes anthropic or Anthropic-hosted anthropic-messages providers onto the local Dario proxy at localhost:3456
  • starts the Dario user service when Anthropic compatibility is needed and the service is not already running; in Docker, the image bundles @askalf/dario and autostarts dario proxy when credentials are mounted at /root/.dario
  • supports temporary provider suppression via SAGE_ROUTER_DISABLED_PROVIDERS=name1,name2

GET /health shows:

  • configured: all discovered providers
  • providers: reachable providers with model lists
  • disabled: providers suppressed by env

Routing Logic

The router does not perform mid-stream switching. Once a request is sent to a provider, the full response is returned or the attempt fails. If it fails, the next candidate in the chain is tried sequentially. There is no partial-output fallback or streaming handoff between providers.

Flow:

  • detect intent from the latest user message
  • estimate complexity from prompt length
  • score every reachable (provider, model) pair globally — not per-provider — from openclaw.json
  • in local-first, operate as local-strict: reject centralized Internet API providers and only allow local/LAN/Tailnet endpoints plus approved decentralized providers such as Darkbloom, with Ollama :cloud models excluded
  • for GENERAL, blend static heuristics with persisted empirical latency stats by provider and model
  • rank candidates by API type, model-name hints, complexity, and measured latency
  • attempt the top SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS candidates in order
  • sage-router provider (the router itself, model auto) is scored as a low-priority recursive fallback, never preferred

Intent scoring is generic, for example:

  • code and analysis strongly favor Anthropic/OpenAI-style reasoning models
  • general/realtime requests prefer fast direct providers first
  • general traffic learns from real successful request latency over time, with light exploration for cold providers/models
  • complex prompts boost larger reasoning models and penalize mini/haiku-class models

Intent is detected by keyword matching on the latest user message. Complexity is estimated by word count.

API

  • GET /health — JSON with reachable providers, configured providers, and disabled providers
  • POST /v1/chat/completions — OpenAI-compatible; routes automatically

Notes

  • openai-codex is kept as an optional bridge, not a required first hop.
  • Anthropic compatibility is provided through Dario, so anthropic can stay in openclaw.json while routing locally through dario.
  • The repo systemd unit is template-style and expects local machine values in ~/.config/sage-router/sage-router.env.
  • Empirical latency memory is persisted at ~/.cache/sage-router/latency-stats.json by default.
  • When the OpenClaw gateway model-set path is unhealthy, the helper falls back to running without provider/model overrides instead of failing hard.
  • If any provider starts misbehaving, suppress it with SAGE_ROUTER_DISABLED_PROVIDERS instead of editing the router.
  • For reliable Umbrel/OpenClaw/Remnic use, point clients at http://sage-router:8788/v1 on umbrel_main_network, set unauthenticated Ollama auto-pull patterns to empty, and keep quota-bound providers disabled until credentials are healthy.
  • GitHub workflows now include CI syntax checks and CodeQL analysis for Python + JavaScript.
  • See BRANCH_PROTECTION.md for the exact required-check setup on GitHub.
  • provider-profiles.json includes a grok-sso template for the OpenClaw xAI auth plugin's local SuperGrok-backed proxy.

Install

Install the user service from the repo copy:

mkdir -p ~/.config/systemd/user ~/.config/sage-router
cp systemd/sage-router.service ~/.config/systemd/user/sage-router.service
cp systemd/sage-router.env.example ~/.config/sage-router/sage-router.env
# edit ~/.config/sage-router/sage-router.env for your machine
systemctl --user daemon-reload
systemctl --user enable --now sage-router.service

Notes:

  • the repo unit is now env-driven and does not hardcode your home path, Node version, or workspace location
  • set SAGE_ROUTER_HOME to the actual repo path on your machine
  • optionally set SAGE_ROUTER_PATH_PREFIX if your Python, Node, or Dario bins are not already on PATH

If an Anthropic provider is detected and Dario is not installed yet, install Dario first:

Service

systemctl --user status sage-router
systemctl --user restart sage-router
journalctl --user -u sage-router -f   # live logs

Docker production notes

  • Docker image includes Node, Python, Sage Router, and @askalf/dario.
  • Mount host Dario credentials as ~/.dario:/root/.dario for Anthropic-compatible Claude routing.
  • Enable llama.cpp classifier sidecar with docker compose --profile classifier up -d and SAGE_ROUTER_INTENT_CLASSIFIER_ENABLED=1.
  • Production classifier flags: SAGE_ROUTER_INTENT_CLASSIFIER_PROVIDER=llamacpp, SAGE_ROUTER_INTENT_CLASSIFIER_BASE_URL=http://llamacpp-classifier:8080, SAGE_ROUTER_INTENT_CLASSIFIER_MODEL=classifier.

Router profiles

Sage Router supports named routing profiles in router-profiles.json next to router.py.

Request a profile with any of:

  • model: "sage-router/<profile>"
  • model: "<profile>"
  • top-level profile, routerProfile, or sageRouterProfile

Profile fields currently supported:

  • route: fast, balanced, best, local-first, realtime
  • thinking: low, medium, high
  • capability/quality flags: requiresQuality, requiresReasoning, requiresTools, frontierLargeOnly, frontierOrReasoningTools, suppressIntermediateToolText, qualitySensitive, reasoning, tools, preferTools, json, vision, document, longContext
  • constraints: allowProviders, denyProviders, allowModels, denyModels, minParamsB

Current profiles:

  • frontier: default high-quality frontier routing profile. Forces best/high, reasoning, quality-sensitive, suppresses tool-call narration, and blocks tiny/free filler models.
  • frontier-large: strict frontier-large-only routing.
  • fast-local: low-latency local-first routing.
  • coding-max: high-thinking code route with weak model exclusions.