Install
openclaw skills install clawhub-skill-2Configure, run, and troubleshoot the OpenRouter hardware-aware classifier router (wizard setup, local model, routing, and dashboard).
openclaw skills install clawhub-skill-2Xrouter is an open-source inference router that sits between OpenClaw and your LLM providers. It uses a fast, hardware-aware classifier to route each request to the most cost-effective model that can handle the task.
This project is MIT licensed. See the MIT License.
Core Features
POST /v1/chat/completions./dashboard.Workflow
flowchart TD
A["Client / OpenClaw request"] --> B["Router (OpenAI-compatible)"]
B --> C{"Classifier enabled?"}
C -->|No| F["Route to Frontier provider"]
C -->|Yes| D["Classifier (0 / 1 / 2)"]
D --> E{"Decision"}
E -->|0| G["Route to Cheap provider"]
E -->|1| M["Route to Medium provider"]
E -->|2| F
G --> H["Provider adapter (auto or explicit)"]
M --> H
F --> H
H --> I["Upstream API call"]
I --> J["Stream/Response back to client"]
Repository Layout
src/server.js: router and streaming proxy.src/classifier.js: classifier call and retry logic.src/config.js: configuration and env parsing.src/cache.js: Redis + LRU cache.src/token_tracker.js: token tracking.scripts/check_hw.js: hardware detection.scripts/configure_providers.js: interactive provider setup.Requirements
Quickstart
npm install
npm run configure
npm run dev
How To Use
Example local setup (Ollama):
ollama pull llama3.1
ollama run llama3.1
Run the wizard:
npm run configure
Start the router:
npm run dev
Test a request:
curl -i http://localhost:3000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"any","messages":[{"role":"user","content":"Fix this sentence: I has a apple."}]}'
Look for these headers:
X-Xrouter-decision: 0, 1, or 2.X-Xrouter-upstream: cheap, medium, or frontier.Open the dashboard:
http://localhost:3000/dashboardRaw usage JSON:
http://localhost:3000/usageProvider Selection (Terminal Wizard) Run:
npm run configure
The wizard:
upstreams.json and optionally updates .env.Quick Start Mode
Routing Behavior
0, 1, or 2 token returned decides the route.Compatibility
xrouter, openai_compatible, openai, anthropic, gemini, cohere, azure_openai, mistral, groq, together, perplexity) or auto.auto infers the provider adapter from the base URL or API key.openai_compatible adapter.Token Tracking Dashboard
GET /usage: returns cumulative token usage for cheap, medium, and frontier.GET /dashboard: UI that displays token split and totals.cheap when cheap uses the local model.Environment Summary
HOST: bind host, default 0.0.0.0.PORT: bind port, default 3000.ROUTER_API_KEY: require Authorization: Bearer <key>.LOG_LEVEL: log level (debug/info/warn/error).LOG_TO_FILE: set true to write logs to files.LOG_DIR: directory for log files (default ./logs).CLASSIFIER_ENABLED: set false to disable local classification.CLASSIFIER_BASE_URL: OpenAI-compatible classifier endpoint.CLASSIFIER_MODEL: classifier model name.CLASSIFIER_SYSTEM_PROMPT: classifier prompt (single line).CLASSIFIER_TIMEOUT_MS: classifier timeout.CLASSIFIER_FORCE_STREAM: force streaming classifier request.CLASSIFIER_WARMUP: warm the classifier on server start.CLASSIFIER_WARMUP_DELAY_MS: delay before warmup request (ms).CLASSIFIER_KEEP_ALIVE_MS: keep-alive interval for classifier warmup (ms).CLASSIFIER_LOADING_RETRY_MS: delay between retries when the model is loading.CLASSIFIER_LOADING_MAX_RETRIES: max retries when the model is loading.CHEAP_BASE_URL: optional, defaults to classifier base URL.CHEAP_API_KEY: cheap provider API key.CHEAP_MODEL: optional model override for cheap route.CHEAP_PROVIDER: provider type for cheap route (auto if empty).CHEAP_HEADERS: optional JSON headers for cheap provider (stringified object).CHEAP_DEPLOYMENT: Azure deployment override for cheap route.CHEAP_API_VERSION: Azure API version override for cheap route.MEDIUM_BASE_URL: required when classifier is enabled.MEDIUM_API_KEY: medium provider API key.MEDIUM_MODEL: optional model override for medium route.MEDIUM_PROVIDER: provider type for medium route (auto if empty).MEDIUM_HEADERS: optional JSON headers for medium provider (stringified object).MEDIUM_DEPLOYMENT: Azure deployment override for medium route.MEDIUM_API_VERSION: Azure API version override for medium route.FRONTIER_BASE_URL: OpenAI-compatible frontier endpoint.FRONTIER_API_KEY: frontier API key.FRONTIER_MODEL: optional model override for frontier route.FRONTIER_PROVIDER: provider type for frontier route (auto if empty).FRONTIER_HEADERS: optional JSON headers for frontier provider (stringified object).FRONTIER_DEPLOYMENT: Azure deployment override for frontier route.FRONTIER_API_VERSION: Azure API version override for frontier route.REDIS_URL: if set, enables Redis cache.Local Model Installation & Run Guides Ollama (best for Mac, easiest cross-platform)
ollama pull llama3.1ollama run llama3.1http://localhost:11434CLASSIFIER_BASE_URL=http://localhost:11434
CLASSIFIER_MODEL=llama3.1vLLM (NVIDIA GPU)
vllm serve NousResearch/Meta-Llama-3-8B-Instruct --dtype auto --api-key token-abc123http://localhost:8000CLASSIFIER_BASE_URL=http://localhost:8000
CLASSIFIER_MODEL=NousResearch/Meta-Llama-3-8B-InstructTensorRT-LLM (NVIDIA, max speed)
http://<host>:<port>CLASSIFIER_BASE_URL=http://<host>:<port>
CLASSIFIER_MODEL=<your model>llama.cpp (CPU/AMD fallback)
llama-server -m model.gguf --port 8080http://localhost:8080CLASSIFIER_BASE_URL=http://localhost:8080
CLASSIFIER_MODEL=<gguf model name>Docker Build and run the router with Redis:
docker compose -f deploy/docker-compose.yml up --build
Hardware Detection Run:
npm run check-hw
This prints the recommended engine:
tensorrt-llm for large NVIDIA GPUs.vllm for standard NVIDIA GPUs.mlx for Apple Silicon.llama.cpp for CPU/AMD fallback.Model List Fetching
/v1/models/v1/models/v1beta/models/v1/modelsscripts/cloud_model_catalog.json.Star History