# Intelligent Model Router
Intelligent model routing for sub-agent task delegation. Choose the optimal model based on task complexity, cost, and capability requirements. Reduces costs...
## Intelligent Router — Core Skill
**CORE SKILL:** This skill is infrastructure, not guidance. Installation = enforcement. Run `bash skills/intelligent-router/install.sh` to activate.
## What It Does
Automatically classifies any task into a tier (SIMPLE/MEDIUM/COMPLEX/REASONING/CRITICAL) and recommends the cheapest model that can handle it well.
The problem it solves: Without routing, every cron job and sub-agent defaults to Sonnet (expensive). With routing, monitoring tasks use free local models, saving 80-95% on cost.
## MANDATORY Protocol (enforced via AGENTS.md)
**Before spawning any sub-agent:**

```bash
python3 skills/intelligent-router/scripts/router.py classify "task description"
```

**Before creating any cron job:**

```bash
python3 skills/intelligent-router/scripts/spawn_helper.py "task description"
# Outputs the exact model ID and payload snippet to use
```

**To validate that a cron payload has a model set:**

```bash
python3 skills/intelligent-router/scripts/spawn_helper.py --validate '{"kind":"agentTurn","message":"..."}'
```
**❌ VIOLATION (never do this):**

```
# Cron job without model = Sonnet default = expensive waste
{"kind": "agentTurn", "message": "check server..."}  # ← WRONG
```

**✅ CORRECT:**

```
# Always specify the model from the router recommendation
{"kind": "agentTurn", "message": "check server...", "model": "anthropic-proxy-6/glm-4.7"}
```
## Tier System
| Tier | Use For | Primary Model | Cost |
|---|---|---|---|
| 🟢 SIMPLE | Monitoring, heartbeat, checks, summaries | anthropic-proxy-6/glm-4.7 (alt: proxy-4) | $0.50/M |
| 🟡 MEDIUM | Code fixes, patches, research, data analysis | nvidia-nim/meta/llama-3.3-70b-instruct | $0.40/M |
| 🟠 COMPLEX | Features, architecture, multi-file, debug | anthropic/claude-sonnet-4-6 | $3/M |
| 🔵 REASONING | Proofs, formal logic, deep analysis | nvidia-nim/moonshotai/kimi-k2-thinking | $1/M |
| 🔴 CRITICAL | Security, production, high-stakes | anthropic/claude-opus-4-6 | $5/M |
**SIMPLE fallback chain:** `anthropic-proxy-4/glm-4.7` → `nvidia-nim/qwen/qwen2.5-7b-instruct` ($0.15/M)
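The tier table and fallback chain amount to an ordered lookup: try the primary, then walk the chain. A minimal sketch using the model IDs above (illustration only, not the actual `router.py` logic):

```python
# Sketch: each tier maps to an ordered candidate list; first usable model wins.
# Availability is passed in here; the real router probes providers itself.
TIER_MODELS = {
    "SIMPLE": [
        "anthropic-proxy-6/glm-4.7",            # primary, $0.50/M
        "anthropic-proxy-4/glm-4.7",            # alternate proxy key
        "nvidia-nim/qwen/qwen2.5-7b-instruct",  # $0.15/M fallback
    ],
    "MEDIUM": ["nvidia-nim/meta/llama-3.3-70b-instruct"],
    "COMPLEX": ["anthropic/claude-sonnet-4-6"],
    "REASONING": ["nvidia-nim/moonshotai/kimi-k2-thinking"],
    "CRITICAL": ["anthropic/claude-opus-4-6"],
}

def pick_model(tier: str, unavailable: set = frozenset()) -> str:
    """Return the first model in the tier's fallback chain not marked unavailable."""
    for model in TIER_MODELS[tier]:
        if model not in unavailable:
            return model
    raise RuntimeError(f"No available model for tier {tier}")
```

With the SIMPLE primary knocked out, the chain falls through to the alternate proxy key before dropping to the cheap Qwen fallback.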
⚠️ **`ollama-gpu-server` is BLOCKED for cron/spawn use.** Ollama binds to `127.0.0.1` by default — unreachable over LAN from the OpenClaw host. The `router_policy.py` enforcer will reject any payload referencing it.
Tier classification uses 4 capability signals (not cost alone):

- `effective_params` (50%) — extracted from the model ID, or from `known-model-params.json` for closed-source models
- `context_window` (20%) — larger = more capable
- `cost_input` (20%) — price as a quality proxy (weak signal, last resort for unknown sizes)
- `reasoning_flag` (10%) — bonus for dedicated thinking specialists (R1, QwQ, Kimi-K2)
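A rough sketch of how those four signals might be blended. The weights are the ones listed above; the normalisation constants are assumptions for illustration, not values from `tier_classifier.py`:

```python
def capability_score(effective_params_b: float, context_window: int,
                     cost_input_per_m: float, reasoning_flag: bool) -> float:
    """Blend the four capability signals with the documented 50/20/20/10 weights.

    Normalisation caps (1T params, 1M-token context, $10/M input) are
    illustrative assumptions chosen so each signal lands in [0, 1].
    """
    params_norm = min(effective_params_b / 1000.0, 1.0)
    context_norm = min(context_window / 1_000_000, 1.0)
    cost_norm = min(cost_input_per_m / 10.0, 1.0)  # weak price-as-quality proxy
    return (0.50 * params_norm
            + 0.20 * context_norm
            + 0.20 * cost_norm
            + 0.10 * (1.0 if reasoning_flag else 0.0))
```

A 70B model with a 128k context and $0.40/M input scores low on this scale, which is what pushes it into the cheaper tiers.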
## Policy Enforcer (NEW in v3.2.0)
`router_policy.py` catches bad model assignments before they are created, not after they fail.
**Validate a cron payload before submitting:**

```bash
python3 skills/intelligent-router/scripts/router_policy.py check \
  '{"kind":"agentTurn","model":"ollama-gpu-server/glm-4.7-flash","message":"check server"}'
# Output: VIOLATION: Blocked model 'ollama-gpu-server/glm-4.7-flash'. Recommended: anthropic-proxy-6/glm-4.7
```

**Get an enforced model recommendation for a task:**

```bash
python3 skills/intelligent-router/scripts/router_policy.py recommend "monitor alphastrike service"
# Output: Tier: SIMPLE  Model: anthropic-proxy-6/glm-4.7

python3 skills/intelligent-router/scripts/router_policy.py recommend "monitor alphastrike service" --alt
# Output: Tier: SIMPLE  Model: anthropic-proxy-4/glm-4.7  ← alternate key for load distribution
```

**Audit all existing cron jobs:**

```bash
python3 skills/intelligent-router/scripts/router_policy.py audit
# Scans all crons, reports any with blocked or missing models
```

**Show the blocklist:**

```bash
python3 skills/intelligent-router/scripts/router_policy.py blocklist
```
**Policy rules enforced:**

- **Model must be set** — no `model` field = Sonnet default = expensive waste
- **No blocked models** — `ollama-gpu-server/*` and bare `ollama/*` are rejected for cron use
- **CRITICAL tasks** — warns if a non-Opus model is used for classified-critical work
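The first two rules reduce to a simple payload check. A minimal sketch of that logic — not the actual `router_policy.py` implementation, and the violation strings here are illustrative:

```python
import json

# From the policy blocklist: these prefixes are rejected for cron use.
BLOCKED_PREFIXES = ("ollama-gpu-server/", "ollama/")

def check_payload(raw: str) -> list:
    """Return a list of policy violations for a cron payload JSON string."""
    violations = []
    payload = json.loads(raw)
    model = payload.get("model")
    if not model:
        violations.append("missing 'model' field (would default to Sonnet)")
    elif model.startswith(BLOCKED_PREFIXES):
        violations.append(f"blocked model '{model}'")
    return violations
```

An empty return means the payload passes both rules; the real enforcer additionally warns on non-Opus models for CRITICAL-classified tasks.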
## Installation (Core Skill Setup)
Run once to self-integrate into AGENTS.md:

```bash
bash skills/intelligent-router/install.sh
```

This patches AGENTS.md with the mandatory protocol so it's always in context.
## CLI Reference
```bash
# ── Policy enforcer (run before creating any cron/spawn) ──
python3 skills/intelligent-router/scripts/router_policy.py check '{"kind":"agentTurn","model":"...","message":"..."}'
python3 skills/intelligent-router/scripts/router_policy.py recommend "task description"
python3 skills/intelligent-router/scripts/router_policy.py recommend "task" --alt  # alternate proxy key
python3 skills/intelligent-router/scripts/router_policy.py audit                   # scan all crons
python3 skills/intelligent-router/scripts/router_policy.py blocklist

# ── Core router ──
# Classify + recommend model
python3 skills/intelligent-router/scripts/router.py classify "task"

# Get model id only (for scripting)
python3 skills/intelligent-router/scripts/spawn_helper.py --model-only "task"

# Show spawn command
python3 skills/intelligent-router/scripts/spawn_helper.py "task"

# Validate cron payload has model set
python3 skills/intelligent-router/scripts/spawn_helper.py --validate '{"kind":"agentTurn","message":"..."}'

# List all models by tier
python3 skills/intelligent-router/scripts/router.py models

# Detailed scoring breakdown
python3 skills/intelligent-router/scripts/router.py score "task"

# Config health check
python3 skills/intelligent-router/scripts/router.py health

# Auto-discover working models (NEW)
python3 skills/intelligent-router/scripts/discover_models.py

# Auto-discover + update config
python3 skills/intelligent-router/scripts/discover_models.py --auto-update

# Test specific tier only
python3 skills/intelligent-router/scripts/discover_models.py --tier COMPLEX
```
## Scoring System
15-dimension weighted scoring (not just keywords):
- Reasoning markers (0.18) — prove, theorem, derive
- Code presence (0.15) — code blocks, file extensions
- Multi-step patterns (0.12) — first...then, numbered lists
- Agentic task (0.10) — run, fix, deploy, build
- Technical terms (0.10) — architecture, security, protocol
- Token count (0.08) — complexity from length
- Creative markers (0.05) — story, compose, brainstorm
- Question complexity (0.05) — multiple who/what/how
- Constraint count (0.04) — must, require, exactly
- Imperative verbs (0.03) — analyze, evaluate, audit
- Output format (0.03) — json, table, markdown
- Simple indicators (0.02) — check, get, show (inverted)
- Domain specificity (0.02) — acronyms, dotted notation
- Reference complexity (0.02) — "mentioned above"
- Negation complexity (0.01) — not, never, except
Confidence: 1 / (1 + exp(-8 × (score - 0.5)))
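In code, that confidence is just a logistic squash of the weighted score, centered at 0.5 with steepness 8:

```python
import math

def confidence(score: float) -> float:
    """Logistic confidence from the weighted score: 1 / (1 + exp(-8 * (score - 0.5)))."""
    return 1.0 / (1.0 + math.exp(-8.0 * (score - 0.5)))
```

A score right at the 0.5 midpoint yields 0.5 confidence; scores near 0 or 1 saturate quickly toward certainty, so borderline classifications are flagged as low-confidence.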
## Config
Models are defined in `config.json`. Add new models there and the router picks them up automatically.
Local Ollama models have zero cost — prefer them for interactive SIMPLE tasks, but note that `ollama-gpu-server/*` and bare `ollama/*` are blocked for cron/spawn use by the policy enforcer.
## Auto-Discovery (Self-Healing)
The intelligent-router can automatically discover working models from all configured providers via real live inference tests (not config-existence checks).
### How It Works
1. **Provider scanning** — reads `~/.openclaw/openclaw.json` → finds all models
2. **Live inference test** — sends `"hi"` to each model and checks that it actually responds (catches auth failures, quota exhaustion, 404s, timeouts)
3. **OAuth bypass** — providers with `sk-ant-oat01-*` tokens (Anthropic OAuth) are skipped in raw HTTP; OpenClaw refreshes these transparently, so they're always marked available
4. **Thinking model support** — models that return `content=None` + `reasoning_content` (GLM-4.7, Kimi-K2, Qwen3-thinking) are correctly detected as available
5. **Auto-classification** — tiers assigned via `tier_classifier.py` using the 4 capability signals
6. **Config update** — removes unavailable models, rebuilds tier primaries from the working set
7. **Cron** — hourly refresh (cron id: `a8992c1f`) keeps the model list current and alerts if availability changes by more than 2
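The thinking-model case (step 4) is the subtle one: a probe must not mark a model unavailable just because `content` is `None`. A minimal sketch of that detection, assuming an OpenAI-compatible chat-completion response shape (this is an illustration, not the `discover_models.py` source):

```python
def is_model_available(response: dict) -> bool:
    """Treat a model as available if the probe returned any usable text,
    including thinking models that put output in 'reasoning_content'
    while leaving 'content' as None."""
    try:
        msg = response["choices"][0]["message"]
    except (KeyError, IndexError, TypeError):
        return False  # malformed body: auth failure, 404 error JSON, etc.
    return bool(msg.get("content") or msg.get("reasoning_content"))
```

A plain error payload (no `choices`) fails the lookup and is counted as unavailable, which is exactly what catches expired keys and quota exhaustion.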
### Usage
```bash
# One-time discovery
python3 skills/intelligent-router/scripts/discover_models.py

# Auto-update config with working models only
python3 skills/intelligent-router/scripts/discover_models.py --auto-update

# Set up hourly refresh cron (bare ollama/* models are blocked for cron use,
# so the SIMPLE-tier primary is used here)
openclaw cron add --job '{
  "name": "Model Discovery Refresh",
  "schedule": {"kind": "every", "everyMs": 3600000},
  "payload": {
    "kind": "systemEvent",
    "text": "Run: bash skills/intelligent-router/scripts/auto_refresh_models.sh",
    "model": "anthropic-proxy-6/glm-4.7"
  }
}'
```
### Benefits
- ✅ **Self-healing** — automatically removes broken models (e.g., expired OAuth)
- ✅ **Zero maintenance** — no manual model list updates
- ✅ **New models** — auto-adds newly released models
- ✅ **Cost optimization** — always uses the cheapest working model per tier
### Discovery Output
Results are saved to `skills/intelligent-router/discovered-models.json`:

```json
{
  "scan_timestamp": "2026-02-19T21:00:00",
  "total_models": 25,
  "available_models": 23,
  "unavailable_models": 2,
  "providers": {
    "anthropic": {
      "available": 2,
      "unavailable": 0,
      "models": [...]
    }
  }
}
```
### Pinning Models
To preserve a model even if it fails discovery, set `"pinned": true` in its config entry:

```jsonc
{
  "id": "special-model",
  "tier": "COMPLEX",
  "pinned": true  // never removed during auto-update
}
```
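Pruning during auto-update then just has to respect that flag. A minimal sketch, assuming config entries shaped like the example above:

```python
def prune_models(models: list, available_ids: set) -> list:
    """Drop models that failed discovery, but always keep pinned entries.

    `models` is a list of config dicts ({"id": ..., "pinned": ...});
    `available_ids` is the set of model IDs that passed the live probe.
    """
    return [m for m in models
            if m["id"] in available_ids or m.get("pinned")]
```

A pinned model survives pruning even when its live probe fails, which is the point: it protects special-case models from transient provider outages.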
## ⚠️ Known Gap — Proactive Health-Based Routing (2026-03-04)
The current router is reactive, not proactive:

- Fallback only fires AFTER a 429 is received
- No awareness of concurrent sessions on the same proxy
- No cooldown tracking after rate-limit events

Needed improvements:

- Track the last-429 timestamp per provider → skip the provider while it is within the cooldown window
- Track active concurrent spawns per provider → if more than 1 is active, route to OAuth
- Before spawning N parallel agents, check whether a single provider can handle N concurrent requests
- Expose a `router.get_best_available(n_concurrent=2)` API
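A minimal sketch of the proposed tracking — class and method names here are hypothetical, not an existing API:

```python
import time

class ProviderHealth:
    """Proposed proactive state: per-provider 429 cooldown + concurrent-spawn count."""

    def __init__(self, cooldown_s: float = 60.0):
        self.cooldown_s = cooldown_s
        self.last_429 = {}   # provider -> timestamp of last 429
        self.active = {}     # provider -> number of active concurrent spawns

    def record_429(self, provider: str, now: float = None) -> None:
        """Remember when a provider last rate-limited us."""
        self.last_429[provider] = time.time() if now is None else now

    def is_usable(self, provider: str, now: float = None) -> bool:
        """Usable = not in its 429 cooldown window AND at most 1 active spawn."""
        now = time.time() if now is None else now
        t = self.last_429.get(provider)
        in_cooldown = t is not None and (now - t) < self.cooldown_s
        return not in_cooldown and self.active.get(provider, 0) <= 1
```

A `get_best_available(n_concurrent=...)` wrapper could then filter the tier's fallback chain through `is_usable` before spawning, instead of waiting for the 429 to arrive.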