Install
openclaw skills install @fanyadan/super-router

OpenClaw skill for LangGraph-based task routing between PRO and FLASH models. Use it when a task should be decomposed into atomic subtasks, when multi-entity work needs parallel fanout, or when you want structured complexity scoring with FLASH->PRO escalation instead of choosing a single model manually.

Intelligent task decomposition and model routing using LangGraph StateGraph. Automatically routes subtasks between PRO (heavy reasoning) and FLASH (fast) models based on structured complexity assessment.
This package is intended to live in an OpenClaw skill directory such as ~/.openclaw/skills/super-router.
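Use super-router when you need atomic task decomposition, parallel fanout for multi-entity work, or structured complexity scoring with FLASH->PRO escalation.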
Not needed for: Simple single-turn tasks, tasks where you already know which model to use, or when you want manual control over every routing decision.
To achieve true parallel execution (when ROUTER_MAX_CONCURRENCY > 1), the Planner must be instructed to use Atomic Decomposition.
Check that the planned_subtasks count matches the entity count. If the planner groups entities, treat it as a capability failure and force a retry with a correction prompt.

| Node | Function |
|---|---|
| Planner | Decomposes original task into a JSON array of atomic, actionable subtasks. Uses Atomic Decomposition to split multi-entity tasks (e.g., 10 providers -> 10 subtasks) for maximum parallelism. |
| Judge | Scores each subtask on 5 dimensions: reasoning_depth, code_change_scope, ambiguity, risk, io_heaviness; combines with thresholds + confidence to decide PRO/FLASH |
| Executor Fanout | Uses LangGraph Send(...) to dispatch independent subtasks concurrently, then joins ordered results by original step number |
| PRO Executor Branch | Heavy reasoning model (default: Gemini CLI preview model; override via ROUTER_PRO_MODEL) |
| FLASH Executor Branch | Fast model with review/retry logic (default: Gemini CLI preview model; override via ROUTER_FLASH_MODEL) |
| FLASH Review | Validates output quality; distinguishes infra failures (timeout, network) from capability failures; retries FLASH or escalates to PRO |
| Metadata Extractor | Extracts 'Technical Gold' (atomic high-precision facts) from step output to prevent finalizer timeouts and loss of detail |
| Recorder/Finalizer | Logs every step; compiles final report using a hybrid of Technical Gold and full audit trails; supports FLASH->PRO->deterministic fallback chain |
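
The Atomic Decomposition rule stated before the table (one subtask per entity, with grouped plans treated as a capability failure) could be checked roughly as follows. This is a minimal sketch, not code from scripts/router.py: the helper name `check_atomic_plan` and the wording of the correction prompt are hypothetical, and it assumes the planner returns a JSON array like the `planned_subtasks` field.

```python
import json


def check_atomic_plan(planner_output: str, entities: list[str]) -> tuple[bool, str]:
    """Return (ok, correction_prompt). Hypothetical helper, not part of router.py."""
    try:
        subtasks = json.loads(planner_output)
    except json.JSONDecodeError:
        return False, "Return only a JSON array of subtasks."
    if not isinstance(subtasks, list):
        return False, "Return only a JSON array of subtasks."
    if len(subtasks) != len(entities):
        # Grouped entities count as a capability failure: ask the planner to retry.
        return False, (
            f"You produced {len(subtasks)} subtasks for {len(entities)} entities. "
            "Emit exactly one subtask per entity; do not group entities."
        )
    return True, ""
```

A caller would resend the task together with the returned correction prompt whenever `ok` is false.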
pip install langgraph
Keep the repository in an OpenClaw-accessible skill directory such as ~/.openclaw/skills/super-router.
If you use Ollama-backed roles, ensure Ollama is running locally and pull the models you want to use:
ollama serve
# Pull recommended models if you use Ollama-backed roles
ollama pull gemma4:26b # Planner or PRO executor (high quality, slow)
ollama pull llama3.1:8b # Judge (fast scoring, recommended)
ollama pull qwen3 # PRO executor
ollama pull qwen2.5:7b # FLASH executor
Note: If you prefer gemma4:26b as the Planner, keep it there. For speed, the Judge should usually be llama3.1:8b or another 7B-14B model:
export ROUTER_PLANNER_MODEL=gemma4:26b
export ROUTER_JUDGE_MODEL=llama3.1:8b
export ROUTER_PRO_MODEL=gemma4:26b
export ROUTER_FLASH_MODEL=qwen2.5:7b
If you intentionally want an all-gemma4:26b Planner/Judge/PRO setup, use longer timeouts and serialized graph execution:
export ROUTER_PLANNER_MODEL=gemma4:26b
export ROUTER_JUDGE_MODEL=gemma4:26b
export ROUTER_PRO_MODEL=gemma4:26b
export ROUTER_FLASH_MODEL=qwen2.5:7b
export ROUTER_JUDGE_TIMEOUT=600
export ROUTER_MAX_CONCURRENCY=1
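
To illustrate what ROUTER_MAX_CONCURRENCY controls, the sketch below bounds parallel work with a thread pool and then joins results in step order. It is a stand-in, not the router's implementation (which dispatches through LangGraph); `run_subtask` is a hypothetical placeholder for a model call, and falling back to 1 when the variable is unset is an assumption of this sketch.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_subtask(step: int, desc: str) -> dict:
    """Hypothetical placeholder for a model call."""
    return {"step": step, "desc": desc, "output": f"done: {desc}"}


def run_all(descs: list[str]) -> list[dict]:
    # ROUTER_MAX_CONCURRENCY=1 serializes execution; higher values allow overlap.
    workers = max(1, int(os.environ.get("ROUTER_MAX_CONCURRENCY", "1")))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_subtask, i, d) for i, d in enumerate(descs, start=1)]
        results = [f.result() for f in futures]
    # Join results by original step number, regardless of completion order.
    return sorted(results, key=lambda r: r["step"])
```

With a single local 26B model, one worker avoids several large requests competing for the same hardware.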
When the user says "走 super-router", "use super-router", or asks for router analysis, invoke the script from the OpenClaw skill checkout. Do not assume shell startup files have already exported ROUTER_* overrides; pass them inline or through your shell tool's environment support.
bash workdir:~/.openclaw/skills/super-router command:"ROUTER_PLANNER_MODEL=google-gemini-cli/gemini-3-pro-preview ROUTER_JUDGE_MODEL=gemma4:26b ROUTER_JUDGE_TIMEOUT=600 /opt/homebrew/Caskroom/miniforge/base/bin/python scripts/router.py '分析 K8s YAML 错误并重写配置'"
bash workdir:~/.openclaw/skills/super-router background:true command:"/opt/homebrew/Caskroom/miniforge/base/bin/python scripts/router.py --stream 'Your complex task'"
For agents that struggle with non-ASCII arguments:
# Normalize task to short ASCII English, then pass as argument
bash workdir:~/.openclaw/skills/super-router command:"/opt/homebrew/Caskroom/miniforge/base/bin/python scripts/router.py 'Analyze K8s YAML errors and fix'"
# Or via env var
bash workdir:~/.openclaw/skills/super-router command:"ROUTER_TASK='Your complex task description' /opt/homebrew/Caskroom/miniforge/base/bin/python scripts/router.py"
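
When driving the script from Python rather than from the OpenClaw shell tool, the same advice applies: pass ROUTER_* overrides explicitly instead of relying on shell startup files. The sketch below assumes the checkout path and the ROUTER_TASK variable shown above; the helper names `build_router_call` and `run_router` are hypothetical, and it uses the current interpreter rather than the miniforge path.

```python
import os
import subprocess
import sys
from pathlib import Path

# Skill checkout location taken from this document.
SKILL_DIR = Path.home() / ".openclaw" / "skills" / "super-router"


def build_router_call(task: str, overrides: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Build argv and an explicit environment for one router run (hypothetical helper)."""
    env = dict(os.environ)
    env.update(overrides)      # ROUTER_* overrides are passed explicitly
    env["ROUTER_TASK"] = task  # sidesteps non-ASCII argument handling
    argv = [sys.executable, "scripts/router.py"]
    return argv, env


def run_router(task: str, overrides: dict[str, str]) -> subprocess.CompletedProcess:
    """Run the router from the skill directory and raise on a non-zero exit."""
    argv, env = build_router_call(task, overrides)
    return subprocess.run(argv, cwd=SKILL_DIR, env=env, check=True)
```

For example, `run_router("Analyze K8s YAML errors and fix", {"ROUTER_JUDGE_MODEL": "llama3.1:8b"})` mirrors the env-var form of the command above.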
For long-running jobs, use OpenClaw background execution and inspect the session until it completes:
process action:poll sessionId:<session-id>
process action:log sessionId:<session-id>
When the run completes, summarize the actual route taken, whether the planner or judge fell back, whether FLASH escalated to PRO, and the final report's recommended next action.
| Variable | Purpose | Default |
|---|---|---|
| ROUTER_PLANNER_MODEL | Task decomposition model | gemma4:26b |
| ROUTER_JUDGE_MODEL | Complexity scoring model | llama3.1:8b |
| ROUTER_PRO_MODEL | Heavy reasoning executor | google-gemini-cli/gemini-3-pro-preview |
| ROUTER_FLASH_MODEL | Fast executor | google-gemini-cli/flash |
| ROUTER_PRO_FALLBACK_MODELS | Comma-separated PRO fallback list | None |
| ROUTER_FLASH_FALLBACK_MODELS | Comma-separated FLASH fallback list | None |
| ROUTER_FLASH_RETRY_BUDGET | Max FLASH retries before escalation | 1 |
| ROUTER_RECURSION_LIMIT | Python recursion limit | 128 |
| ROUTER_JUDGE_TIMEOUT | Timeout for Judge node LLM calls (seconds) | 300 (up to 6000 for extremely complex tasks with large models) |
| ROUTER_MAX_CONCURRENCY | Max concurrent LangGraph branches for judge and executor fanout. Essential for multi-entity atomic tasks; set to 1 for local 26B+ Judge models or constrained hardware. | Auto (1 for large Judge models) |
| ROUTER_GEMINI_CLI | Path to Gemini CLI (if using instead of Ollama) | /opt/homebrew/bin/gemini |
| ROUTER_OLLAMA_URL | Ollama API endpoint | http://localhost:11434/api/generate |
| ROUTER_FINALIZER_TIMEOUT | Timeout for the final reporting synthesis (seconds). Set it high for complex tasks to avoid timeouts during context assembly. | 6000 |
| ROUTER_DEBUG | Print raw planner/judge/Ollama diagnostic snippets | Off |
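
As an illustration of how these variables and defaults fit together, here is a small loader for a subset of them. It is a sketch under assumptions: `RouterConfig` and `load_config` are hypothetical names, `None` stands in for the "Auto" concurrency default, and the set of values treated as truthy for ROUTER_DEBUG is a guess rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RouterConfig:
    planner_model: str
    judge_model: str
    flash_retry_budget: int
    judge_timeout: int               # seconds
    max_concurrency: Optional[int]   # None means "auto"
    debug: bool


def load_config(env=None) -> RouterConfig:
    """Read ROUTER_* settings, falling back to the defaults in the table above."""
    env = os.environ if env is None else env
    raw_conc = env.get("ROUTER_MAX_CONCURRENCY", "").strip()
    return RouterConfig(
        planner_model=env.get("ROUTER_PLANNER_MODEL", "gemma4:26b"),
        judge_model=env.get("ROUTER_JUDGE_MODEL", "llama3.1:8b"),
        flash_retry_budget=int(env.get("ROUTER_FLASH_RETRY_BUDGET", "1")),
        judge_timeout=int(env.get("ROUTER_JUDGE_TIMEOUT", "300")),
        max_concurrency=int(raw_conc) if raw_conc else None,
        # Assumed truthy spellings; the real script may accept others.
        debug=env.get("ROUTER_DEBUG", "").lower() in {"1", "true", "yes", "on"},
    )
```

Passing a plain dict as `env` makes it easy to preview the effective settings for a given set of overrides without touching the process environment.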
For large models (20B+ like gemma4:26b):
- Prefer ROUTER_PLANNER_MODEL=gemma4:26b with ROUTER_JUDGE_MODEL=llama3.1:8b.
- If you use ROUTER_JUDGE_MODEL=gemma4:26b, set ROUTER_JUDGE_TIMEOUT=600 and keep ROUTER_MAX_CONCURRENCY=1.
- Use --stream, background execution, and session polling/log inspection for large Planner/Judge runs.

The Judge scores each subtask on five dimensions: reasoning_depth, code_change_scope, ambiguity, risk, and io_heaviness.
complexity_score is the sum of reasoning_depth + code_change_scope + ambiguity + risk. io_heaviness influences routing but does not add to that score directly.
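
The sum can be written out directly. This sketch uses the score values from the example output later in this document; the function name is illustrative.

```python
SCORED_DIMENSIONS = ("reasoning_depth", "code_change_scope", "ambiguity", "risk")


def complexity_score(scores: dict[str, int]) -> int:
    """Sum the four scored dimensions; io_heaviness is deliberately excluded."""
    return sum(scores[name] for name in SCORED_DIMENSIONS)


example = {"reasoning_depth": 2, "code_change_scope": 1, "ambiguity": 1, "risk": 1, "io_heaviness": 0}
print(complexity_score(example))  # 5
```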
| Condition | Route |
|---|---|
| complexity_score >= 5 | PRO |
| complexity_score <= 2 | FLASH |
| Summary-like task (no deep work) | FLASH |
| High-risk incident diagnosis | PRO |
| High-risk evidence gathering | PRO |
| High-risk decision/rollback evaluation | PRO |
| Boundary case + low confidence | PRO (safe default) |
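
One possible reading of the routing table as code is shown below. Several details are assumptions of this sketch rather than documented behavior: the precedence between rows, the 0.6 confidence cutoff, and following the Judge's suggested route for confident boundary cases. The `high_risk` and `summary_like` flags are taken as inputs because the document does not say how they are detected.

```python
def decide_route(score: int, confidence: float, suggested: str,
                 high_risk: bool = False, summary_like: bool = False,
                 low_confidence_below: float = 0.6) -> str:
    """Hypothetical reading of the routing table; precedence order is assumed."""
    if high_risk:
        return "PRO"
    if summary_like:
        return "FLASH"
    if score >= 5:
        return "PRO"
    if score <= 2:
        return "FLASH"
    # Boundary case (score 3 or 4): default to PRO when the Judge is unsure.
    if confidence < low_confidence_below:
        return "PRO"
    return suggested
```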
The router applies automatic adjustments:
- reasoning_depth, risk, ambiguity

When FLASH execution fails or produces questionable output:

1. Classify failure type:
   - infra_transient: timeout, network, rate limit, service unavailable
   - capability_quality: "need more info", empty output, too short, repeated task
2. Decision:
   - Retry FLASH or escalate to PRO (retries bounded by ROUTER_FLASH_RETRY_BUDGET)
3. Post-execution verification:
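
The failure classification and retry-or-escalate decision listed above might look like the following. This is an illustrative sketch only: the marker strings, the 40-character minimum, and the action labels `retry_flash` and `escalate_pro` are assumptions, and it treats both failure types the same way when deciding whether to retry.

```python
INFRA_MARKERS = ("timeout", "timed out", "network", "rate limit", "service unavailable")


def classify_failure(output: str, error: str = "", task_desc: str = "", min_chars: int = 40) -> str:
    """Return 'infra_transient', 'capability_quality', or 'none' (illustrative heuristics)."""
    err = error.lower()
    if any(marker in err for marker in INFRA_MARKERS):
        return "infra_transient"
    text = output.strip()
    if (
        len(text) < min_chars                          # empty or too short
        or "need more info" in text.lower()
        or (task_desc and text == task_desc.strip())   # model just repeated the task
    ):
        return "capability_quality"
    return "none"


def next_action(failure_type: str, retries_used: int, retry_budget: int = 1) -> str:
    """Retry FLASH while budget remains, otherwise escalate to PRO."""
    if failure_type == "none":
        return "record"
    if retries_used < retry_budget:
        return "retry_flash"
    return "escalate_pro"
```

The default `retry_budget=1` matches the documented default of ROUTER_FLASH_RETRY_BUDGET.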
Final report generation follows:
FLASH finalizer -> (if fails) -> PRO finalizer -> (if fails) -> Deterministic template
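
The fallback chain above can be sketched as a loop over model-backed finalizers followed by a template that needs no model. The two finalizer callables are placeholders supplied by the caller, and the deterministic template shown here (one line per step) is an assumption, not the router's actual format.

```python
from typing import Callable, Optional

Finalizer = Callable[[list[dict]], Optional[str]]


def finalize(results: list[dict], flash_finalizer: Finalizer, pro_finalizer: Finalizer) -> dict:
    """Try FLASH, then PRO, then a deterministic template."""
    for route, fn in (("FLASH", flash_finalizer), ("PRO", pro_finalizer)):
        try:
            report = fn(results)
        except Exception:
            report = None  # treat any error as a failed attempt
        if report:
            return {"route": route, "final_report": report}
    # Deterministic fallback: no model call, just list each step's output.
    lines = [f"Step {r['step']}: {r['output']}" for r in results]
    return {"route": "DETERMINISTIC", "final_report": "\n".join(lines)}
```

Because the last stage cannot fail on a model error, a run always ends with some report, even when both models are unavailable.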
{
  "task": "original task string",
  "planner_model": "model name used for planning",
  "judge_model": "model name used for complexity scoring",
  "pro_model": "primary PRO model",
  "flash_model": "primary FLASH model",
  "planned_subtasks": [{"desc": "..."}],
  "subtasks": [
    {
      "desc": "...",
      "model": "PRO|FLASH",
      "assessment": {
        "scores": {"reasoning_depth": 2, "code_change_scope": 1, "ambiguity": 1, "risk": 1, "io_heaviness": 0},
        "complexity_score": 5,
        "suggested_route": "PRO",
        "final_route": "PRO",
        "confidence": 0.85,
        "reason": "...",
        "judge_source": "llm|heuristic"
      }
    }
  ],
  "results": [
    {
      "step": 1,
      "planned_route": "PRO",
      "route": "PRO",
      "model_name": "qwen3",
      "desc": "...",
      "output": "...",
      "status": "executed|executed_via_provider_fallback|flash_retry_exhausted|executor_fallback",
      "attempt_count": 1,
      "retry_count": 0,
      "escalated_from_flash": false,
      "used_provider_fallback": false,
      "flash_review": {"decision": "record", "failure_type": "none", "reason": "..."},
      "attempt_log": ["..."]
    }
  ],
  "final_report": "...",
  "finalizer_outcome": {
    "route": "FLASH|PRO|DETERMINISTIC",
    "model_name": "...",
    "status": "...",
    "used_provider_fallback": false,
    "reason": "...",
    "attempt_log": ["..."]
  }
}
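
Given output in this shape, the post-run summary (routes taken, FLASH->PRO escalations, finalizer route) can be pulled out with a few lookups. This is a sketch: `summarize_run` is a hypothetical helper, and reading `judge_source == "heuristic"` as a sign that the Judge fell back is an interpretation of the field, not something the schema states.

```python
def summarize_run(report: dict) -> dict:
    """Extract the facts worth reporting from the router's JSON output."""
    results = report.get("results", [])
    return {
        "routes": [(r["step"], r["route"]) for r in results],
        "escalated_steps": [r["step"] for r in results if r.get("escalated_from_flash")],
        # Assumption: a heuristic judge_source means the LLM Judge was not used.
        "heuristic_judged": [
            s["desc"] for s in report.get("subtasks", [])
            if s.get("assessment", {}).get("judge_source") == "heuristic"
        ],
        "finalizer_route": report.get("finalizer_outcome", {}).get("route"),
    }
```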
| File | Purpose |
|---|---|
| scripts/router.py | Main LangGraph router script |
| SKILL.md | This documentation |
- Keep ROUTER_PLANNER_MODEL=gemma4:26b, but set ROUTER_JUDGE_MODEL=llama3.1:8b.
- For an all-gemma4:26b setup, set ROUTER_JUDGE_MODEL=gemma4:26b, ROUTER_JUDGE_TIMEOUT=600, and ROUTER_MAX_CONCURRENCY=1; expect much longer runs.
- Use --stream and increase the terminal/process timeout if the Planner itself may take longer than 60s.
- Set ROUTER_JUDGE_TIMEOUT=300 or higher only when intentionally using a 20B+ Judge.
- Set ROUTER_PLANNER_MODEL=google-gemini-cli/gemini-3-pro-preview.
- Use --stream plus a longer terminal/process timeout, or choose a smaller planner model.
- Check ollama serve output for errors.
- Set ROUTER_FLASH_MODEL to a larger model.
- Use --stream to monitor real-time progress, and ensure ROUTER_JUDGE_TIMEOUT and terminal timeouts are high enough to prevent external process termination.
- Try ROUTER_PLANNER_MODEL=gemma4:31b (if you have the patience for 90s+ waits).