## Install

```bash
openclaw skills install model-fallback
```

Multi-model automatic fallback system for AI agents. Monitors model availability and automatically falls back to backup models when the primary model fails. Supports MiniMax, Kimi, Zhipu, and other OpenAI-compatible APIs.

Use when: (1) the primary model API is unavailable, (2) model responses are too slow, (3) a rate limit has been exceeded, or (4) you want to reduce costs by routing simple tasks to cheaper models.

This skill provides automatic model fallback for OpenClaw agents. When the primary model fails (unavailable, slow, or rate-limited), the skill switches to backup models in a predefined priority order.
## Supported Models

| Provider | Model | Context | Use Case |
|---|---|---|---|
| MiniMax | M2.5 | 200K | Primary (reasoning) |
| MiniMax | M2.1 | 200K | Backup |
| Kimi | K2.5 | 256K | Long documents |
| Kimi | K2 | 128K | Standard |
| Zhipu | GLM-4-Air | 128K | Low cost |
| Zhipu | GLM-4-Flash | 1M | High volume |
## Default Configuration

The default fallback chain is defined in `config.json`:

```json
{
  "fallback_chain": [
    {
      "provider": "minimax-portal",
      "model": "MiniMax-M2.5",
      "priority": 1,
      "timeout": 30,
      "max_retries": 3
    },
    {
      "provider": "moonshot",
      "model": "kimi-k2.5",
      "priority": 2,
      "timeout": 30,
      "max_retries": 2
    },
    {
      "provider": "zhipu",
      "model": "glm-4-air",
      "priority": 3,
      "timeout": 20,
      "max_retries": 2
    }
  ]
}
```
## Environment Variables

| Variable | Required | Description |
|---|---|---|
| MODEL_FALLBACK_ENABLED | No | Enable/disable fallback (default: true) |
| MODEL_FALLBACK_LOG_LEVEL | No | Log level: debug, info, warn, error |
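As a sketch of how these variables might be consumed, applying the documented defaults (the helper name `fallback_settings` is illustrative, not part of the skill):

```python
import os

def fallback_settings():
    """Read skill settings from the environment, applying the documented defaults."""
    enabled = os.environ.get("MODEL_FALLBACK_ENABLED", "true").strip().lower()
    log_level = os.environ.get("MODEL_FALLBACK_LOG_LEVEL", "info").strip().lower()
    if log_level not in ("debug", "info", "warn", "error"):
        raise ValueError(f"unsupported log level: {log_level!r}")
    return {"enabled": enabled == "true", "log_level": log_level}
```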
## Usage

The skill handles model failures automatically; no explicit calls are needed. A helper script is available for manual control:

```bash
# Trigger a model call (fallback happens automatically on failure)

# Force fallback to the next model
/scripts/model-fallback.sh --force-next

# Check current model status
/scripts/model-fallback.sh --status

# Reset to the primary model
/scripts/model-fallback.sh --reset
```
## Customization

Edit `config.json` to customize the fallback chain:

```json
{
  "fallback_chain": [
    {"provider": "...", "model": "...", "priority": 1}
  ],
  "health_check": {
    "enabled": true,
    "interval_seconds": 300
  }
}
```
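A minimal sketch of loading and normalizing such a file, assuming the schema above (`load_fallback_chain` is a hypothetical helper; the defaults filled in mirror the shipped chain):

```python
import json

def load_fallback_chain(path):
    """Load a config.json and return the fallback chain sorted by priority."""
    with open(path) as f:
        cfg = json.load(f)
    chain = sorted(cfg["fallback_chain"], key=lambda entry: entry["priority"])
    for entry in chain:
        # Optional fields fall back to conservative defaults.
        entry.setdefault("timeout", 30)
        entry.setdefault("max_retries", 2)
    return chain
```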
## How It Works

1. User makes request with primary model
2. Model call fails (error, timeout, rate limit)
3. Skill detects failure
4. Wait 3 seconds (debounce)
5. Switch to next model in chain
6. Retry request with new model
7. If successful, return result
8. If failed, repeat steps 4-7
9. If all models fail, return error with details
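The steps above can be sketched roughly as follows; `call_model` stands in for whatever transport the skill uses and is an assumption, not its real API:

```python
import time

DEBOUNCE_SECONDS = 3  # step 4: wait before retrying or moving on

def call_with_fallback(chain, request, call_model, sleep=time.sleep):
    """Walk the chain in priority order, retrying each model before falling back.

    `call_model(entry, request)` should return a response, or raise on
    error, timeout, or rate limit (steps 2-3).
    """
    failures = []
    for entry in chain:                                   # step 5: next model in chain
        for _ in range(entry.get("max_retries", 1)):
            try:
                return call_model(entry, request)         # steps 1, 6, 7
            except Exception as exc:
                failures.append((entry["model"], str(exc)))
                sleep(DEBOUNCE_SECONDS)                   # step 4: debounce
    # step 9: every model in the chain failed
    raise RuntimeError(f"all models failed: {failures}")
```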
## Fallback Triggers

| Trigger | Condition | Action |
|---|---|---|
| API Unavailable | Connection timeout | Fallback |
| Rate Limit | 429 response | Fallback + wait |
| Slow Response | > timeout seconds | Fallback |
| Invalid Response | Parse error | Fallback |
| Auth Error | 401/403 response | Log + stop |
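The table maps onto a small decision function; this sketch encodes only the rows above (the action names are illustrative):

```python
def classify_failure(status=None, timed_out=False, parse_error=False):
    """Return the action for a failed model call, per the trigger table."""
    if status in (401, 403):
        return "log_and_stop"        # auth errors: retrying won't help
    if status == 429:
        return "fallback_and_wait"   # rate limit: back off before the next model
    if timed_out or parse_error or status is None:
        return "fallback"            # unavailable, slow, or unparseable response
    return "fallback"                # assumption: other failures also fall back
```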
## Logging

Logs are written to `~/.openclaw/logs/model-fallback.log`:

```
[2026-02-27 14:00:00] [INFO] Primary model MiniMax-M2.5 called
[2026-02-27 14:00:05] [WARN] Model failed: rate limit exceeded
[2026-02-27 14:00:05] [INFO] Falling back to Kimi K2.5
[2026-02-27 14:00:10] [INFO] Fallback successful
```
## Cost Optimization

Use cheaper models for simple tasks with task-based routing:

```json
{
  "task_routing": {
    "simple_query": ["glm-4-air", "glm-4-flash"],
    "complex_reasoning": ["MiniMax-M2.5", "kimi-k2.5"],
    "long_context": ["kimi-k2.5", "MiniMax-M2.1"]
  }
}
```
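A routing table like this could be consulted as follows (a sketch; `route_task` is not part of the skill's API):

```python
# Mirrors the task_routing table above: preferred models per task type.
ROUTES = {
    "simple_query": ["glm-4-air", "glm-4-flash"],
    "complex_reasoning": ["MiniMax-M2.5", "kimi-k2.5"],
    "long_context": ["kimi-k2.5", "MiniMax-M2.1"],
}

def route_task(task_type, available):
    """Pick the first preferred model that is currently available."""
    for model in ROUTES.get(task_type, []):
        if model in available:
            return model
    return None  # caller should fall back to the default chain
```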
## Integration

Add to `openclaw.json`:

```json
{
  "models": {
    "mode": "merge",
    "fallback": {
      "enabled": true,
      "config": "~/.openclaw/skills/model-fallback/config.json"
    }
  }
}
```
Integrate with system health monitoring:

```bash
# Check model health
curl http://localhost:18789/api/models/health
```
## Troubleshooting

```bash
# Verify the skill is enabled
echo $MODEL_FALLBACK_ENABLED

# Confirm the config file exists
ls ~/.openclaw/skills/model-fallback/config.json

# Watch the fallback log
tail -f ~/.openclaw/logs/model-fallback.log
```

## Examples

```
User: "Hello"
System: Using MiniMax-M2.5...
System: Rate limited, switching to Kimi K2.5...
System: Response from Kimi K2.5: "Hello! How can I help?"
```

```
User: "What is 2+2?"
System: Routing to glm-4-air (low cost)...
System: Response: "2+2=4"
```

```
User: "Summarize this 100-page PDF"
System: Detected long context requirement
System: Routing to Kimi K2.5 (256K context)...
System: Processing...
```
License: MIT
Author: CC (AI Assistant)
Version: 1.0.0