Install
```bash
openclaw skills install llm-supervisor
```

Graceful rate limit handling with Ollama fallback: notifies on rate limits and offers a switch to a local model, with confirmation required for code tasks.
When I encounter rate limits or overload errors from cloud providers (Anthropic, OpenAI), I notify the user and offer a fallback to the local model.
Before using local models for code generation, ask:

"Cloud is rate-limited. Switch to local Ollama (qwen2.5:7b)? Reply 'yes' to confirm."
For simple queries (chat, summaries), I can switch without confirmation if the user has previously approved it.
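A minimal sketch of this fallback flow in shell, assuming a hypothetical `cloud_query` helper that prints the HTTP status code of the cloud attempt (429 is the standard rate-limit code; Anthropic signals overload with 529). Only the model name and prompt text come from this skill's defaults:

```bash
#!/usr/bin/env bash
# Fallback flow sketch. cloud_query is a hypothetical wrapper around the
# cloud provider's API that prints the HTTP status code of the attempt.
PROMPT="$1"
LOCAL_MODEL="qwen2.5:7b"

status=$(cloud_query "$PROMPT")   # hypothetical helper, not a real CLI

if [ "$status" = "429" ] || [ "$status" = "529" ]; then
  # Rate limited or overloaded: code tasks need explicit confirmation.
  read -r -p "Cloud is rate-limited. Switch to local Ollama (${LOCAL_MODEL})? Reply 'yes' to confirm: " answer
  if [ "$answer" = "yes" ]; then
    echo "$PROMPT" | ollama run "$LOCAL_MODEL"
  else
    echo "Staying on cloud; retry after backoff." >&2
  fi
fi
```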
- `/llm status`: Report the current session state (see the state fields below).
- `/llm switch local`: Manually switch to Ollama for the session.
- `/llm switch cloud`: Switch back to the cloud provider.
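The switch commands only need to persist a provider choice for the session. A minimal sketch, where the state file path and helper name are assumptions rather than part of the skill:

```bash
# Hypothetical per-session state file backing /llm switch.
STATE_FILE="${TMPDIR:-/tmp}/llm-supervisor.state"

llm_switch() {
  case "$1" in
    local|cloud)
      printf 'currentProvider=%s\n' "$1" > "$STATE_FILE"
      echo "Provider set to $1"
      ;;
    *)
      echo "usage: llm_switch local|cloud" >&2
      return 1
      ;;
  esac
}

llm_switch local   # what /llm switch local would do under the hood
```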
```bash
# Check available models
ollama list

# Run a query
ollama run qwen2.5:7b "your prompt here"

# For longer prompts, use stdin
echo "your prompt" | ollama run qwen2.5:7b
```
Check available models with `ollama list`. The configured default is `qwen2.5:7b`.
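A preflight sketch that confirms the daemon is reachable and pulls the default model if it is missing; `ollama list`, `ollama pull`, and `ollama serve` are standard Ollama commands:

```bash
MODEL="qwen2.5:7b"

# Fail early if the Ollama daemon is not running.
if ! ollama list > /dev/null 2>&1; then
  echo "Ollama daemon not reachable; start it with 'ollama serve'." >&2
  exit 1
fi

# Pull the configured default model if it is not installed yet.
if ! ollama list | grep -q "^${MODEL}"; then
  ollama pull "$MODEL"
fi
```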
Track in memory during the session:

- `currentProvider`: "cloud" | "local"
- `lastRateLimitAt`: timestamp or null
- `localConfirmedForCode`: boolean

Reset to cloud at session start.
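A sketch of how that state could be initialized and updated, reusing the hypothetical state file from the switch example above (the field names come from the list; the file path and timestamp format are assumptions):

```bash
STATE_FILE="${TMPDIR:-/tmp}/llm-supervisor.state"

# Reset to the cloud provider at session start.
reset_state() {
  cat > "$STATE_FILE" <<'EOF'
currentProvider=cloud
lastRateLimitAt=
localConfirmedForCode=false
EOF
}

# Record a rate-limit event and flip to the local provider.
# (GNU sed shown; on macOS use `sed -i ''`.)
mark_rate_limited() {
  sed -i "s/^currentProvider=.*/currentProvider=local/" "$STATE_FILE"
  sed -i "s/^lastRateLimitAt=.*/lastRateLimitAt=$(date -u +%FT%TZ)/" "$STATE_FILE"
}

reset_state
cat "$STATE_FILE"   # roughly what /llm status would report
```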