## Install

```shell
openclaw skills install agent-debugger
```

Debug AI agent issues systematically. Covers tool failures, infinite loops, context overflow, rate limits, and performance bottlenecks. Use when agents misbehave, loop infinitely, fail tool calls, hit limits, or produce unexpected output. Triggers on "debug", "fix agent", "agent stuck", "agent looping", "tool failed", "rate limit".
## Agent stuck in a loop

Symptoms: the agent repeats the same action with no progress.
Diagnosis:
Agent log shows:
- Same tool called 10+ times
- Same output format repeated
- No progress between iterations
Fixes:
Add an iteration limit:

```json
{
  "maxIterations": 5,
  "onLimit": "ask_user"
}
```
Add explicit stop condition:
In your instructions, add:
"If you've tried the same approach 3 times without success, stop and ask the user for guidance."
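The same guardrail can be enforced in code. This is a minimal sketch (a hypothetical helper, not an OpenClaw API) that flags when the same tool is called with the same arguments too many times:

```python
from collections import Counter

class LoopDetector:
    """Flag repeated identical tool calls (hypothetical guardrail)."""
    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.calls = Counter()

    def record(self, tool_name, args):
        # Key on the tool name plus a stable rendering of its arguments
        key = (tool_name, repr(sorted(args.items())))
        self.calls[key] += 1
        return self.calls[key] > self.max_repeats  # True => likely looping
```

If `record()` returns True, stop and ask the user for guidance instead of retrying.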
## Tool calls failing

Symptoms: a tool call returns an error or no result.
Diagnosis:
Check:
- Tool exists in available_tools
- Parameters match tool schema
- Tool has required permissions
- Rate limits not exceeded
Fixes:
Validate parameters first:

```python
# Before calling the tool, check that all required parameters are present
required_params = tool.get("required", [])
for param in required_params:
    if param not in args:
        raise ValueError(f"Missing required parameter: {param}")
```
Add retry logic:

```json
{
  "retries": 3,
  "retryDelay": 1000,
  "retryOn": ["rate_limit", "timeout", "5xx"]
}
```
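Where declarative retry config is not available, the same policy can be approximated in code. A sketch: `ToolError` and its `kind` field are assumptions standing in for whatever errors your tool layer raises.

```python
import time

class ToolError(Exception):
    """Hypothetical tool failure carrying a `kind` label."""
    def __init__(self, kind):
        super().__init__(kind)
        self.kind = kind

RETRYABLE = {"rate_limit", "timeout", "5xx"}

def call_with_retries(call_tool, retries=3, retry_delay_ms=1000):
    # Mirror the config above: retry transient errors, fail fast otherwise
    last_error = None
    for attempt in range(retries + 1):
        try:
            return call_tool()
        except ToolError as e:
            last_error = e
            if e.kind not in RETRYABLE:
                raise  # permanent error: do not retry
            time.sleep(retry_delay_ms / 1000)
    raise last_error
```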
## Context overflow

Symptoms: the agent forgets earlier instructions or fails with context_length_exceeded.
Diagnosis:
Check context window:
- Current tokens vs max tokens
- Number of messages in history
- Size of file contents loaded
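To put a number on the first check, a rough estimate helps. This is a heuristic sketch; real token counts depend on the model's tokenizer.

```python
def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for English prose
    return max(1, len(text) // 4)

def context_usage(messages, max_tokens=200_000):
    """Return (tokens_used, fraction_of_window) for a list of message dicts."""
    used = sum(estimate_tokens(m["content"]) for m in messages)
    return used, used / max_tokens
```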
Fixes:
Use memory efficiently:
- Load only relevant files
- Use offset/limit for large files
- Summarize long conversations
- Clear old context periodically
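Clearing old context can be approximated by keeping only the newest messages within a budget. A sketch, reusing the rough ~4 characters/token heuristic:

```python
def trim_history(messages, budget_tokens=8000):
    """Keep the newest messages that fit within a rough token budget."""
    kept, used = [], 0
    for msg in reversed(messages):            # newest first
        cost = max(1, len(msg["content"]) // 4)
        if used + cost > budget_tokens:
            break                             # older messages no longer fit
        kept.append(msg)
        used += cost
    return list(reversed(kept))               # restore chronological order
```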
Compress context:

```python
# Instead of the full file, read a bounded slice
content = read("file.txt", offset=1, limit=100)

# Use memory_search for specific info
results = memory_search("important decision")
```
## Rate limits

Symptoms: requests fail with rate_limit_exceeded errors.
Diagnosis:
Check:
- API rate limits (requests per minute/hour)
- Token limits (tokens per minute)
- Concurrent request limits
- Time until reset
Fixes:
Add backoff:

```python
import time
import random

def call_with_backoff(func, max_retries=5):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:  # whatever your API client raises on 429
            # Exponential backoff with jitter: ~1s, 2s, 4s, 8s, 16s
            wait = (2 ** attempt) + random.random()
            time.sleep(wait)
    raise Exception("Max retries exceeded")
```
Queue requests:

```python
import time
from queue import Queue
from threading import Thread

request_queue = Queue()

def process_queue():
    while True:
        task = request_queue.get()
        result = execute(task)      # execute() is your tool-call wrapper
        request_queue.task_done()
        time.sleep(0.1)             # Throttle to ~10 requests/second

# Start the worker in the background
Thread(target=process_queue, daemon=True).start()
```
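The fixed `sleep(0.1)` enforces an average rate but handles bursts poorly. A token bucket (a common alternative, sketched here, not part of the skill) allows short bursts while capping the sustained rate:

```python
import time

class TokenBucket:
    """Allow up to `rate` requests/second, with bursts up to `capacity`."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def acquire(self):
        # Refill proportionally to elapsed time, then try to spend one token
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Call `acquire()` before each request; on False, wait briefly and retry.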
## Memory not working

Symptoms: the agent forgets prior sessions, decisions, or todos.
Diagnosis:
Check:
- MEMORY.md exists
- memory/ directory exists
- Files have correct permissions
- Memory loaded at startup
Fixes:
Verify memory setup:

```shell
ls -la ~/.openclaw/workspace/
# Should show:
#   MEMORY.md
#   memory/
```
Add memory to instructions:

"Before answering anything about prior work, decisions, dates, people, or todos, run memory_search on MEMORY.md and memory/*.md."
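If the files are missing, the scaffold can be created directly (assuming the default workspace path shown above):

```shell
# Create the memory directory and an empty long-term memory file
mkdir -p ~/.openclaw/workspace/memory
touch ~/.openclaw/workspace/MEMORY.md
```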
## Permission denied

Symptoms: operations fail with permission or access errors.
Diagnosis:
Check:
- User permissions
- File permissions
- Tool policies
- Sandbox restrictions
Fixes:
Check file permissions:

```shell
ls -la /path/to/file
chmod 600 ~/.openclaw/workspace/sensitive.json
```
Review tool policies:

```jsonc
{
  "tools": {
    "exec": {
      "security": "ask",   // or "allowlist" or "full"
      "ask": "on-miss"     // or "always" or "off"
    }
  }
}
```
## Slow performance

Symptoms: responses take too long or the agent times out.
Diagnosis:
Profile the agent:
- Time each tool call
- Count tokens used
- Measure context growth
- Identify bottlenecks
Fixes:
Optimize context:

```python
# Instead of loading the entire file
content = read("large_file.txt", limit=50)

# Use targeted search
results = memory_search("specific topic")
```
Reduce tool calls:

```python
# Bad: three sequential calls
file1 = read("file1.txt")
file2 = read("file2.txt")
file3 = read("file3.txt")

# Good: one combined (or parallel) call
files = read(["file1.txt", "file2.txt", "file3.txt"])
```
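Timing each tool call, as the profiling checklist suggests, takes only a small wrapper. A sketch: `slow_read` is a stand-in for a real tool function, and the `print` would be your logger.

```python
import time
from functools import wraps

def timed(tool_fn):
    """Measure and report how long each call to tool_fn takes."""
    @wraps(tool_fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return tool_fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{tool_fn.__name__} took {elapsed * 1000:.1f} ms")
    return wrapper

@timed
def slow_read(path):
    time.sleep(0.01)   # stand-in for a real tool call
    return f"contents of {path}"
```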
## Debugging workflow

Reproduce:
1. Document the exact steps that trigger the issue
2. Note expected vs. actual behavior
3. Check whether the issue is consistent or intermittent
4. Try with a minimal example

Isolate:
1. Disable other skills
2. Reduce context to a minimum
3. Simplify the task
4. Test each component separately

Inspect:
1. Check logs (if available)
2. Review tool outputs
3. Examine the context window
4. Verify configuration

Fix and verify:
1. Apply the fix
2. Test the fix
3. Document the fix
4. Update instructions if needed

Prevent recurrence:
1. Add guardrails
2. Update error handling
3. Add logging
4. Document in memory
Check session status:

```python
# If you have access to session tools
status = session_status()
print(f"Model: {status['model']}")
print(f"Tokens used: {status['usage']['total_tokens']}")
print(f"Reasoning: {status['reasoning']}")
```
If agent is stuck:
1. Start new session
2. Load only essential memory
3. Re-approach task fresh
Enable verbose reasoning:

```json
{
  "thinking": "verbose",
  "reasoning": "on"
}
```
This shows internal reasoning, helping identify where logic fails.
| Error | Cause | Fix |
|---|---|---|
| `context_length_exceeded` | Too much context | Compress, summarize, limit |
| `rate_limit_exceeded` | Too many requests | Backoff, queue, wait |
| `tool_not_found` | Wrong tool name | Check spelling, install skill |
| `permission_denied` | Insufficient access | Check permissions, ask user |
| `invalid_parameters` | Wrong params | Validate against schema |
| `timeout` | Slow response | Increase timeout, optimize |
| `memory_not_found` | No memory files | Create MEMORY.md |
Check preconditions and handle expected errors:

```python
import os

def guarded_step(file):
    # Always check before acting
    if not os.path.exists(file):
        return "File not found"
    try:
        result = risky_operation()
    except ExpectedError:
        result = handle_error()
    return result
```
In agent instructions:
"Track your progress. After each major step, note what you've done and what's next."
For long tasks:
- Save progress periodically
- Document current state
- Allow resuming from checkpoint
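A minimal checkpoint file covers all three points. A sketch: the path and state shape are illustrative, not a skill convention.

```python
import json
import os

CHECKPOINT = "checkpoint.json"  # illustrative location

def save_checkpoint(state):
    # Persist progress after each major step
    with open(CHECKPOINT, "w") as f:
        json.dump(state, f)

def load_checkpoint():
    # Resume from the last saved state, or start fresh
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"step": 0, "done": []}
```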
Add logging to critical operations:

```python
# Add to critical operations
log(f"Starting operation: {operation}")
log(f"Parameters: {params}")
log(f"Result: {result}")
log(f"Error: {error}")
```
Ask the user when: