Error Recovery

Error Recovery - Strategies for handling failures gracefully

Install

openclaw skills install error-recovery

OpenClaw Error Types

| Error            | Cause                 | Recovery                  |
|------------------|-----------------------|---------------------------|
| 429 Rate Limit   | Too many requests     | Wait + retry with backoff |
| 402 Billing      | API credits exhausted | Switch model or wait      |
| 503 Overloaded   | Provider busy         | Wait + retry              |
| Context Overflow | Context window full   | Trigger compaction        |
| Role Ordering    | Corrupted session     | Reset session             |
| Transient HTTP   | Network issue         | Auto-retry (built-in)     |
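
The table above can be read as a small dispatch rule. The following is a minimal sketch of that idea, not OpenClaw's own code: the error shape (`err.status`, `err.code`), the code strings, and the action names are all assumptions made for illustration.

```javascript
// Hypothetical mapping from an error to a recovery action, following the table.
// Assumes errors carry an HTTP-style `status` or a string `code`; neither field
// is a documented OpenClaw API.
function chooseRecovery(err) {
  if (err.status === 429) return "retry-with-backoff";
  if (err.status === 402) return "switch-model-or-wait";
  if (err.status === 503) return "wait-and-retry";
  if (err.code === "CONTEXT_OVERFLOW") return "compact-context";
  if (err.code === "ROLE_ORDERING") return "reset-session";
  return "rethrow"; // unknown errors should surface rather than be swallowed
}

console.log(chooseRecovery({ status: 429 }));
console.log(chooseRecovery({ code: "ROLE_ORDERING" }));
```

Transient HTTP errors are left out of the sketch because the table lists them as retried automatically.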

Error Handling Patterns

Pattern 1: Retry with Backoff

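// Helper: resolve after `ms` milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Helper (assumed error shape): treat 408, 429 and 5xx statuses as retryable
const isRetryable = (err) =>
  err.status === 408 || err.status === 429 || (err.status >= 500 && err.status < 600);

async function retryWithBackoff(fn, maxAttempts = 3) {
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await fn();
    } catch (err) {
      // Retry only retryable errors, and only while attempts remain
      if (isRetryable(err) && i < maxAttempts - 1) {
        await sleep(1000 * Math.pow(2, i)); // waits 1s, then 2s, doubling each retry
        continue;
      }
      throw err;
    }
  }
}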
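
To see the backoff schedule without waiting, the delay function can be injected. This is a sketch under stated assumptions: `retryWithSchedule`, `baseMs`, and `sleepFn` are illustrative names rather than OpenClaw options, and for brevity this variant retries every error instead of filtering by type.

```javascript
// Same retry loop as above, with the delay function injected so the backoff
// schedule can be observed in a test. All parameter names are hypothetical.
async function retryWithSchedule(fn, { maxAttempts = 3, baseMs = 1000, sleepFn } = {}) {
  const sleep = sleepFn ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err; // out of attempts
      await sleep(baseMs * 2 ** attempt); // delay doubles after each failure
    }
  }
}

// Usage: a function that fails twice, then succeeds.
async function demo() {
  const delays = [];
  let calls = 0;
  const flaky = async () => {
    calls++;
    if (calls < 3) throw new Error("transient");
    return "ok";
  };
  const result = await retryWithSchedule(flaky, {
    sleepFn: async (ms) => { delays.push(ms); }, // record the delay instead of waiting
  });
  return { result, calls, delays };
}

demo().then((out) => console.log(out));
```

With three attempts there are only two waits, so the recorded delays are 1000 ms and 2000 ms.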

Pattern 2: Graceful Degradation

async function withFallback(primary, fallback) {
  try {
    return await primary();
  } catch (err) {
    console.warn("Primary failed, trying fallback:", err.message);
    return await fallback();
  }
}

// Usage
const result = await withFallback(
  () => callExpensiveModel(msg),
  () => callCheapModel(msg)
);
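
The same idea extends to more than two options. The sketch below is hypothetical: `withFallbackChain` is not part of OpenClaw, and the handlers are stand-ins for real model calls. It tries each handler in order and rethrows the last error if all of them fail.

```javascript
// Try each handler in order and return the first success.
// If every handler fails, rethrow the last error so it is not lost.
async function withFallbackChain(handlers) {
  let lastError = new Error("No handlers provided");
  for (const handler of handlers) {
    try {
      return await handler();
    } catch (err) {
      console.warn("Handler failed, trying next:", err.message);
      lastError = err;
    }
  }
  throw lastError;
}

// Usage with stand-in functions (hypothetical; replace with real model calls).
async function demoChain() {
  return withFallbackChain([
    async () => { throw new Error("primary unavailable"); },
    async () => { throw new Error("secondary unavailable"); },
    async () => "answer from last-resort model",
  ]);
}

demoChain().then((answer) => console.log(answer));
```

Rethrowing the last error keeps the failure visible when no fallback works, rather than returning an empty result.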

Pattern 3: Circuit Breaker

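class CircuitBreaker {
  // resetTimeoutMs (an added, illustrative default) is how long the breaker
  // stays open before allowing one trial call
  constructor(fn, threshold = 3, resetTimeoutMs = 30000) {
    this.fn = fn;
    this.failures = 0;
    this.threshold = threshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.openedAt = 0;
    this.state = "closed"; // "closed" | "open" | "half-open"
  }

  async execute() {
    if (this.state === "open") {
      if (Date.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error("Circuit breaker open");
      }
      this.state = "half-open"; // cooldown elapsed: let one trial call through
    }
    try {
      const result = await this.fn();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (err) {
      this.failures++;
      // A failed trial call, or too many consecutive failures, opens the circuit
      if (this.state === "half-open" || this.failures >= this.threshold) {
        this.state = "open";
        this.openedAt = Date.now();
      }
      throw err;
    }
  }
}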

Tool Failure Handling

Tools in OpenClaw are non-fatal — if a tool fails, the agent continues:

// In agent-runner, tool errors are caught but don't stop execution.
// `toolPromise` is a placeholder for the promise returned by the tool call.
toolPromise.catch((err) => {
  log(`Tool ${toolName} failed: ${err.message}`);
  // Continue execution
});
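
A self-contained version of this pattern might look like the following. It is a sketch, not the agent-runner source: `runToolSafely` and the `{ ok, value, error }` result shape are assumptions chosen for illustration.

```javascript
// Run a tool and convert any failure into a result object, so the caller can
// keep going. The function name and result shape are hypothetical.
async function runToolSafely(toolName, toolFn, log = console.warn) {
  try {
    return { ok: true, value: await toolFn() };
  } catch (err) {
    log(`Tool ${toolName} failed: ${err.message}`);
    return { ok: false, error: err.message }; // reported to the caller, not thrown
  }
}

// Usage: the first tool fails, the second still runs.
async function demoTools() {
  const results = [];
  results.push(await runToolSafely("search", async () => { throw new Error("timeout"); }));
  results.push(await runToolSafely("calculator", async () => 42));
  return results;
}

demoTools().then((results) => console.log(results));
```

Returning the error as data lets the agent see that a tool failed and decide what to do next, while the failure is still logged.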

Session Corruption Recovery

If a session becomes corrupted:

  1. Detection: Role ordering errors, malformed messages
  2. Action: Session auto-reset by OpenClaw
  3. Recovery: Fresh session, context preserved via MEMORY.md
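
The three steps above can be sketched as follows. OpenClaw performs the reset automatically, so this is only an illustration of the logic, under loud assumptions: the `{ role, content }` message shape, the strict user/assistant alternation rule, and re-seeding the fresh session with the text of MEMORY.md are all hypothetical simplifications.

```javascript
// Detection (simplified): after any system messages, turns must start with
// "user" and alternate roles; non-string content counts as malformed.
function hasRoleOrderingError(messages) {
  const turns = messages.filter((m) => m.role !== "system");
  return turns.some((m, i) => {
    if (typeof m.content !== "string") return true; // malformed message
    if (i === 0) return m.role !== "user";
    return m.role === turns[i - 1].role; // two consecutive turns by the same role
  });
}

// Action + recovery: discard the corrupted history and start a fresh session
// seeded with the saved memory text (assumed to be the contents of MEMORY.md).
function recoverSession(session, memoryText) {
  if (!hasRoleOrderingError(session.messages)) return session;
  return { messages: [{ role: "system", content: memoryText }] };
}

const corrupted = {
  messages: [
    { role: "user", content: "hi" },
    { role: "user", content: "hello?" },
  ],
};
console.log(recoverSession(corrupted, "Project notes from MEMORY.md"));
```

The corrupted history is dropped rather than repaired, which matches the advice to reset instead of trying to fix corrupted state.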

Built-in Recovery Mechanisms

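| Mechanism          | Trigger            | Action              |
|--------------------|--------------------|---------------------|
| Auto-retry         | 408/429/5xx        | Exponential backoff |
| Context overflow   | Token limit        | Trigger compaction  |
| Session corruption | Role error         | Reset session       |
| Model switch       | Persistent failure | Fallback model      |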

Best Practices

  1. Don't suppress errors silently — log them
  2. Prefer retries for transient failures — network/timeout
  3. Reset for corruption — don't try to fix corrupted state
  4. Use fallback models (expensive → cheap, as in Pattern 2)
  5. Keep MEMORY.md updated — for recovery after reset