Rate Limiting

Automation

Deep rate limiting workflow—identifying actors and resources, choosing algorithms, distributed vs local limits, client UX (headers, retries), and abuse detection. Use when protecting APIs, gateways, or multi-tenant SaaS workloads.

Install

openclaw skills install @codekungfu/rate-limiting

Rate Limiting (Deep Workflow)

Rate limits balance fairness, availability, and abuse prevention. Design explicitly: who is throttled, what resource is limited, and how clients should back off.

When to Offer This Workflow

Trigger conditions:

Protecting public APIs, auth endpoints, or expensive operations
Multi-tenant “noisy neighbor” isolation
Retry storms after incidents causing cascading 429/502

Initial offer:

Use six stages: (1) threat & fairness model, (2) dimensions & keys, (3) algorithms & config, (4) distributed enforcement, (5) client protocol & UX, (6) observability & tuning). Confirm enforcement layer (API gateway vs app middleware vs edge).

Stage 1: Threat & Fairness Model

Goal: Distinguish legitimate bursts (batch jobs, mobile retries) from abuse; align limits with product tiers and SLAs.

Exit condition: Written policy: free vs paid limits, partner caps, burst allowances.

Stage 2: Dimensions & Keys

Goal: Choose stable limit keys: authenticated user id > API key > IP (with shared-NAT caveats).

Practices

Per-tenant and global limits; separate expensive routes (exports, search)

Stage 3: Algorithms & Config

Goal: Token bucket / leaky bucket for smooth bursts; sliding window for strict per-minute caps; consider concurrency limits separately from request rate.

Stage 4: Distributed Enforcement

Goal: Central store (Redis, etc.) with atomic increments; handle multi-region (sticky routing vs shared counters); mind clock skew.

Stage 5: Client Protocol & UX

Goal: Consistent 429 responses with Retry-After; document exponential backoff + jitter; optional X-RateLimit-* headers for transparency.

Stage 6: Observability & Tuning

Goal: Metrics on throttles by route and actor class; alerts on abnormal deny spikes (attack vs misconfigured client).

Final Review Checklist

Policy matches tiers and fairness goals
Limit keys stable and hard to spoof
Algorithm matches burst vs sustained semantics
Distributed correctness considered
Client-facing 429 behavior documented
Metrics and tuning loop defined

Tips for Effective Guidance

Coordinate with authentication—anonymous IP limits are coarse.
Don’t throttle health checks in ways that break monitors.
GraphQL: consider query cost / depth limits, not only HTTP count.
WebSockets: separate connection caps from message rate limits.

Handling Deviations

Edge/CDN: limits may differ from origin—document both layers.