{"skill":{"slug":"ralph-loops","displayName":"Ralph Loops","summary":"Runs autonomous iterative AI loops for requirements, planning, or building phases using structured prompts and fresh context per iteration.","description":"# Ralph Loops Skill\n\n> **First time?** Read [SETUP.md](./SETUP.md) first to install dependencies and verify your setup.\n\nAutonomous AI agent loops for iterative development. Based on Geoffrey Huntley's Ralph Wiggum technique, as documented by Clayton Farr.\n\n**Script:** `skills/ralph-loops/scripts/ralph-loop.mjs`\n**Dashboard:** `skills/ralph-loops/dashboard/` (run with `node server.mjs`)\n**Templates:** `skills/ralph-loops/templates/`\n**Archive:** `~/clawd/logs/ralph-archive/`\n\n---\n\n## ⚠️ Known Issues\n\n### Claude Code Version Compatibility\n\n**Claude Code 2.1.29 has a critical bug** that spawns orphaned sub-agents consuming 99% CPU. Iterations fail with \"exit code null\" on first run.\n\n**Fix:** Downgrade to 2.1.25:\n```bash\nnpm install -g @anthropic-ai/claude-code@2.1.25\n```\n\n**Verify:**\n```bash\nclaude --version  # Should show 2.1.25\n```\n\nThis was discovered 2026-02-01. Check if newer versions fix the issue before upgrading.\n\n---\n\n## ⚠️ Don't Block the Conversation!\n\nWhen running a Ralph loop, **don't monitor it synchronously**. The loop runs as a separate Claude CLI process — you can keep chatting.\n\n**❌ Wrong (blocks conversation):**\n```\nStart loop → sleep 60 → poll → sleep 60 → poll → ... (6 minutes of silence)\n```\n\n**✅ Right (stays responsive):**\n```\nStart loop → \"It's running, I'll check periodically\" → keep chatting → check on heartbeats\n```\n\n**How to monitor without blocking:**\n1. Start the loop with `node ralph-loop.mjs ...` (runs in background)\n2. Tell human: \"Loop running. I'll check progress periodically or you can ask.\"\n3. Check via `process poll <sessionId>` when asked or during heartbeats\n4. 
Use the dashboard at http://localhost:3939 for real-time visibility\n\n**The loop is autonomous** — that's the whole point. Don't babysit it at the cost of ignoring your human.\n\n---\n\n## Trigger Phrases\n\nWhen human says:\n\n| Phrase | Action |\n|--------|--------|\n| **\"Interview me about system X\"** | Start Phase 1 requirements interview |\n| **\"Start planning system X\"** | Run `./loop.sh plan` (needs specs first) |\n| **\"Start building system X\"** | Run `./loop.sh build` (needs plan first) |\n| **\"Ralph loop over X\"** | **ASK which phase** (see below) |\n\n### When Human Says \"Ralph Loop\" — Clarify the Phase!\n\nDon't assume which phase. Ask:\n\n> \"Which type of Ralph loop are we doing?\n> \n> 1️⃣ **Interview** — I'll ask you questions to build specs (Phase 1)\n> 2️⃣ **Planning** — I'll iterate on an implementation plan (Phase 2)  \n> 3️⃣ **Building** — I'll implement from a plan, one task per iteration (Phase 3)\n> 4️⃣ **Generic** — Simple iterative refinement on a single topic\"\n\n**Then proceed based on their answer:**\n\n| Choice | Action |\n|--------|--------|\n| Interview | Use `templates/requirements-interview.md` protocol |\n| Planning | Need specs first → run planning loop with `PROMPT_plan.md` |\n| Building | Need plan first → run build loop with `PROMPT_build.md` |\n| Generic | Create prompt file, run `ralph-loop.mjs` directly |\n\n### Generic Ralph Loop Flow (Phase 4)\n\nFor simple iterative refinement (not full system builds):\n\n1. **Clarify the task** — What exactly should be improved/refined?\n2. **Create a prompt file** — Save to `/tmp/ralph-prompt-<task>.md`\n3. **Set completion criteria** — What signals \"done\"?\n4. **Run the loop:**\n   ```bash\n   node skills/ralph-loops/scripts/ralph-loop.mjs \\\n     --prompt \"/tmp/ralph-prompt-<task>.md\" \\\n     --model opus \\\n     --max 10 \\\n     --done \"RALPH_DONE\"\n   ```\n5. 
**Or spawn as sub-agent** for long-running tasks\n\n---\n\n## Core Philosophy\n\n> \"Human roles shift from 'telling the agent what to do' to 'engineering conditions where good outcomes emerge naturally through iteration.'\"\n> — Clayton Farr\n\nThree principles drive everything:\n\n1. **Context is scarce** — With ~176K usable tokens from a 200K window, keep each iteration lean\n2. **Plans are disposable** — A drifting plan is cheaper to regenerate than salvage\n3. **Backpressure beats direction** — Engineer environments where wrong outputs get rejected automatically\n\n---\n\n## Three-Phase Workflow\n\n```\n┌─────────────────────────────────────────────────────────────────────┐\n│  Phase 1: REQUIREMENTS                                              │\n│  Human + LLM conversation → JTBD → Topics → specs/*.md              │\n├─────────────────────────────────────────────────────────────────────┤\n│  Phase 2: PLANNING                                                  │\n│  Gap analysis (specs vs code) → IMPLEMENTATION_PLAN.md              │\n├─────────────────────────────────────────────────────────────────────┤\n│  Phase 3: BUILDING                                                  │\n│  One task per iteration → fresh context → backpressure → commit     │\n└─────────────────────────────────────────────────────────────────────┘\n```\n\n### Phase 1: Requirements (Talk to Human)\n\n**Goal:** Understand what to build BEFORE building it.\n\nThis is the most important phase. Use structured conversation to:\n\n1. **Identify Jobs to Be Done (JTBD)**\n   - What user need or outcome are we solving?\n   - Not features — outcomes\n\n2. **Break JTBD into Topics of Concern**\n   - Each topic = one distinct aspect/component\n   - Use the \"one sentence without 'and'\" test\n   - ✓ \"The color extraction system analyzes images to identify dominant colors\"\n   - ✗ \"The user system handles authentication, profiles, and billing\" → 3 topics\n\n3. 
**Create Specs for Each Topic**\n   - One markdown file per topic in `specs/`\n   - Capture requirements, acceptance criteria, edge cases\n\n**Template:** `templates/requirements-interview.md`\n\n### Phase 2: Planning (Gap Analysis)\n\n**Goal:** Create a prioritized task list without implementing anything.\n\nUses `PROMPT_plan.md` in the loop:\n- Study all specs\n- Study existing codebase\n- Compare specs vs code (gap analysis)\n- Generate `IMPLEMENTATION_PLAN.md` with prioritized tasks\n- **NO implementation** — planning only\n\nUsually completes in 1-2 iterations.\n\n### Phase 3: Building (One Task Per Iteration)\n\n**Goal:** Implement tasks one at a time with fresh context.\n\nUses `PROMPT_build.md` in the loop:\n1. Read `IMPLEMENTATION_PLAN.md`\n2. Pick the most important task\n3. Investigate codebase (don't assume not implemented)\n4. Implement\n5. Run validation (backpressure)\n6. Update plan, commit\n7. Exit → fresh context → next iteration\n\n**Key insight:** One task per iteration keeps context lean. 
The agent stays in the \"smart zone\" instead of accumulating cruft.\n\n**Why fresh context matters:**\n- **No accumulated mistakes** — Each iteration starts clean; previous errors don't compound\n- **Full context budget** — 200K tokens for THIS task, not shared with finished work\n- **Reduced hallucination** — Shorter contexts = more grounded responses\n- **Natural checkpoints** — Each commit is a save point; easy to revert single iterations\n\n---\n\n## File Structure\n\n```\nproject/\n├── loop.sh                    # Ralph loop script\n├── PROMPT_plan.md             # Planning mode instructions\n├── PROMPT_build.md            # Building mode instructions  \n├── AGENTS.md                  # Operational guide (~60 lines max)\n├── IMPLEMENTATION_PLAN.md     # Prioritized task list (generated)\n└── specs/                     # Requirement specs\n    ├── topic-a.md\n    ├── topic-b.md\n    └── ...\n```\n\n### File Purposes\n\n| File | Purpose | Who Creates |\n|------|---------|-------------|\n| `specs/*.md` | Source of truth for requirements | Human + Phase 1 |\n| `PROMPT_plan.md` | Instructions for planning mode | Copy from template |\n| `PROMPT_build.md` | Instructions for building mode | Copy from template |\n| `AGENTS.md` | Build/test/lint commands | Human + Ralph |\n| `IMPLEMENTATION_PLAN.md` | Task list with priorities | Ralph (Phase 2) |\n\n### Project Organization (Systems)\n\nFor Clawdbot systems, each Ralph project lives in `<workspace>/systems/<name>/`:\n\n```\nsystems/\n├── health-tracker/           # Example system\n│   ├── specs/\n│   │   ├── daily-tracking.md\n│   │   └── test-scheduling.md\n│   ├── PROMPT_plan.md\n│   ├── PROMPT_build.md\n│   ├── AGENTS.md\n│   ├── IMPLEMENTATION_PLAN.md  # ← exists = past Phase 1\n│   └── src/\n└── activity-planner/\n    ├── specs/                  # ← empty = still in Phase 1\n    └── ...\n```\n\n### Phase Detection (Auto)\n\nDetect current phase by checking what files exist:\n\n| What Exists | Current Phase | Next 
Action |\n|-------------|---------------|-------------|\n| Nothing / empty `specs/` | Phase 1: Requirements | Run requirements interview |\n| `specs/*.md` but no `IMPLEMENTATION_PLAN.md` | Ready for Phase 2 | Run `./loop.sh plan` |\n| `specs/*.md` + `IMPLEMENTATION_PLAN.md` | Phase 2 or 3 | Review plan, run `./loop.sh build` |\n| Plan shows all tasks complete | Done | Archive or iterate |\n\n**Quick check:**\n```bash\n# What phase are we in?\n[ -z \"$(ls specs/ 2>/dev/null)\" ] && echo \"Phase 1: Need specs\" && exit\n[ ! -f IMPLEMENTATION_PLAN.md ] && echo \"Phase 2: Need plan\" && exit\necho \"Phase 3: Ready to build (or done)\"\n```\n\n---\n\n## JTBD Breakdown\n\nThe hierarchy matters:\n\n```\nJTBD (Job to Be Done)\n└── Topic of Concern (1 per spec file)\n    └── Tasks (many per topic, in IMPLEMENTATION_PLAN.md)\n```\n\n**Example:**\n- **JTBD:** \"Help designers create mood boards\"\n- **Topics:**\n  - Image collection → `specs/image-collection.md`\n  - Color extraction → `specs/color-extraction.md`\n  - Layout system → `specs/layout-system.md`\n  - Sharing → `specs/sharing.md`\n- **Tasks:** Each spec generates multiple implementation tasks\n\n### Topic Scope Test\n\n> Can you describe the topic in one sentence without \"and\"?\n\nIf you need \"and\" or \"also\", it's probably multiple topics. 
Split it.\n\n**When to split:**\n- Multiple verbs in the description → separate topics\n- Different user personas involved → separate topics\n- Could be implemented by different teams → separate topics\n- Has its own failure modes → probably its own topic\n\n**Example split:**\n```\n❌ \"User management handles registration, authentication, profiles, and permissions\"\n\n✅ Split into:\n   - \"Registration creates new user accounts from email/password\"\n   - \"Authentication verifies user identity via login flow\"  \n   - \"Profiles let users view and edit their information\"\n   - \"Permissions control what actions users can perform\"\n```\n\n**Counter-example (don't split):**\n```\n✅ Keep together:\n   \"Color extraction analyzes images and returns dominant color palettes\"\n   \n   Why: \"analyzes\" and \"returns\" are steps in one operation, not separate concerns.\n```\n\n---\n\n## Backpressure Mechanisms\n\nAutonomous loops converge when wrong outputs get rejected. Three layers:\n\n### 1. Downstream Gates (Hard)\nTests, type-checking, linting, build validation. Deterministic.\n```markdown\n# In AGENTS.md\n## Validation\n- Tests: `npm test`\n- Typecheck: `npm run typecheck`\n- Lint: `npm run lint`\n```\n\n### 2. Upstream Steering (Soft)\nExisting code patterns guide the agent. It discovers conventions through exploration.\n\n### 3. LLM-as-Judge (Subjective)\nFor subjective criteria (tone, UX, aesthetics), use another LLM call with binary pass/fail.\n\n> Start with hard gates. 
Add LLM-as-judge for subjective criteria only after mechanical backpressure works.\n\n---\n\n## Prompt Structure\n\nGeoffrey's prompts follow a numbered pattern:\n\n| Section | Purpose |\n|---------|---------|\n| 0a-0d | **Orient:** Study specs, source, current plan |\n| 1-4 | **Main instructions:** What to do this iteration |\n| 999+ | **Guardrails:** Invariants (higher number = more critical) |\n\n### The Numbered Guardrails Pattern\n\nGuardrails use escalating numbers (99999, 999999, 9999999...) to signal priority:\n\n```markdown\n99999. Important: Capture the why in documentation.\n\n999999. Important: Single sources of truth, no migrations.\n\n9999999. Create git tags after successful builds.\n\n99999999. Add logging if needed to debug.\n\n999999999. Keep IMPLEMENTATION_PLAN.md current.\n```\n\n**Why this works:**\n1. **Visual prominence** — Large numbers stand out, harder to skip\n2. **Implicit priority** — More 9s = more critical (like DEFCON levels in reverse)\n3. **No collisions** — Sparse numbering lets you insert new rules without renumbering\n4. **Mnemonic** — Claude treats these as invariants, not suggestions\n\n**The \"Important:\" prefix** is deliberate — it triggers Claude's attention.\n\n### Key Language Patterns\n\nUse Geoffrey's specific phrasing — it matters:\n\n- \"study\" (not \"read\" or \"look at\")\n- \"don't assume not implemented\" (critical!)\n- \"using parallel subagents\" / \"up to N subagents\"\n- \"only 1 subagent for build/tests\" (backpressure control)\n- \"Ultrathink\" (deep reasoning trigger)\n- \"capture the why\"\n- \"keep it up to date\"\n- \"resolve them or document them\"\n\n---\n\n## Quick Start\n\n### 1. Set Up Project Structure\n\n```bash\nmkdir -p myproject/specs\ncd myproject\ngit init  # Ralph expects git for commits\n\n# Copy templates\ncp .//templates/PROMPT_plan.md .\ncp .//templates/PROMPT_build.md .\ncp .//templates/AGENTS.md .\ncp .//templates/loop.sh .\nchmod +x loop.sh\n```\n\n### 2. 
Customize Templates (Required!)\n\n**PROMPT_plan.md** — Replace `[PROJECT_GOAL]` with your actual goal:\n```markdown\n# Before:\nULTIMATE GOAL: We want to achieve [PROJECT_GOAL].\n\n# After:\nULTIMATE GOAL: We want to achieve a fully functional mood board app with image upload and color extraction.\n```\n\n**PROMPT_build.md** — Adjust source paths if not using `src/`:\n```markdown\n# Before:\n0c. For reference, the application source code is in `src/*`.\n\n# After:\n0c. For reference, the application source code is in `lib/*`.\n```\n\n**AGENTS.md** — Update build/test/lint commands for your stack.\n\n### 3. Phase 1: Requirements Gathering (Don't Skip!)\n\nThis phase happens WITH the human. Use the interview template:\n\n```bash\ncat .//templates/requirements-interview.md\n```\n\n**The workflow:**\n1. Discuss the JTBD (Job to Be Done) — outcomes, not features\n2. Break into Topics of Concern (each passes the \"one sentence\" test)\n3. Write a spec file for each topic: `specs/topic-name.md`\n4. Human reviews and approves specs\n\n**Example output:**\n```\nspecs/\n├── image-collection.md\n├── color-extraction.md\n├── layout-system.md\n└── sharing.md\n```\n\n### 4. Phase 2: Planning\n\n```bash\n./loop.sh plan\n```\n\nWait for `IMPLEMENTATION_PLAN.md` to be generated (usually 1-2 iterations). Review it — this is your task list.\n\n### 5. Phase 3: Building\n\n```bash\n./loop.sh build 20  # Max 20 iterations\n```\n\nWatch it work. Add backpressure (tests, lints) as patterns emerge. 
Check commits for progress.\n\n---\n\n## Loop Script Options\n\n```bash\n./loop.sh              # Build mode, unlimited\n./loop.sh 20           # Build mode, max 20 iterations\n./loop.sh plan         # Plan mode, unlimited\n./loop.sh plan 5       # Plan mode, max 5 iterations\n```\n\nOr use the Node.js wrapper for more control:\n\n```bash\nnode skills/ralph-loops/scripts/ralph-loop.mjs \\\n  --prompt \"./PROMPT_build.md\" \\\n  --model opus \\\n  --max 20 \\\n  --done \"RALPH_DONE\"\n```\n\n---\n\n## When to Regenerate the Plan\n\nPlans drift. Regenerate when:\n\n- Ralph is going off track (implementing wrong things)\n- Plan feels stale or doesn't match current state\n- Too much clutter from completed items\n- You've made significant spec changes\n- You're confused about what's actually done\n\nJust switch back to planning mode:\n\n```bash\n./loop.sh plan\n```\n\nRegeneration costs one planning loop, which is cheap compared to Ralph going in circles.\n\n---\n\n## Safety\n\nRalph requires `--dangerously-skip-permissions` to run autonomously. This bypasses Claude's permission system entirely.\n\n**Philosophy:** \"It's not if it gets popped, it's when. And what is the blast radius?\"\n\n**Protections:**\n- Run in isolated environments (Docker, VM)\n- Provide only the API keys needed for the task\n- Grant no access to private data beyond what the task requires\n- Restrict network connectivity where possible\n- **Escape hatches:** Ctrl+C stops the loop; `git reset --hard` reverts uncommitted changes\n\n---\n\n## Cost Expectations\n\n| Task Type | Model | Iterations | Est. Cost |\n|-----------|-------|------------|-----------|\n| Generate plan | Opus | 1-2 | $0.50-1.00 |\n| Implement simple feature | Opus | 3-5 | $1.00-2.00 |\n| Implement complex feature | Opus | 10-20 | $3.00-8.00 |\n| Full project buildout | Opus | 50+ | $15-50+ |\n\n**Tip:** Use Sonnet for simpler tasks where the plan is clear. 
Use Opus for planning and complex reasoning.\n\n---\n\n## Real-World Results\n\nFrom Geoffrey Huntley:\n- 6 repos generated overnight at YC hackathon\n- $50k contract completed for $297 in API costs\n- Created entire programming language over 3 months\n\n---\n\n## Advanced: Running as Sub-Agent\n\nFor long loops, spawn as sub-agent so main session stays responsive:\n\n```javascript\nsessions_spawn({\n  task: `cd /path/to/project && ./loop.sh build 20\n         \nSummarize what was implemented when done.`,\n  label: \"ralph-build\",\n  model: \"opus\"\n})\n```\n\nCheck progress:\n```javascript\nsessions_list({ kinds: [\"spawn\"] })\nsessions_history({ label: \"ralph-build\", limit: 5 })\n```\n\n---\n\n## Troubleshooting\n\n### Ralph keeps implementing the same thing\n- Plan is stale → regenerate with `./loop.sh plan`\n- Backpressure missing → add tests that catch duplicates\n\n### Ralph goes in circles\n- Add more specific guardrails to prompts\n- Check if specs are ambiguous\n- Regenerate plan\n\n### Context getting bloated\n- Ensure one task per iteration (check prompt)\n- Keep AGENTS.md under 60 lines\n- Move status/progress to IMPLEMENTATION_PLAN.md, not AGENTS.md\n\n### Tests not running\n- Check AGENTS.md has correct validation commands\n- Ensure backpressure section in prompt references AGENTS.md\n\n---\n\n## Edge Cases\n\n### Projects Without Git\n\nThe loop script expects git for commits and pushes. 
For projects without version control:\n\n**Option 1: Initialize git anyway** (recommended)\n```bash\ngit init\ngit add -A\ngit commit -m \"Initial commit before Ralph\"\n```\n\n**Option 2: Modify the prompts**\n- Remove git-related guardrails from PROMPT_build.md\n- Remove the git push section from loop.sh\n- Use file backups instead: add `cp -r src/ backups/iteration-$ITERATION/` to loop.sh\n\n**Option 3: Use tarball snapshots**\n```bash\n# Add to loop.sh before each iteration:\ntar -czf \"snapshots/pre-iteration-$ITERATION.tar.gz\" src/\n```\n\n### Very Large Codebases\n\nFor codebases with 100K+ lines:\n\n- **Reduce subagent parallelism:** Change \"up to 500 parallel Sonnet subagents\" to \"up to 50\" in prompts\n- **Scope narrowly:** Use focused specs that target specific directories\n- **Add path restrictions:** In AGENTS.md, note which directories are in-scope\n- **Consider workspace splitting:** Treat large modules as separate Ralph projects\n\n### When Claude CLI Isn't Available\n\nThe methodology works with any Claude interface:\n\n**Claude API directly:**\n```bash\n# Replace loop.sh with API calls using curl or a script\ncurl https://api.anthropic.com/v1/messages \\\n  -H \"x-api-key: $ANTHROPIC_API_KEY\" \\\n  -H \"content-type: application/json\" \\\n  -d '{\"model\": \"claude-sonnet-4-20250514\", \"max_tokens\": 8192, \"messages\": [...]}'\n```\n\n**Alternative agents:**\n- **Aider:** `aider --opus --auto-commits`\n- **Continue.dev:** Use with Claude API key\n- **Cursor:** Composer mode with PROMPT files as context\n\nThe key principles (one task per iteration, fresh context, backpressure) apply regardless of tooling.\n\n### Non-Node.js Projects\n\nAdapt AGENTS.md for your stack:\n\n| Stack | Build | Test | Lint |\n|-------|-------|------|------|\n| Python | `pip install -e .` | `pytest` | `ruff .` |\n| Go | `go build ./...` | `go test ./...` | `golangci-lint run` |\n| Rust | `cargo build` | `cargo test` | `cargo clippy` |\n| Ruby | `bundle install` | 
`rspec` | `rubocop` |\n\nAlso update path references in prompts (`src/*` → your source directory).\n\n---\n\n## Learn More\n\n- Geoffrey Huntley: https://ghuntley.com/ralph/\n- Clayton Farr's Playbook: https://github.com/ClaytonFarr/ralph-playbook\n- Geoffrey's Fork: https://github.com/ghuntley/how-to-ralph-wiggum\n\n---\n\n## Credits\n\nBuilt by **Johnathan & Q** — a human-AI dyad.\n\n- Twitter: [@spacepixel](https://x.com/spacepixel)\n- ClawdHub: [clawhub.ai/skills/ralph-loops](https://www.clawhub.ai/skills/ralph-loops)\n","tags":{"latest":"1.0.2"},"stats":{"comments":0,"downloads":5718,"installsAllTime":214,"installsCurrent":25,"stars":1,"versions":3},"createdAt":1769935113980,"updatedAt":1779076602641},"latestVersion":{"version":"1.0.2","createdAt":1769936024664,"changelog":"Added warning: don't block conversation while monitoring loops","license":null},"metadata":null,"owner":{"handle":"qlifebot-coder","userId":"s1756662t56by9r513j283w10d885g2n","displayName":"qlifebot-coder","image":"https://avatars.githubusercontent.com/u/258573345?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1779932787125}}