{"skill":{"slug":"runesleo-systematic-debugging","displayName":"Systematic Debugging","summary":"Four-phase debugging framework that ensures root cause investigation before attempting fixes. Never jump to solutions.","description":"---\nname: Systematic Debugging\ndescription: Four-phase debugging framework that ensures root cause investigation before attempting fixes. Never jump to solutions.\nwhen_to_use: when encountering any bug, test failure, or unexpected behavior, before proposing fixes\nversion: 3.0.0\nlanguages: all\nattribution: Based on obra/superpowers-skills (MIT License). Phase 0 and additional patterns added.\n---\n\n# Systematic Debugging\n\n## Overview\n\nRandom fixes waste time and create new bugs. Quick patches mask underlying issues.\n\n**Core principle:** ALWAYS find root cause before attempting fixes. Symptom fixes are failure.\n\n**Violating the letter of this process is violating the spirit of debugging.**\n\n## The Iron Law\n\n```\nNO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST\nNO INVESTIGATION WITHOUT CONTEXT RECALL FIRST\n```\n\nIf you haven't completed Phase 0, you cannot proceed to Phase 1.\nIf you haven't completed Phase 1, you cannot propose fixes.\n\n## When to Use\n\nUse for ANY technical issue:\n- Test failures\n- Bugs in production\n- Unexpected behavior\n- Performance problems\n- Build failures\n- Integration issues\n\n**Use this ESPECIALLY when:**\n- Under time pressure (emergencies make guessing tempting)\n- \"Just one quick fix\" seems obvious\n- You've already tried multiple fixes\n- Previous fix didn't work\n- You don't fully understand the issue\n\n**Don't skip when:**\n- Issue seems simple (simple bugs have root causes too)\n- You're in a hurry (rushing guarantees rework)\n- Manager wants it fixed NOW (systematic is faster than thrashing)\n\n## The Five Phases\n\nYou MUST complete each phase before proceeding to the next.\n\n### Phase 0: Context Recall (MANDATORY FIRST STEP)\n\n**BEFORE doing ANYTHING else:**\n\n1. **Extract Keywords from Error**\n   - What's the error type? (OOM, timeout, connection, type error...)\n   - What component? (server, browser, API, database...)\n   - What area of the codebase?\n\n2. **Search for Prior Knowledge**\n   - Check project docs, MEMORY files, or past conversations\n   - Search codebase for similar error patterns: `grep -r \"ErrorType\" .`\n   - Check git log for related recent changes: `git log --oneline -20`\n\n3. **Review Results**\n   - Found relevant experience? -> **Apply directly, skip to Phase 4**\n   - Found partial match? -> Use as starting point for Phase 1\n   - Nothing found? -> Proceed to Phase 1, **remember to record solution later**\n\n4. **Output Format**\n   ```\n   Context Recall:\n   - Query: \"xxx\"\n   - Found: [description of related knowledge]\n   - Action: [apply experience / continue investigation / no match]\n   ```\n\n**VIOLATION:** Proceeding to Phase 1 without Context Recall output = process failure.\n\n---\n\n### Phase 1: Root Cause Investigation\n\n**BEFORE attempting ANY fix:**\n\n1. **Read Error Messages Carefully**\n   - Don't skip past errors or warnings\n   - They often contain the exact solution\n   - Read stack traces completely\n   - Note line numbers, file paths, error codes\n\n2. **Reproduce Consistently**\n   - Can you trigger it reliably?\n   - What are the exact steps?\n   - Does it happen every time?\n   - If not reproducible -> gather more data, don't guess\n\n3. **Check Recent Changes**\n   - What changed that could cause this?\n   - Git diff, recent commits\n   - New dependencies, config changes\n   - Environmental differences\n\n4. **Gather Evidence in Multi-Component Systems**\n\n   **WHEN system has multiple components (CI -> build -> signing, API -> service -> database):**\n\n   **BEFORE proposing fixes, add diagnostic instrumentation:**\n   ```\n   For EACH component boundary:\n     - Log what data enters component\n     - Log what data exits component\n     - Verify environment/config propagation\n     - Check state at each layer\n\n   Run once to gather evidence showing WHERE it breaks\n   THEN analyze evidence to identify failing component\n   THEN investigate that specific component\n   ```\n\n5. **Trace Data Flow**\n   - Where does bad value originate?\n   - What called this with bad value?\n   - Keep tracing up until you find the source\n   - Fix at source, not at symptom\n\n### Phase 2: Pattern Analysis\n\n**Find the pattern before fixing:**\n\n1. **Find Working Examples**\n   - Locate similar working code in same codebase\n   - What works that's similar to what's broken?\n\n2. **Compare Against References**\n   - If implementing pattern, read reference implementation COMPLETELY\n   - Don't skim - read every line\n   - Understand the pattern fully before applying\n\n3. **Identify Differences**\n   - What's different between working and broken?\n   - List every difference, however small\n   - Don't assume \"that can't matter\"\n\n4. **Understand Dependencies**\n   - What other components does this need?\n   - What settings, config, environment?\n   - What assumptions does it make?\n\n### Phase 3: Hypothesis and Testing\n\n**Scientific method:**\n\n1. **Form Single Hypothesis**\n   - State clearly: \"I think X is the root cause because Y\"\n   - Write it down\n   - Be specific, not vague\n\n2. **Test Minimally**\n   - Make the SMALLEST possible change to test hypothesis\n   - One variable at a time\n   - Don't fix multiple things at once\n\n3. **Verify Before Continuing**\n   - Did it work? Yes -> Phase 4\n   - Didn't work? Form NEW hypothesis\n   - DON'T add more fixes on top\n\n4. **When You Don't Know**\n   - Say \"I don't understand X\"\n   - Don't pretend to know\n   - Ask for help\n   - Research more\n\n### Phase 4: Implementation\n\n**Fix the root cause, not the symptom:**\n\n1. **Create Failing Test Case**\n   - Simplest possible reproduction\n   - Automated test if possible\n   - One-off test script if no framework\n   - MUST have before fixing\n\n2. **Implement Single Fix**\n   - Address the root cause identified\n   - ONE change at a time\n   - No \"while I'm here\" improvements\n   - No bundled refactoring\n\n3. **Verify Fix**\n   - Test passes now?\n   - No other tests broken?\n   - Issue actually resolved?\n\n4. **If Fix Doesn't Work**\n   - STOP\n   - Count: How many fixes have you tried?\n   - If < 3: Return to Phase 1, re-analyze with new information\n   - **If >= 3: STOP and question the architecture (step 5 below)**\n   - DON'T attempt Fix #4 without architectural discussion\n\n5. **If 3+ Fixes Failed: Question Architecture**\n\n   **Pattern indicating architectural problem:**\n   - Each fix reveals new shared state/coupling/problem in different place\n   - Fixes require \"massive refactoring\" to implement\n   - Each fix creates new symptoms elsewhere\n\n   **STOP and question fundamentals:**\n   - Is this pattern fundamentally sound?\n   - Are we \"sticking with it through sheer inertia\"?\n   - Should we refactor architecture vs. continue fixing symptoms?\n\n   **Discuss with the user before attempting more fixes.**\n\n## Red Flags - STOP and Follow Process\n\nIf you catch yourself thinking:\n- \"Quick fix for now, investigate later\"\n- \"Just try changing X and see if it works\"\n- \"Add multiple changes, run tests\"\n- \"Skip the test, I'll manually verify\"\n- \"It's probably X, let me fix that\"\n- \"I don't fully understand but this might work\"\n- \"Pattern says X but I'll adapt it differently\"\n- \"Here are the main problems: [lists fixes without investigation]\"\n- Proposing solutions before tracing data flow\n- **\"One more fix attempt\" (when already tried 2+)**\n- **Each fix reveals new problem in different place**\n\n**ALL of these mean: STOP. Return to Phase 1.**\n\n**If 3+ fixes failed:** Question the architecture (see Phase 4.5)\n\n## Common Rationalizations\n\n| Excuse | Reality |\n|--------|---------|\n| \"Issue is simple, don't need process\" | Simple issues have root causes too. Process is fast for simple bugs. |\n| \"Emergency, no time for process\" | Systematic debugging is FASTER than guess-and-check thrashing. |\n| \"Just try this first, then investigate\" | First fix sets the pattern. Do it right from the start. |\n| \"I'll write test after confirming fix works\" | Untested fixes don't stick. Test first proves it. |\n| \"Multiple fixes at once saves time\" | Can't isolate what worked. Causes new bugs. |\n| \"Reference too long, I'll adapt the pattern\" | Partial understanding guarantees bugs. Read it completely. |\n| \"I see the problem, let me fix it\" | Seeing symptoms != understanding root cause. |\n| \"One more fix attempt\" (after 2+ failures) | 3+ failures = architectural problem. Question pattern, don't fix again. |\n\n## Quick Reference\n\n| Phase | Key Activities | Success Criteria |\n|-------|---------------|------------------|\n| **0. Context Recall** | Extract keywords, search prior knowledge | Output recall summary |\n| **1. Root Cause** | Read errors, reproduce, check changes, gather evidence | Understand WHAT and WHY |\n| **2. Pattern** | Find working examples, compare | Identify differences |\n| **3. Hypothesis** | Form theory, test minimally | Confirmed or new hypothesis |\n| **4. Implementation** | Create test, fix, verify | Bug resolved, tests pass |\n\n## Real-World Impact\n\nFrom debugging sessions:\n- Systematic approach: 15-30 minutes to fix\n- Random fixes approach: 2-3 hours of thrashing\n- First-time fix rate: 95% vs 40%\n- New bugs introduced: Near zero vs common\n","tags":{"debugging":"3.0.0","developer-tools":"3.0.0","latest":"3.0.0","workflow":"3.0.0"},"stats":{"comments":0,"downloads":10079,"installsAllTime":73,"installsCurrent":71,"stars":9,"versions":1},"createdAt":1772890916061,"updatedAt":1778491764944},"latestVersion":{"version":"3.0.0","createdAt":1772890916061,"changelog":"- Major update: Introduces a comprehensive four-phase (plus pre-phase) debugging process focused on root cause analysis before any fixes.\n- Adds a mandatory Phase 0: Context Recall—explicitly requires gathering prior knowledge and analyzing project context before investigating or proposing fixes.\n- Clearly defines strict phase order (Context Recall → Root Cause Investigation → Pattern Analysis → Hypothesis & Testing → Implementation) with rules to prevent skipping steps or guessing fixes.\n- Expands guidance for multi-component systems, systematic data/log gathering, and evidence-based analysis.\n- Includes red flags and common rationalizations to help users avoid pitfalls and maintain disciplined debugging.\n- Emphasizes test creation before fixing and sets hard limits on repeated failed fixes (triggers architectural review after three non-working attempts).","license":null},"metadata":null,"owner":{"handle":"runesleo","userId":"s17610fgthy0xhqwnxhhxv230x884fvk","displayName":"runesleo","image":"https://avatars.githubusercontent.com/u/245669230?v=4"},"moderation":null}