Install
`openclaw skills install claw-superpowers`

Disciplined, systematic approach to AI-assisted software development: 13 integrated skills covering the full development lifecycle, from brainstorming and planning through TDD, debugging, and code review.
Apply the relevant section based on your current task:
| Task | Section |
|---|---|
| Starting new work | Section 2 (Brainstorming) → Section 3 (Writing Plans) |
| Implementing a plan (this session) | Section 5 (Subagent-Driven Development) |
| Implementing a plan (separate session) | Section 4 (Executing Plans) |
| Multiple independent problems | Section 6 (Dispatching Parallel Agents) |
| Writing or fixing code | Section 7 (Test-Driven Development) |
| Bug, test failure, unexpected behavior | Section 8 (Systematic Debugging) |
| About to claim work is done | Section 9 (Verification Before Completion) |
| After completing work | Section 10 (Code Review) |
| Need isolated workspace | Section 11 (Using Git Worktrees) |
| Ready to merge/integrate | Section 12 (Finishing a Development Branch) |
| Creating new skills | Section 13 (Writing Skills) |
Skill priority: process skills first (brainstorming, debugging), then implementation skills.
Skill types: process skills and implementation skills.
<EXTREMELY-IMPORTANT> IF A SKILL APPLIES TO YOUR TASK, YOU DO NOT HAVE A CHOICE. YOU MUST USE IT.
This is not negotiable. This is not optional. You cannot rationalize your way out of this. </EXTREMELY-IMPORTANT>
Rule: Invoke relevant skills BEFORE any response or action. Even a 1% chance a section might apply means you should check.
Process: before any response or action, check whether a skill applies; if it does, read it and follow it.
Priority order: skill workflows govern HOW you work. User instructions say WHAT, not HOW. "Add X" or "Fix Y" doesn't mean skip workflows.
| Thought | Reality |
|---|---|
| "This is just a simple question" | Questions are tasks. Check for skills. |
| "I need more context first" | Skill check comes BEFORE clarifying questions. |
| "Let me explore the codebase first" | Skills tell you HOW to explore. Check first. |
| "I can check git/files quickly" | Files lack conversation context. Check for skills. |
| "Let me gather information first" | Skills tell you HOW to gather information. |
| "This doesn't need a formal skill" | If a skill exists, use it. |
| "I remember this skill" | Skills evolve. Read current version. |
| "This doesn't count as a task" | Action = task. Check for skills. |
| "The skill is overkill" | Simple things become complex. Use it. |
| "I'll just do this one thing first" | Check BEFORE doing anything. |
| "This feels productive" | Undisciplined action wastes time. Skills prevent this. |
| "I know what that means" | Knowing the concept ≠ using the skill. Invoke it. |
Use before: Any creative work — creating features, building components, adding functionality, or modifying behavior.
<HARD-GATE> Do NOT write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity. </HARD-GATE>

Every project goes through this process. A todo list, a single-function utility, a config change — all of them. "Simple" projects are where unexamined assumptions cause the most wasted work. The design can be short (a few sentences for truly simple projects), but you MUST present it and get approval.
Write the validated design to `docs/plans/YYYY-MM-DD-<topic>-design.md`, commit, then follow Section 3 (Writing Plans).
The terminal state is invoking writing-plans. Do NOT invoke any other implementation skill. The ONLY next step after brainstorming is writing-plans.
Use when: You have a spec or requirements for a multi-step task, before touching code.
Write comprehensive implementation plans with bite-sized tasks. Assume the implementer has zero context. Document exact file paths, complete code, exact commands with expected output.
Announce at start: "I'm using the writing-plans skill to create the implementation plan."
Save plans to: docs/plans/YYYY-MM-DD-<feature-name>.md
Every plan MUST start with:
# [Feature Name] Implementation Plan
> **For agent:** REQUIRED SUB-SKILL: Use Section 4 or Section 5 to implement this plan.
**Goal:** [One sentence]
**Architecture:** [2-3 sentences]
**Tech Stack:** [Key technologies]
Each step is one action (2-5 minutes):
### Task N: [Component Name]
**Files:**
- Create: `exact/path/to/file.py`
- Modify: `exact/path/to/existing.py:123-145`
- Test: `tests/exact/path/to/test.py`
**Step 1:** Write failing test (with complete code)
**Step 2:** Run test, verify fails (exact command + expected output)
**Step 3:** Write minimal implementation (complete code)
**Step 4:** Run test, verify passes (exact command + expected output)
**Step 5:** Commit (exact git commands)
After saving plan, offer:
"Plan complete and saved. Two execution options:
1. Subagent-Driven (this session) — fresh subagent per task, review between tasks, fast iteration → follow Section 5
2. Parallel Session (separate) — open new session, batch execution with checkpoints → follow Section 4
Which approach?"
Use when: You have a written implementation plan to execute in a separate session with review checkpoints.
Announce at start: "I'm using the executing-plans skill to implement this plan."
STOP executing immediately when:
Ask for clarification rather than guessing.
Return to Review (Step 1) when:
Don't force through blockers — stop and ask.
Use when: Executing implementation plans with independent tasks in the current session. Fresh subagent per task + two-stage review.
Core principle: Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration
Provide: full task text, context (where it fits), working directory. Tell them:
Provide: full task requirements, implementer's report. Tell them:
Use code review template (see Section 10). Provide: BASE_SHA, HEAD_SHA, description.
Use when: 2+ independent tasks that can be worked on without shared state or sequential dependencies.
Don't use when:
Good prompts are: focused (one problem domain), self-contained (all context needed), specific about output (what to return).
Scenario: 6 test failures across 3 files after major refactoring
Dispatch: three parallel agents, one per failing file.
Results: All fixes independent, no conflicts, full suite green. 3 problems solved in time of 1.
Use when: Implementing any feature or bugfix, before writing implementation code.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Violating the letter of the rules is violating the spirit of the rules.
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over. No exceptions:
Implement fresh from tests. Period.
Write one minimal test showing what should happen.
Good:
```typescript
test('retries failed operations 3 times', async () => {
  let attempts = 0;
  const operation = () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };
  const result = await retryOperation(operation);
  expect(result).toBe('success');
  expect(attempts).toBe(3);
});
```
Clear name, tests real behavior, one thing.
Bad:
```typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(2);
});
```
Vague name, tests mock not code.
Requirements: One behavior. Clear name. Real code (no mocks unless unavoidable).
MANDATORY. Never skip.
Run test. Confirm: fails (not errors), expected failure message, fails because feature missing.
Write simplest code to pass the test. Don't add features, refactor, or "improve" beyond the test.
Good: Just enough to pass. Bad: Over-engineered with options/backoff/callbacks the test doesn't require (YAGNI).
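For the retry example, "just enough" might look like the following sketch: it passes the test above and nothing more (the 3-attempt limit is hard-coded because that is all the test requires).

```typescript
// Just enough to pass 'retries failed operations 3 times': no options,
// no backoff, no callbacks (the test doesn't ask for them).
async function retryOperation<T>(operation: () => T | Promise<T>): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < 3; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```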
MANDATORY.
Run test. Confirm: passes, other tests still pass, output pristine (no errors, warnings).
After green only: remove duplication, improve names, extract helpers. Keep tests green. Don't add behavior.
"I'll write tests after" — Tests written after code pass immediately. Passing immediately proves nothing: might test wrong thing, might test implementation not behavior, might miss edge cases. Test-first forces you to see the test fail, proving it actually tests something.
"Already manually tested" — Manual testing is ad-hoc. No record, can't re-run, easy to forget cases. Automated tests are systematic.
"Deleting X hours is wasteful" — Sunk cost fallacy. The time is already gone. Keeping unverified code is technical debt.
"TDD is dogmatic" — TDD IS pragmatic: finds bugs before commit, prevents regressions, documents behavior, enables refactoring. "Pragmatic" shortcuts = debugging in production = slower.
Bug: Empty email accepted
RED:
```typescript
test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});
```
Verify RED: FAIL — expected 'Email required', got undefined ✓
GREEN:
```typescript
function submitForm(data: FormData) {
  if (!data.email?.trim()) return { error: 'Email required' };
  // ...
}
```
Verify GREEN: PASS ✓ → REFACTOR → extract validation if needed.
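When the REFACTOR step does apply, extracting the validation might look like this sketch (the `validateEmail` helper name and the simplified signature are illustrative, not part of the example above):

```typescript
// Extracted after green; the 'rejects empty email' test stays passing.
function validateEmail(email: string | undefined): string | null {
  return email?.trim() ? null : 'Email required';
}

function submitForm(data: { email?: string }) {
  const error = validateEmail(data.email);
  if (error) return { error };
  // ... rest of submission
}
```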
Before marking work complete:
Can't check all boxes? You skipped TDD. Start over.
| Problem | Solution |
|---|---|
| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. |
| Test too complicated | Design too complicated. Simplify interface. |
| Must mock everything | Code too coupled. Use dependency injection. |
| Test setup huge | Extract helpers. Still complex? Simplify design. |
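For the "must mock everything" row, a minimal dependency-injection sketch (the `EmailSender` interface and `notifyUser` function are hypothetical names for illustration):

```typescript
interface EmailSender {
  send(to: string, body: string): Promise<void>;
}

// The dependency is a parameter, so tests can pass a real in-memory
// implementation instead of patching modules with a mocking framework.
async function notifyUser(email: string, sender: EmailSender): Promise<void> {
  await sender.send(email, 'Your build finished.');
}

test('notifies the user by email', async () => {
  const sent: Array<{ to: string; body: string }> = [];
  const fake: EmailSender = {
    async send(to, body) { sent.push({ to, body }); },
  };
  await notifyUser('a@example.com', fake);
  expect(sent).toEqual([{ to: 'a@example.com', body: 'Your build finished.' }]);
});
```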
When adding mocks or test utilities, avoid these pitfalls:
| Excuse | Reality |
|---|---|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is debt. |
| "Keep as reference" | You'll adapt it. That's testing after. Delete means delete. |
| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD faster than debugging. |
| "Existing code has no tests" | You're improving it. Add tests for existing code. |
| "It's about spirit not ritual" | Violating the letter IS violating the spirit. |
Code before test. Test after implementation. Test passes immediately. Can't explain why test failed. Rationalizing "just this once." "I already manually tested it." "Keep as reference." "This is different because..."
All mean: Delete code. Start over with TDD.
Use when: Any bug, test failure, or unexpected behavior — before proposing fixes.
Core principle: ALWAYS find root cause before attempting fixes. Symptom fixes are failure.
Violating the letter of this process is violating the spirit of debugging.
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
If you haven't completed Phase 1, you cannot propose fixes.
Use this ESPECIALLY when:
BEFORE attempting ANY fix:
Read error messages carefully — don't skip. Read stack traces completely. Note line numbers, file paths, error codes.
Reproduce consistently — exact steps, every time. Not reproducible → gather more data, don't guess.
Check recent changes — git diff, recent commits, new dependencies, config changes, environmental differences.
Gather evidence in multi-component systems — Log data at each component boundary. Run once to see WHERE it breaks. Then investigate that component.
Example: For CI → build → signing pipeline:
```bash
# Log at each layer boundary:
# Layer 1 (workflow): echo "IDENTITY: ${IDENTITY:+SET}${IDENTITY:-UNSET}"
# Layer 2 (build): env | grep IDENTITY
# Layer 3 (signing): security find-identity -v
# Reveals: at which layer the propagation fails
```
Trace data flow — Where does bad value originate? Trace up the call stack to the source. Fix at source, not symptom.
Root cause tracing technique: Observe symptom → find immediate cause → ask "what called this?" → keep tracing up → find original trigger → fix at source.
Example: git init in wrong directory → cwd: '' → createWorktree('') → Session.create() → test accessed tempDir before beforeEach → root cause: top-level variable initialization.
Create failing test reproducing the bug (Section 7: TDD)
Implement single fix addressing root cause
Verify: test passes, no other tests broken
If 3+ fixes failed: STOP. Question the architecture. Discuss with your human partner.
Pattern indicating architectural problem:
This is NOT a failed hypothesis — this is a wrong architecture.
Example: use `waitFor(() => condition())` instead of `setTimeout(check, 500)`; wait on a condition, not a fixed delay (sketch below, after the table).

| Excuse | Reality |
|---|---|
| "Issue is simple" | Simple issues have root causes too. |
| "Emergency, no time" | Systematic is FASTER than thrashing. |
| "Just try this first" | First fix sets the pattern. Do it right. |
| "I see the problem" | Seeing symptoms ≠ understanding root cause. |
| "Multiple fixes saves time" | Can't isolate what worked. Causes new bugs. |
| "One more fix attempt" (after 2+) | 3+ failures = architectural problem. Question pattern. |
| "Reference too long, I'll adapt" | Partial understanding guarantees bugs. Read completely. |
ALL of these mean: STOP. Return to Phase 1.
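The condition-based `waitFor` mentioned above, as a minimal sketch (a hypothetical helper, not any specific library's API):

```typescript
// Resolves once condition() returns true; fails loudly on timeout instead
// of silently racing a fixed 500 ms delay.
async function waitFor(
  condition: () => boolean,
  timeoutMs = 5000,
  pollMs = 20,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!condition()) {
    if (Date.now() > deadline) {
      throw new Error('waitFor: condition not met within timeout');
    }
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}
```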
Use when: About to claim work is complete, fixed, or passing — before committing or creating PRs.
Core principle: Evidence before claims, always.
Violating the letter of this rule is violating the spirit of this rule.
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
If you haven't run the verification command in this message, you cannot claim it passes.
Before claiming ANY status:
1. IDENTIFY: What command proves this claim?
2. RUN: Execute the FULL command (fresh, complete)
3. READ: Full output, check exit code, count failures
4. VERIFY: Does output confirm the claim?
- If NO: State actual status with evidence
- If YES: State claim WITH evidence
5. ONLY THEN: Make the claim
Skip any step = lying, not verifying
| Claim | Requires | NOT Sufficient |
|---|---|---|
| Tests pass | Test output: 0 failures | Previous run, "should pass" |
| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
| Build succeeds | Build: exit 0 | Linter passing |
| Bug fixed | Test original symptom | Code changed, assumed fixed |
| Regression test works | Red-green cycle verified | Test passes once |
| Agent completed | VCS diff shows changes | Agent reports "success" |
| Requirements met | Line-by-line checklist | Tests passing |
Tests:
✅ [Run test command] [See: 34/34 pass] "All tests pass"
❌ "Should pass now" / "Looks correct"
Regression tests (TDD Red-Green):
✅ Write → Run (pass) → Revert fix → Run (MUST FAIL) → Restore → Run (pass)
❌ "I've written a regression test" (without red-green verification)
Agent delegation:
✅ Agent reports success → Check VCS diff → Verify changes → Report actual state
❌ Trust agent report
Run the command. Read the output. THEN claim the result.
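One way to make that mechanical, as a sketch of scripted verification in Node (assumes `npm test` is the project's full test command): run the command fresh, read the output, check the exit code.

```typescript
import { spawnSync } from 'node:child_process';

// Fresh, complete run; no reuse of a previous run's result.
const run = spawnSync('npm', ['test'], { encoding: 'utf8' });
console.log(run.stdout);   // READ the full output, don't skim
console.log(run.stderr);
if (run.status !== 0) {
  throw new Error(`Verification failed (exit ${run.status}); cannot claim tests pass`);
}
```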
| Excuse | Reality |
|---|---|
| "Should work now" | RUN the verification |
| "I'm confident" | Confidence ≠ evidence |
| "Just this once" | No exceptions |
| "Agent said success" | Verify independently |
| "Partial check is enough" | Partial proves nothing |
| "Different words so rule doesn't apply" | Spirit over letter |
Use when: Completing tasks, implementing major features, before merging, or when receiving review feedback.
Mandatory after: each task in subagent-driven development, major features, before merge.
How to request:
`BASE_SHA=$(git rev-parse HEAD~1)`, `HEAD_SHA=$(git rev-parse HEAD)`

Code Reviewer Template:
You are reviewing code changes for production readiness.
Review {WHAT_WAS_IMPLEMENTED} against {PLAN_OR_REQUIREMENTS}.
Git range: {BASE_SHA}..{HEAD_SHA}
Check: code quality, architecture, testing, requirements, production readiness.
Review Checklist:
- Code: separation of concerns, error handling, type safety, DRY, edge cases
- Architecture: design decisions, scalability, performance, security
- Testing: tests test logic not mocks, edge cases, integration tests, all passing
- Requirements: all met, matches spec, no scope creep, breaking changes documented
Output format:
### Strengths — what's well done (be specific, file:line)
### Issues
#### Critical (Must Fix) — bugs, security, data loss
#### Important (Should Fix) — architecture, missing features, test gaps
#### Minor (Nice to Have) — style, optimization
For each: file:line, what's wrong, why it matters, how to fix
### Assessment — Ready to merge? [Yes/No/With fixes]
Response pattern: READ → UNDERSTAND → VERIFY → EVALUATE → RESPOND → IMPLEMENT
Forbidden responses — NEVER say: "You're absolutely right!", "Thanks for catching that!", or other reflexive agreement.
Instead: Restate the technical requirement, ask clarifying questions, push back if wrong, or just fix it.
Handling unclear feedback:
IF any item is unclear:
STOP — do not implement anything yet
ASK for clarification on ALL unclear items
WHY: Items may be related. Partial understanding = wrong implementation.
Implementation order for multi-item feedback:
YAGNI check: If reviewer suggests "implementing properly," grep codebase for actual usage. If unused → "This endpoint isn't called. Remove it (YAGNI)?"
When to push back:
When feedback IS correct:
✅ "Fixed. [Brief description]"
✅ [Just fix it in code]
❌ "You're absolutely right!" / "Thanks for catching that!"
Actions speak. Just fix it.
Use when: Starting feature work that needs isolation or before executing implementation plans.
Announce at start: "I'm using the using-git-worktrees skill to set up an isolated workspace."
Directory selection, in priority order:
1. Existing directory: `ls -d .worktrees 2>/dev/null` (preferred) or `ls -d worktrees 2>/dev/null`. If both exist, `.worktrees` wins.
2. Documented preference: `grep -i "worktree.*director" AGENTS.md 2>/dev/null`
3. Default: `.worktrees/` (project-local, hidden) or `~/.config/superpowers/worktrees/<project>/` (global)

For project-local directories, verify the directory is ignored before creating the worktree: `git check-ignore -q .worktrees 2>/dev/null`. If NOT ignored: add it to `.gitignore` first.
Why critical: prevents accidentally committing worktree contents to the repository.

Create the worktree, then install dependencies for the project type:

```bash
project=$(basename "$(git rev-parse --show-toplevel)")
git worktree add "$path" -b "$BRANCH_NAME"
if [ -f package.json ]; then npm install; fi
if [ -f Cargo.toml ]; then cargo build; fi
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f pyproject.toml ]; then poetry install; fi
if [ -f go.mod ]; then go mod download; fi
```
Use when: Implementation is complete, all tests pass, ready to integrate.
Announce at start: "I'm using the finishing-a-development-branch skill to complete this work."
Determine the base branch: `git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null`

Implementation complete. What would you like to do?
1. Merge back to <base-branch> locally
2. Push and create a Pull Request
3. Keep the branch as-is
4. Discard this work
| Option | Merge | Push | Keep Worktree | Cleanup Branch |
|---|---|---|---|---|
| 1. Merge locally | Yes | - | - | Yes |
| 2. Create PR | - | Yes | Yes | - |
| 3. Keep as-is | - | - | Yes | - |
| 4. Discard | - | - | - | Yes (force) |
Option 1 (Merge): Checkout base → pull latest → merge feature → verify tests on merged result → delete feature branch → cleanup worktree
Option 2 (Push/PR): Push branch → gh pr create with summary and test plan → cleanup worktree
Option 4 (Discard): Require typed "discard" confirmation. Show what will be deleted (branch, commits, worktree). Then: checkout base → git branch -D → cleanup worktree
Use when: Creating new skills, editing existing skills, or verifying skills work.
A reusable reference guide for proven techniques, patterns, or tools. NOT narratives about how you solved a problem once.
Skills are: reusable techniques, patterns, tools, reference guides.
Skills are NOT: one-off solutions, project-specific conventions, mechanical constraints (automate those instead).
NO SKILL WITHOUT A FAILING TEST FIRST
Test with subagent pressure scenarios. Watch agents fail without skill (RED). Write minimal skill (GREEN). Close loopholes (REFACTOR).
```yaml
---
name: skill-name-with-hyphens
description: "Use when [specific triggering conditions]"
---
```
Sections: Overview → When to Use → Core Pattern → Quick Reference → Common Mistakes
CRITICAL: Description = When to Use, NOT What the Skill Does
Agents read description to decide which skills to load. The description should ONLY describe triggering conditions. Do NOT summarize the skill's process or workflow.
Why: Testing revealed that when a description summarizes workflow, agents follow the description shortcut instead of reading the full skill. A description saying "code review between tasks" caused an agent to do ONE review, even though the skill specified TWO reviews (spec + quality).
```yaml
# ❌ BAD: Summarizes workflow — agent may follow this instead of reading skill
description: Use when executing plans - dispatches subagent per task with code review

# ✅ GOOD: Just triggering conditions
description: Use when executing implementation plans with independent tasks
```
Naming: name skills by the technique or problem, not the implementation: `condition-based-waiting`, not `async-test-helpers`; `root-cause-tracing`, not `debugging-techniques`. Use gerund forms like `creating-skills`, `testing-skills`.

Skills that enforce discipline need to resist rationalization:
RED: Create pressure scenarios → run WITHOUT skill → document baseline failures (exact rationalizations agents use).
GREEN: Write skill addressing specific failures → run WITH skill → verify compliance.
REFACTOR: Find new rationalizations → add counters → re-test until bulletproof.
| Excuse | Reality |
|---|---|
| "Skill is obviously clear" | Clear to you ≠ clear to agents. Test it. |
| "It's just a reference" | References can have gaps. Test retrieval. |
| "Testing is overkill" | Untested skills always have issues. 15 min saves hours. |
| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. |
| "No time to test" | Deploying untested skill wastes more time fixing later. |