Hepha
Runs autonomous iterative delivery loops for coding tasks using plan -> execute -> check -> review -> commit. Use when the user asks for hepha mode, autopilo...
Purpose
Run each requirement as multiple small, autonomous loops:
plan -> execute -> check -> review -> commit
Keep looping with minimal user intervention until the backlog is done or a stop condition is hit.
Activation
Activate only when the user explicitly asks for:
- hepha / autopilot / autonomous loop / unattended iteration
- continuous plan-execute-check-review-commit flow
- small-step commits until a larger requirement is completed
If the user did not explicitly request hepha, do not force this mode.
Non-Negotiable Operating Rules
- One loop = one smallest shippable sub-task.
- No commit before both engineering checks and browser review pass.
- Every loop must update progress artifacts under .autopilot/.
- If blocked, re-plan automatically; ask user only when truly necessary.
- Prefer minimal diff and avoid unrelated files.
Required Working Artifacts
Create and maintain these files in the project's .autopilot/ directory:
- .autopilot/backlog.md - task graph and states (todo, doing, blocked, done)
- .autopilot/progress.md - per-loop execution log and evidence
- .autopilot/decision-log.md - research and technical decisions
Templates: Use the template files from templates/ in this skill directory as starting points:
- templates/backlog.md
- templates/progress.md
- templates/decision-log.md
If working files do not exist, copy from templates or create them before the first loop.
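The bootstrap step above could be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the function name `ensure_artifacts`, its parameters, and the one-line placeholder written when a template is missing are all assumptions.

```python
import shutil
from pathlib import Path

# File names taken from the list of required working artifacts.
ARTIFACTS = ("backlog.md", "progress.md", "decision-log.md")


def ensure_artifacts(project_dir: Path, templates_dir: Path) -> list[str]:
    """Create any missing .autopilot/ working file and return the names created.

    Hypothetical helper: existing files are never overwritten, so an
    in-progress run keeps its backlog and logs.
    """
    target = project_dir / ".autopilot"
    target.mkdir(parents=True, exist_ok=True)
    created = []
    for name in ARTIFACTS:
        dest = target / name
        if dest.exists():
            continue
        template = templates_dir / name
        if template.is_file():
            shutil.copyfile(template, dest)
        else:
            # Assumed fallback when no template is available.
            dest.write_text(f"# {name}\n", encoding="utf-8")
        created.append(name)
    return created
```

Calling it again on the same project returns an empty list, which makes it safe to run at the start of every loop.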
Loop Protocol
Execute the following phases in order for each loop.
1) PLAN (Enhanced)
Goal: pick exactly one ready sub-task from the backlog.
Steps:
Step 0.5 - Schema Validation (execute every PLAN):
Verify each task in backlog.md contains:
- ✅ id (format: TASK-XXX or numeric)
- ✅ title (action statement)
- ✅ state (todo|doing|blocked|done)
- ✅ depends_on (array, can be empty)
- ✅ acceptance (testable pass conditions)
- ✅ risk (low|medium|high)
- ✅ files_hint (expected files, optional)
Missing fields → complete before continuing
Circular dependencies → detect and report error
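As a rough sketch of the validation step, the checks above might look like this once the backlog has been parsed. The on-disk layout of backlog.md is not specified here, so the example assumes tasks are already loaded as dictionaries; `validate_backlog` and `find_cycle` are hypothetical names.

```python
from typing import Optional

# Required fields from the schema list; files_hint is optional and not enforced.
REQUIRED_FIELDS = ("id", "title", "state", "depends_on", "acceptance", "risk")
STATES = {"todo", "doing", "blocked", "done"}
RISKS = {"low", "medium", "high"}


def validate_backlog(tasks: list[dict]) -> list[str]:
    """Return human-readable schema problems; an empty list means valid."""
    problems = []
    for task in tasks:
        label = task.get("id", "<no id>")
        for name in REQUIRED_FIELDS:
            if name not in task:
                problems.append(f"{label}: missing {name}")
        if "state" in task and task["state"] not in STATES:
            problems.append(f"{label}: invalid state {task['state']!r}")
        if "risk" in task and task["risk"] not in RISKS:
            problems.append(f"{label}: invalid risk {task['risk']!r}")
    return problems


def find_cycle(tasks: list[dict]) -> Optional[list[str]]:
    """Depth-first search over depends_on; return one cycle as ids, or None."""
    deps = {t["id"]: list(t.get("depends_on", [])) for t in tasks}
    unvisited, in_progress, finished = 0, 1, 2
    color = {tid: unvisited for tid in deps}
    stack: list[str] = []

    def visit(tid: str) -> Optional[list[str]]:
        color[tid] = in_progress
        stack.append(tid)
        for dep in deps[tid]:
            # Unknown ids are treated as finished, so they never form a cycle.
            state = color.get(dep, finished)
            if state == in_progress:
                return stack[stack.index(dep):] + [dep]
            if state == unvisited:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[tid] = finished
        return None

    for tid in deps:
        if color[tid] == unvisited:
            cycle = visit(tid)
            if cycle:
                return cycle
    return None
```

Returning the cycle as a path of ids (first id repeated at the end) gives the error report something concrete to show.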
Step 0 - Auto-Decomposition (if backlog.md missing or empty):
- Analyze original requirement to identify core functional modules
- Apply decomposition patterns (see references/decomposition-patterns.md):
  - Vertical slicing: split by user value path (UI → API → Data)
  - Risk-first: high-risk dependencies first
  - Independence: each task testable and committable separately
- Generate task graph:
  - Assign unique ID to each sub-task (TASK-001, TASK-002...)
  - Identify dependencies (depends_on)
  - Assess risk level (low/medium/high)
  - Define acceptance criteria (acceptance)
- Output to .autopilot/backlog.md
Step 1 - Normalize and Build Task Graph:
- Normalize current requirement into:
  - Goal
  - Definition of done
  - Constraints
  - Out of scope
- Build/refresh task graph:
  - Decompose Epic -> Tasks
  - For each task, define input/output, acceptance, dependencies, risk
- Select one task from ready queue (all dependencies done).
- Write loop plan into .autopilot/progress.md:
  - selected task
  - expected files
  - expected checks
  - expected browser validation path
- Update progress visualization section
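Selecting from the ready queue can be illustrated with a small sketch. It assumes tasks are dictionaries with the schema fields described earlier. The tie-break (highest risk first) is borrowed from the risk-first decomposition pattern and is an assumption, since no ordering among ready tasks is prescribed; `ready_tasks` and `select_next` are hypothetical names.

```python
from typing import Optional

# Assumed ordering: lower number is picked first.
RISK_ORDER = {"high": 0, "medium": 1, "low": 2}


def ready_tasks(tasks: list[dict]) -> list[dict]:
    """Tasks in state 'todo' whose dependencies are all 'done'."""
    done = {t["id"] for t in tasks if t["state"] == "done"}
    return [
        t for t in tasks
        if t["state"] == "todo" and all(dep in done for dep in t["depends_on"])
    ]


def select_next(tasks: list[dict]) -> Optional[dict]:
    """Pick exactly one ready task, or None when the ready queue is empty."""
    ready = ready_tasks(tasks)
    if not ready:
        return None
    # min() keeps backlog order among tasks with equal risk.
    return min(ready, key=lambda t: RISK_ORDER.get(t["risk"], 1))
```

A `None` result with blocked tasks still in the backlog corresponds to the first stop condition rather than to completion.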
2) RESEARCH (explicit trigger conditions)
Goal: make informed decisions with live evidence.
Decision Matrix - Research Required?
| Scenario Category | Specific Situation | Research Required |
|---|---|---|
| New Technology | Using library/framework not in project | ✅ Yes |
| Architecture Change | Affects module boundaries or data flow | ✅ Yes |
| Implementation Uncertainty | 2+ viable options with >30% difference | ✅ Yes |
| Tool Selection | MCP/Playwright/Puppeteer/etc. choice | ✅ Yes |
| CRUD Operations | Standard CRUD | ❌ No |
| Bug Fixes | Clear error fix | ❌ No |
| Style Adjustments | CSS/style class modifications | ❌ No |
Research Quality Requirements:
- Compare at least 2 options
- Prefer official documentation and source code
- Record: option summary → evidence links → tradeoffs → decision rationale
Record in .autopilot/decision-log.md:
- option A / B summary
- evidence links or source notes
- tradeoffs
- final decision and rationale
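One way to keep decision-log entries uniform is to render them from a small data structure. The sketch below is illustrative only: the `Option` class, the `render_decision` function, and the Markdown layout it emits are assumptions, not the format of templates/decision-log.md.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    summary: str
    evidence: list[str]  # evidence links or source notes
    tradeoffs: str


def render_decision(title: str, options: list[Option], decision: str, rationale: str) -> str:
    """Render one decision-log entry; rejects research with fewer than 2 options."""
    if len(options) < 2:
        raise ValueError("research must compare at least 2 options")
    lines = [f"## {title}"]
    for opt in options:
        lines.append(f"- Option {opt.name}: {opt.summary}")
        for item in opt.evidence:
            lines.append(f"  - evidence: {item}")
        lines.append(f"  - tradeoffs: {opt.tradeoffs}")
    lines.append(f"- Decision: {decision}")
    lines.append(f"- Rationale: {rationale}")
    return "\n".join(lines) + "\n"
```

The returned string can be appended to .autopilot/decision-log.md as-is.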
3) EXECUTE
Goal: implement the chosen sub-task with minimal blast radius.
Rules:
- Keep changes focused on required files only.
- Avoid speculative refactors.
- Keep functions small and reusable.
- Add concise comments only where logic is non-obvious.
4) CHECK
Goal: verify engineering quality.
Run all relevant project checks (examples):
- lint
- tests
- build/typecheck
If any check fails:
- Capture failure details in .autopilot/progress.md.
- Fix the root cause.
- Re-run checks.
- Repeat until pass or retry limit is reached.
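The check-fix-retry cycle above can be sketched with the standard library. The retry limit is not given a number here, so `max_attempts=2` is an assumed default; `run_checks` and the `fix` callback are hypothetical, and the commands themselves depend on the project.

```python
import subprocess
from typing import Callable, Optional


def run_checks(
    commands: list[list[str]],
    max_attempts: int = 2,  # assumed retry limit
    fix: Optional[Callable[[list[dict]], None]] = None,
) -> tuple[bool, list[list[dict]]]:
    """Run every check; on failure call fix(failures) and retry.

    Returns (passed, history), where history holds the failures of each
    attempt so they can be recorded in .autopilot/progress.md.
    """
    history: list[list[dict]] = []
    for attempt in range(1, max_attempts + 1):
        failures = []
        for cmd in commands:
            result = subprocess.run(cmd, capture_output=True, text=True)
            if result.returncode != 0:
                failures.append({
                    "cmd": cmd,
                    "code": result.returncode,
                    # Keep only the tail of the output for the log.
                    "output": (result.stdout + result.stderr)[-2000:],
                })
        history.append(failures)
        if not failures:
            return True, history
        if fix is not None and attempt < max_attempts:
            fix(failures)
    return False, history
```

All commands run on every attempt, even after one fails, so a single pass surfaces every failing check instead of only the first.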
5) REVIEW (browser and UX evidence required for UI/flow changes)
Goal: verify behavior from a user perspective, not only compile success.
For UI/interaction changes, use MCP browser tools and/or Playwright to validate:
- page load success
- key interaction path works
- expected text/element state is visible
- major regressions are absent
Attach review evidence to .autopilot/progress.md:
- interaction steps
- observed result
- screenshots/snapshots when relevant
6) COMMIT
Commit only when:
- checks passed
- review passed
- acceptance criteria for selected task are met
Commit policy:
- one loop, one commit
- conventional commit format
- message explains purpose/why, not only what
Update task status in .autopilot/backlog.md to done and append commit hash in progress log.
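The commit gate and message format can be expressed as two small predicates. This is a sketch under assumptions: the type list follows common Conventional Commits tooling conventions rather than anything stated here, and `can_commit` and `is_conventional` are hypothetical names.

```python
import re

# Assumed type list; the Conventional Commits spec itself only mandates
# feat and fix, the rest are widely used conventions.
CONVENTIONAL = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([\w./-]+\))?!?: \S.*"
)


def can_commit(checks_passed: bool, review_passed: bool, acceptance_met: bool) -> bool:
    """All three gates must hold before a loop is allowed to commit."""
    return checks_passed and review_passed and acceptance_met


def is_conventional(subject: str) -> bool:
    """True when the commit subject line matches type(scope)?: description."""
    return bool(CONVENTIONAL.match(subject))
```

For example, `is_conventional("feat(auth): add login form")` is true, while `is_conventional("Added stuff")` is false. The regex checks form only; whether the message explains the purpose still needs judgment.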
Re-Planning Policy
Trigger re-plan when:
- dependency changed
- repeated failures suggest wrong approach
- discovered scope mismatch
Re-plan behavior:
- Split the current task into smaller tasks.
- Mark blocked tasks explicitly with reason.
- Continue from next ready task.
Stop Conditions
Stop loop and report clearly if any condition is met:
- No ready task and unresolved blockers remain.
- The same task fails checks or review in 2 consecutive loops.
- Required tooling is unavailable (critical checks cannot run).
- User-defined risk boundary is exceeded.
When stopped, provide:
- current status
- blocker root cause
- proposed next actions
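The four stop conditions can be evaluated from a small snapshot of loop state. The sketch below is a hypothetical illustration: the `LoopState` fields and the `stop_reason` function are assumptions about how a run might track this information.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopState:
    ready_count: int            # tasks whose dependencies are all done
    blocked_count: int          # tasks explicitly marked blocked
    consecutive_failures: int   # failed loops in a row for the same task
    tooling_available: bool     # critical checks can run
    risk_exceeded: bool         # user-defined risk boundary crossed


def stop_reason(state: LoopState) -> Optional[str]:
    """Return the first matching stop condition, or None to keep looping."""
    if state.ready_count == 0 and state.blocked_count > 0:
        return "no ready task and unresolved blockers remain"
    if state.consecutive_failures >= 2:
        return "same task failed checks/review in 2 consecutive loops"
    if not state.tooling_available:
        return "required tooling is unavailable"
    if state.risk_exceeded:
        return "user-defined risk boundary is exceeded"
    return None
```

The returned string can serve as the blocker root cause in the stop report.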
Completion Conditions
Consider a large requirement complete only when:
- All backlog tasks are done.
- Requirement-level definition of done is satisfied.
- Relevant checks pass on final state.
- Required review evidence is present.
Then generate a final completion summary:
- completed task list
- key decisions
- risk notes
- follow-up suggestions
Communication Style During Hepha
- Keep user updates brief and frequent.
- Do not ask for confirmation every loop.
- Ask user only for true ambiguity, policy conflicts, or missing credentials.
Suggested Starter Prompt For Users
Use this starter format to begin a run:
- Enable hepha mode.
- Run loop: plan -> execute -> check -> review -> commit.
- Perform web/GitHub research before technical choices.
- For UI flows, perform browser-based validation.
- Continue until backlog is complete or stop condition is met.
- Requirement/backlog: <paste requirement here>.
Additional References
- Planning details: references/planning_task-decomposition.md
- Quality gates: references/validation_quality-gates.md
- Decomposition patterns: references/decomposition-patterns.md
- Progress template: references/progress-template.md
- Working file templates: templates/backlog.md, templates/progress.md, templates/decision-log.md
