Agentic Engineering

Workflow pattern for AI-assisted engineering using eval-first execution, task decomposition, and cost-aware model routing. Trigger phrases: agentic engineering, eval-first workflow, decompose tasks, model routing, cost discipline, task completion criteria.

Deonte Cooper (@djc00p)

Install

openclaw skills install @djc00p/agentic-engineering-ecc

Agentic Engineering

Operate as an agentic engineer using eval-first execution, decomposition, and cost-aware model routing. Adapted from everything-claude-code by @affaan-m (MIT).

Quick Start

1. Define completion criteria: write acceptance criteria and success metrics before execution
2. Create baseline evals: write capability and regression tests that capture current state
3. Decompose work: break into 15-minute units, each independently verifiable with a single dominant risk
4. Route models by complexity: Haiku for narrow tasks, Sonnet for implementation, Opus for architecture
5. Run post-implementation evals: measure deltas, confirm no regressions

Key Concepts

Eval-first execution: Write and run evals before coding, measure changes against a known baseline, and catch regressions early
15-minute unit rule: Each task should have one clear risk and one verifiable outcome, and be completable in about 15 minutes
Model tier matching: Complexity determines the model. Don't pay for a larger model on simple tasks, and don't send hard tasks to a model too small for them
Review focus: Prioritize invariants, error boundaries, security, coupling — not style (automation handles that)
Session strategy: Continue for coupled units; reset after major phase transitions; compact at milestones
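
The 15-minute unit rule can be checked mechanically before work starts. The following is a minimal sketch, not part of the skill itself: the `WorkUnit` structure, its field names, and the `unit_problems` helper are hypothetical, and the 15-minute limit is treated as a hard cap purely for illustration.

```python
from dataclasses import dataclass

MAX_UNIT_MINUTES = 15  # taken from the 15-minute unit rule; a hard cap is an assumption


@dataclass(frozen=True)
class WorkUnit:
    """One unit of decomposed work (hypothetical structure for illustration)."""
    name: str
    risks: tuple            # risks the unit carries; the rule asks for exactly one
    verification: str       # how the outcome is checked, e.g. a test name
    estimate_minutes: int


def unit_problems(unit: WorkUnit) -> list:
    """Return the reasons a unit violates the rule (empty list if it complies)."""
    problems = []
    if len(unit.risks) != 1:
        problems.append(f"expected one dominant risk, found {len(unit.risks)}")
    if not unit.verification.strip():
        problems.append("no verifiable outcome defined")
    if unit.estimate_minutes > MAX_UNIT_MINUTES:
        problems.append(
            f"estimate of {unit.estimate_minutes} min exceeds {MAX_UNIT_MINUTES} min"
        )
    return problems


good = WorkUnit("add retry to fetch()", ("timeout handling",), "test_fetch_retries", 12)
too_big = WorkUnit("rewrite auth", ("token expiry", "session migration"), "", 90)

print(unit_problems(good))          # []
print(len(unit_problems(too_big)))  # 3
```

A unit that returns any problems is a candidate for further splitting before it is handed to a model.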

Common Usage

Setting up eval-first for a feature:

1. Define acceptance criteria (user-facing behavior)
2. Write capability eval (can the system do the required task?)
3. Write regression eval (does existing functionality still work?)
4. Execute feature implementation with model routing
5. Re-run evals, compare deltas
6. Document any new risks discovered during review
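
Steps 2 through 5 above can be sketched as a small harness that runs the same evals before and after the change and compares the results. This is an illustrative sketch only: `run_evals`, `compare`, and the `system` dictionary standing in for the system under test are hypothetical names, and each eval is assumed to be a zero-argument callable that returns `True` on pass.

```python
from typing import Callable, Dict, List

Eval = Callable[[], bool]  # assumption: an eval returns True on pass


def run_evals(evals: Dict[str, Eval]) -> Dict[str, bool]:
    """Run each eval and record pass/fail; an exception counts as a failure."""
    results = {}
    for name, check in evals.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def compare(baseline: Dict[str, bool], after: Dict[str, bool]) -> Dict[str, List[str]]:
    """Classify evals by how their result changed between the two runs."""
    names = sorted(set(baseline) | set(after))
    return {
        "regressions": [n for n in names if baseline.get(n, False) and not after.get(n, False)],
        "gains": [n for n in names if not baseline.get(n, False) and after.get(n, False)],
    }


# Stand-in for the system under test: CSV export is the new feature.
system = {"export_csv": False, "login": True}
evals = {
    "capability: exports CSV": lambda: system["export_csv"],
    "regression: login works": lambda: system["login"],
}

baseline = run_evals(evals)      # captured before any implementation work
system["export_csv"] = True      # stand-in for implementing the feature
after = run_evals(evals)

delta = compare(baseline, after)
print(delta)  # {'regressions': [], 'gains': ['capability: exports CSV']}
```

An empty `regressions` list together with the expected capability eval in `gains` corresponds to the "compare deltas" check in step 5.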

Model routing example:

Haiku: boilerplate generation, narrow edits, classification
Sonnet: feature implementation, small refactors, test writing
Opus: multi-file changes, root-cause analysis, architecture decisions
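
The routing table above can be expressed as a plain lookup. The sketch below is an assumption-laden illustration: the task-kind strings, the `Tier` enum, and the `route` function are hypothetical names chosen here, and rejecting unknown task kinds (rather than defaulting to a tier) is a design choice the skill does not specify.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = 1
    SONNET = 2
    OPUS = 3


# Task kinds mirror the routing example; the string keys are illustrative.
ROUTES = {
    "boilerplate": Tier.HAIKU,
    "narrow_edit": Tier.HAIKU,
    "classification": Tier.HAIKU,
    "feature_implementation": Tier.SONNET,
    "small_refactor": Tier.SONNET,
    "test_writing": Tier.SONNET,
    "multi_file_change": Tier.OPUS,
    "root_cause_analysis": Tier.OPUS,
    "architecture_decision": Tier.OPUS,
}


def route(task_kind: str) -> Tier:
    """Return the tier for a task kind; unknown kinds are rejected, not guessed."""
    try:
        return ROUTES[task_kind]
    except KeyError:
        raise ValueError(f"no route defined for task kind {task_kind!r}") from None


print(route("narrow_edit").name)          # HAIKU
print(route("root_cause_analysis").name)  # OPUS
```

Failing loudly on an unclassified task forces the task to be categorized first, which keeps the routing decision explicit.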

Cost discipline: For each task, track the model tier, token estimate, retries, wall-clock time, and success or failure. Escalate to a higher model tier only when the lower tier fails with a clear reasoning gap, not merely because the outcome is uncertain.
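
One way to make this rule concrete is a per-task record plus an escalation check. This is a minimal sketch under stated assumptions: `TaskRecord`, its field names, the `reasoning_gap` note, and `next_tier` are hypothetical, and "clear reasoning gap" is modeled simply as a non-empty description written by whoever reviewed the failure.

```python
from dataclasses import dataclass
from typing import Optional

TIERS = ["haiku", "sonnet", "opus"]  # ordered from cheapest to most capable


@dataclass
class TaskRecord:
    """Per-task cost log with the fields listed above (names are illustrative)."""
    task: str
    model_tier: str
    token_estimate: int
    retries: int
    wall_clock_seconds: float
    succeeded: bool
    reasoning_gap: Optional[str] = None  # what the tier could not work out, if identified


def next_tier(record: TaskRecord) -> str:
    """Escalate one tier only on failure with an identified reasoning gap."""
    if record.succeeded or not record.reasoning_gap:
        return record.model_tier  # success, or failure without a clear gap: stay put
    index = TIERS.index(record.model_tier)
    return TIERS[min(index + 1, len(TIERS) - 1)]  # the top tier has nowhere to go


flaky = TaskRecord("fix import order", "haiku", 1200, 2, 40.0, succeeded=False)
stuck = TaskRecord(
    "trace cache bug", "sonnet", 9000, 1, 310.0, succeeded=False,
    reasoning_gap="could not relate eviction order to the stale read",
)

print(next_tier(flaky))  # haiku
print(next_tier(stuck))  # opus
```

The `flaky` record fails without an identified gap, so it is retried at the same tier; only `stuck`, which names what the model could not reason through, moves up.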

References

references/eval-patterns.md — detailed eval-first loop patterns
references/decomposition-rules.md — 15-minute unit principle and task breakdown examples
references/review-checklist.md — what to focus on in code review (invariants, boundaries, security, coupling)
