Install
openclaw skills install afrexai-vibe-coding

The complete operating system for building software with AI — from first prompt to production deployment. Not tips. Not theory. A full operating methodology: prompting frameworks, architecture patterns, testing strategies, debugging playbooks, and production graduation checklists. Works with Claude Code, Cursor, Windsurf, Copilot, and any AI coding tool.
What is vibe coding? Programming where you describe what you want and let AI generate code. You evaluate by results, not by reading every line. Coined by Andrej Karpathy (Feb 2025).
Key distinction (Simon Willison): If you review, test, and explain the code — that's AI-assisted software development. Vibe coding means accepting AI output without fully understanding every function. This skill covers both modes and the spectrum between them.
Before starting, classify your project:
| Factor | Vibe ✅ | Don't Vibe ❌ |
|---|---|---|
| Stakes | Low (prototype, internal, learning) | High (payments, auth, compliance) |
| Timeline | Hours to days | Months+ |
| Team size | Solo or pair | Large team with standards |
| Domain knowledge | You understand the domain | Unfamiliar territory |
| Reversibility | Easy to rewrite | Hard to change later |
| Data sensitivity | Public/test data | PII, financial, health |
Scoring: count the ✅ checks. Mostly ✅ → vibe it. Mostly ❌ → stay in AI-assisted mode with full review.
| Level | Description | Who |
|---|---|---|
| L1 — Passenger | Copy-paste AI output, hope it works | Beginners |
| L2 — Navigator | Guide AI with context, catch obvious errors | Intermediate |
| L3 — Pilot | Architecture decisions, AI implements, you review | Experienced devs |
| L4 — Conductor | Orchestrate multiple AI sessions, parallel streams | Power users |
Target: L3 minimum for anything going to production.
| Tool | Best For | Context Window | Multi-file | Terminal | Cost |
|---|---|---|---|---|---|
| Claude Code | Full-stack, complex refactors, CLI | 200K | Excellent | Native | API usage |
| Cursor | Editor-integrated, rapid iteration | 128K | Good | Via terminal | $20/mo + API |
| Windsurf | Beginner-friendly, guided flows | 128K | Good | Limited | $10/mo + API |
| GitHub Copilot | Inline completions, small edits | 8-32K | Limited | No | $10-19/mo |
| Aider | Git-aware, open source, CLI | Varies | Good | Native | API only |
| Cline (VS Code) | VS Code native, plan mode | Varies | Good | Via terminal | API only |
Use tools in combination — for example, Claude Code for complex multi-file work plus GitHub Copilot for inline completions.
Rules files teach AI your conventions once. Without them, every session starts from zero.
# Project Rules
## Stack
- Language: [TypeScript/Python/Go/etc.]
- Framework: [Next.js/FastAPI/etc.]
- Database: [PostgreSQL/SQLite/etc.]
- Styling: [Tailwind/CSS Modules/etc.]
- Package manager: [pnpm/npm/poetry/etc.]
## Code Style
- Max function length: 50 lines
- Max file length: 300 lines
- One export per file (prefer)
- Use [const/let, never var] / [type hints always]
- Error handling: [explicit try/catch, never swallow errors]
- Naming: [camelCase functions, PascalCase components, UPPER_SNAKE constants]
## Architecture
- File structure: [describe or reference]
- API pattern: [REST/tRPC/GraphQL]
- State management: [Zustand/Redux/signals/etc.]
- Auth pattern: [JWT/session/OAuth provider]
## Testing
- Framework: [Vitest/Jest/Pytest/etc.]
- Minimum coverage: [80%/90%/etc.]
- Test file location: [co-located/__tests__/tests/]
- Run before committing: [command]
## Do NOT
- Do not use `any` type in TypeScript
- Do not install new dependencies without asking
- Do not modify database schema without migration
- Do not hardcode secrets, URLs, or config values
- Do not remove existing tests
## When Unsure
- Ask before making architectural decisions
- Show the plan before implementing changes >100 lines
- Flag security-adjacent code for manual review
| Tool | File | Notes |
|---|---|---|
| Claude Code | CLAUDE.md in repo root | Also reads .claude/ directory |
| Cursor | .cursor/rules/*.mdc | Supports conditional rules with globs |
| Windsurf | .windsurfrules in repo root | Single file |
| Aider | .aider.conf.yml + conventions in chat | YAML config + initial prompt |
| Generic | AGENTS.md or CONVENTIONS.md | Any tool can be told to read it |
Example Cursor rule file (.cursor/rules/*.mdc):

---
description: React component standards
globs: src/components/**/*.tsx
alwaysApply: false
---
# Component Rules
- Functional components only (no class components)
- Props interface above component, named [Component]Props
- Use forwardRef for components that accept ref
- Co-locate styles in [component].module.css
- Co-locate tests in [component].test.tsx
- Export component as named export, not default
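Followed, those rules produce components shaped like this — a sketch (TodoItem and its props are illustrative names, not part of the rules file):

```tsx
// src/components/TodoItem/index.tsx — hypothetical example following the rules above
import { forwardRef } from "react";
import styles from "./TodoItem.module.css"; // co-located styles

// Props interface above the component, named [Component]Props
interface TodoItemProps {
  title: string;
  completed: boolean;
  onToggle: () => void;
}

// Functional component; forwardRef because it accepts a ref; named export, not default
export const TodoItem = forwardRef<HTMLLIElement, TodoItemProps>(
  function TodoItem({ title, completed, onToggle }, ref) {
    return (
      <li ref={ref} className={completed ? styles.done : styles.item}>
        <input type="checkbox" checked={completed} onChange={onToggle} />
        <span>{title}</span>
      </li>
    );
  }
);
```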
Level 1 — Wish (bad)
"Build a todo app"
Level 2 — Request (okay)
"Build a todo app with React and Tailwind"
Level 3 — Specification (good)
"Build a todo app: React 18, TypeScript, Tailwind. Features: add/edit/delete/toggle todos. Store in localStorage. Responsive. Under 200 lines total."
Level 4 — Brief (great)
"Build a todo app. Here's the spec:
- Stack: React 18 + TS + Tailwind + Vite
- Features: CRUD todos, toggle complete, filter (all/active/done), persist to localStorage
- Constraints: Single component file, under 200 lines, no external deps beyond stack
- Done when: All features work, page refresh preserves state, mobile responsive
- Start with the data types, then build up."
Level 5 — Contract (production-grade)
task: Todo application
stack:
  runtime: React 18 + TypeScript strict
  styling: Tailwind CSS 3.x
  build: Vite 5
  test: Vitest + Testing Library
features:
  - CRUD operations on todos
  - Toggle completion status
  - Filter: all | active | completed
  - Bulk actions: complete all, clear completed
  - Persist to localStorage with versioned schema
constraints:
  - Max 3 component files
  - Max 200 lines per file
  - No external state management library
  - Keyboard accessible (tab, enter, escape)
  - Mobile responsive (min 320px)
acceptance:
  - All features functional
  - Page refresh preserves state
  - 90%+ test coverage
  - No TypeScript errors (strict mode)
  - Lighthouse accessibility score > 90
approach: Start with types/interfaces, then hooks, then components, then tests.
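For the "persist to localStorage with versioned schema" feature in this contract, the code the AI should produce looks roughly like this sketch (the storage key, version constant, and Todo shape are illustrative):

```ts
// Versioned localStorage persistence — a sketch, not the only valid shape.
const STORAGE_KEY = "todos";
const SCHEMA_VERSION = 1;

interface Todo {
  id: string;
  title: string;
  completed: boolean;
}

interface PersistedState {
  version: number;
  todos: Todo[];
}

export function saveTodos(todos: Todo[]): void {
  const state: PersistedState = { version: SCHEMA_VERSION, todos };
  localStorage.setItem(STORAGE_KEY, JSON.stringify(state));
}

export function loadTodos(): Todo[] {
  const raw = localStorage.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const state = JSON.parse(raw) as PersistedState;
    // A version bump is where a migration would run; this sketch discards old data.
    return state.version === SCHEMA_VERSION ? state.todos : [];
  } catch {
    return []; // Corrupt JSON: fall back to an empty list.
  }
}
```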
| Anti-Pattern | Why It Fails | Fix |
|---|---|---|
| "Build me an app" | Too vague, AI guesses everything | Use Level 4+ prompts |
| "Fix it" (no context) | AI doesn't know what "it" is | Paste error + expected behavior |
| "Rewrite everything" | Nukes working code, introduces regressions | Incremental refactors |
| "Make it better" | Subjective, AI changes random things | Specify what "better" means |
| "Use best practices" | AI's "best practices" may not match your stack | Specify the practices you want |
| Multiple unrelated asks | Context bleed, partial implementations | One task per prompt |
| Long conversation chains | Context degrades after 10+ turns | Start fresh sessions |
Research → Plan → Implement → Validate
"Read [files/docs/codebase]. Explain how [feature/module] works. Don't modify anything."
Purpose: Load context. Catch misunderstandings before they cascade. AI explains back to you — if the explanation is wrong, the implementation will be wrong too.
"Based on your understanding, write a plan:
- Which files you'll create/modify
- What changes in each file
- What order you'll implement
- What could go wrong"
Purpose: Review the approach before committing to it. 10x cheaper to fix a plan than debug cascading implementation errors.
Plan Review Checklist:
"Proceed with the plan. Implement step by step. Stop after each file for me to verify."
The 200-Line Rule: If any single implementation step is >200 lines of changes, break it down further. Large changes = large bugs.
Checkpoint System:
"Run the tests. Show me the output. If anything fails, explain why and fix it."
Then manually verify:
AI generates better code when your architecture is clear and consistent.
project/
├── CLAUDE.md (or .cursorrules) # AI rules
├── README.md # What this is
├── src/
│ ├── types/ # Shared types (AI reads these first)
│ │ ├── index.ts
│ │ └── [domain].ts
│ ├── lib/ # Pure utilities (no side effects)
│ │ ├── [utility].ts
│ │ └── [utility].test.ts
│ ├── services/ # External integrations (DB, API, etc.)
│ │ ├── [service].ts
│ │ └── [service].test.ts
│ ├── components/ (or routes/) # UI or route handlers
│ │ ├── [Component]/
│ │ │ ├── index.tsx
│ │ │ ├── [Component].test.tsx
│ │ │ └── [Component].module.css
│ └── app/ # App entry, layout, config
├── tests/ # Integration/E2E tests
├── scripts/ # Build/deploy/utility scripts
└── docs/ # Architecture decisions, API docs
Co-locate tests: thing.ts + thing.test.ts side by side. AI writes tests when they're right there.

// src/types/todo.ts — AI reads this and understands your domain
export interface Todo {
id: string; // UUID v4
title: string; // 1-200 chars, trimmed
completed: boolean; // default false
createdAt: Date;
updatedAt: Date;
}
export interface CreateTodoInput {
title: string; // Required, 1-200 chars
}
export interface UpdateTodoInput {
title?: string;
completed?: boolean;
}
// This is ALL AI needs to implement CRUD operations correctly.
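For example, from those type comments alone AI can derive a correct create operation — a sketch (function names are illustrative):

```ts
import type { Todo, CreateTodoInput, UpdateTodoInput } from "./types/todo";

export function createTodo(input: CreateTodoInput): Todo {
  const title = input.title.trim(); // "1-200 chars, trimmed"
  if (title.length < 1 || title.length > 200) {
    throw new Error("title must be 1-200 characters");
  }
  const now = new Date();
  return {
    id: crypto.randomUUID(), // "UUID v4"
    title,
    completed: false, // "default false"
    createdAt: now,
    updatedAt: now,
  };
}

export function updateTodo(todo: Todo, input: UpdateTodoInput): Todo {
  // Callers omit fields they don't change; updatedAt is always refreshed.
  return { ...todo, ...input, updatedAt: new Date() };
}
```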
      /   E2E    \      ← 10% (critical user flows only)
    / Integration  \    ← 30% (API endpoints, DB queries)
  /   Unit Tests     \  ← 60% (pure functions, utils, logic)
Prompt: "Write tests for a function that validates email addresses.
Requirements:
- Returns true for valid emails
- Returns false for empty string, missing @, missing domain
- Handles edge cases: plus addressing, subdomains, international domains
Write ONLY the tests. I'll implement after."
Then: "Now implement the function to make all tests pass."
This pattern produces better code because AI has clear acceptance criteria.
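Concretely, the two prompts above yield something like this sketch (assuming Vitest; validateEmail and the file names are illustrative):

```ts
// validateEmail.test.ts — step 1: AI writes ONLY the tests
import { describe, it, expect } from "vitest";
import { validateEmail } from "./validateEmail";

describe("validateEmail", () => {
  it("accepts valid emails", () => {
    expect(validateEmail("user@example.com")).toBe(true);
  });
  it("accepts plus addressing and subdomains", () => {
    expect(validateEmail("user+tag@mail.example.co.uk")).toBe(true);
  });
  it("rejects empty string, missing @, missing domain", () => {
    expect(validateEmail("")).toBe(false);
    expect(validateEmail("user.example.com")).toBe(false);
    expect(validateEmail("user@")).toBe(false);
  });
});

// validateEmail.ts — step 2: implemented to make all tests pass
export function validateEmail(email: string): boolean {
  // Non-whitespace local part, exactly one @, domain with at least one dot.
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}
```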
| Category | Test? | Why |
|---|---|---|
| Pure functions | Always | Easy, high value, catches logic bugs |
| Data transformations | Always | Wrong transforms corrupt data silently |
| API endpoints | Always | Contract verification |
| UI components | Sometimes | Test behavior, not implementation |
| Database queries | Sometimes | Test complex queries, skip simple CRUD |
| Config/env loading | Rarely | Test once, trust after |
| Third-party wrappers | Rarely | Test integration, not their code |
Signs of bad AI tests:
Fix: "These tests mock too much. Write tests that exercise real behavior. Only mock external services (DB, API calls). Use in-memory alternatives where possible."
What Karpathy does: copy the error, paste it with no comment — AI usually fixes it.
When it works: Clear error messages, stack traces, type errors, syntax errors.
When it doesn't (and what to do instead):
| Situation | Better Prompt |
|---|---|
| Vague runtime error | "When I [action], [behavior] happens. Expected [expected]. Here's the relevant code: [paste]" |
| Silent failure | "This function returns [wrong result] for input [input]. Expected [expected]. Walk me through the logic step by step." |
| Intermittent bug | "This works sometimes but fails with [condition]. I think it's a [race condition/state issue/timing problem]. Here's the code:" |
| Build/config error | Paste full error + your config files. "Don't guess — check the config values against the docs." |
| AI broke something while fixing | "Stop. Let's go back. The original issue was [X]. You introduced a new bug: [Y]. Let's fix the original issue without changing [Z]." |
If AI can't fix something in 3 attempts:
Spaghetti Code (AI made a mess)
1. git stash (save current mess)
2. git checkout [last good commit]
3. Start a NEW AI session
4. Paste only the requirements, not the broken code
5. "Implement this from scratch following these patterns: [your conventions]"
Recurring Bug (Fix breaks something else)
1. Write a failing test for the bug
2. Write regression tests for the things that keep breaking (see the sketch after this list)
3. "Make ALL these tests pass. Don't modify the tests."
Dependency Hell
1. Check `package.json` / `requirements.txt` — AI sometimes adds conflicting deps
2. "List all dependencies you added and why each is needed"
3. Remove anything that duplicates existing functionality
4. Lock versions: "Pin all dependencies to exact versions"
Context Exhaustion (AI forgot earlier instructions)
1. Start a new session
2. Load rules file + key files
3. Summarize what's done and what remains
4. Continue with fresh context
Before ANY vibe-coded project goes to production:
- Security scan passes (`npm audit` / `pip audit` — zero critical/high)

AI-Assisted Hardening Prompt:
"Review this codebase for production readiness. Check against this list: [paste checklist]. For each item, tell me: pass/fail/not applicable, and what to fix if fail. Be specific — file names and line numbers."
Run multiple AI sessions simultaneously:
Rules for parallel sessions:
Navigator-Driver (you navigate, AI drives)
You: "We need to add pagination. The API should accept page and limit query params. Return items, total count, and hasNextPage." AI: [implements] You: "Good. Now add cursor-based pagination as an alternative. The cursor should be the last item's ID." AI: [implements]
Ping-Pong (alternate implementing)
You: Write the test
AI: Write the implementation
You: Write the next test
AI: Write the next implementation
(TDD style — extremely effective)
Rubber Duck (AI explains, you catch issues)
"Walk me through this code line by line. Explain what each function does, what could go wrong, and what assumptions you're making." (AI explains → you catch bad assumptions before they become bugs)
| Strategy | When | How |
|---|---|---|
| Fresh start | Every 15-20 turns | New session, reload rules + key files |
| Summarize | Before complex task | "Summarize what we've done. Then let's tackle [next thing]." |
| File focus | Large codebase | "Only look at src/services/auth.ts. Ignore everything else." |
| Memory file | Multi-session project | Keep PROGRESS.md with what's done/remaining |
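A PROGRESS.md needs no special format — a sketch (all content illustrative):

```markdown
# PROGRESS.md
## Done
- [x] Data types + localStorage persistence
- [x] CRUD operations with unit tests
## In Progress
- [ ] Filter UI (all / active / done)
## Remaining
- [ ] Bulk actions, keyboard accessibility
## Decisions
- No external state library — useState + one custom hook
```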
# Before starting
git checkout -b feature/[name]
git status # clean working tree
# During (commit often!)
git add -A && git commit -m "feat: [what AI just implemented]"
# Every 2-3 AI turns, commit. Your safety net.
# If things go wrong
git diff # see what AI changed
git stash # save mess
git checkout . # nuclear option: discard all changes
# When done
git diff main..HEAD # review ALL changes before merging
| # | Mistake | Consequence | Prevention |
|---|---|---|---|
| 1 | No rules file | AI reinvents conventions each session | Write rules file before first prompt |
| 2 | Prompting implementation before plan | Cascading wrong assumptions | Always: Research → Plan → Implement |
| 3 | Never reading AI's code | Hidden bugs, security holes, debt | Review at least critical paths |
| 4 | One giant prompt | AI loses focus, partial implementation | One task per prompt, sequential |
| 5 | Not committing frequently | Can't rollback when AI breaks things | Commit every 2-3 turns |
| 6 | Ignoring test failures | "It works on my machine" | Tests pass = done. Not before. |
| 7 | Letting AI add dependencies freely | Bloated bundle, version conflicts | "Don't add deps without asking" in rules |
| 8 | No production checklist | Ship security holes | Phase 9 checklist before deploy |
| 9 | Marathon AI sessions | Context degrades, AI "forgets" | Fresh session every 15-20 turns |
| 10 | Vibe coding auth/payments | Critical bugs in critical paths | Manual review for all security code |
| 11 | No types/schema | AI guesses data shapes differently each time | Define types FIRST, always |
| 12 | Trusting AI's "it works" | AI confidently ships broken code | Verify yourself. Run it. Test it. |
| 13 | Same prompt after 3 failures | AI stuck in a loop | Reframe, simplify, or do it manually |
| 14 | Mixing concerns in one session | Context pollution | One feature per session |
| 15 | No architecture guidance | AI creates inconsistent patterns | Document patterns in rules file |
Track your vibe coding quality over time:
week_of: "YYYY-MM-DD"
sessions: [count]
features_shipped: [count]
bugs_introduced: [count] # found post-ship
bugs_caught_in_review: [count] # caught before ship
avg_prompts_per_feature: [count]
time_saved_estimate_hours: [number]
fresh_session_restarts: [count]
# Score yourself (1-5):
prompt_quality: [1-5] # Are you using Level 4+ prompts?
review_discipline: [1-5] # Are you reviewing critical code?
testing_rigor: [1-5] # Are you testing before shipping?
architecture: [1-5] # Is the codebase staying clean?
commit_frequency: [1-5] # Are you committing every 2-3 turns?
total_score: [5-25]
| Score | Rating | Action |
|---|---|---|
| 20-25 | Elite | You're a vibe coding conductor. Teach others. |
| 15-19 | Solid | Good habits. Focus on weakest dimension. |
| 10-14 | Learning | Review this guide weekly. Build the habits. |
| 5-9 | Risky | Slow down. More planning, more testing, more review. |
"Read [files] and explain the architecture. Don't change anything."
"Write a plan for [feature]. List files to create/modify and changes in each."
"Implement only [specific thing]. Don't touch other files."
"Write tests first for [requirements]. Then implement to pass them."
"Review this for [security/performance/readability]. Be specific."
"This error occurs when [action]. Expected [behavior]. Here's the code: [paste]"
"Refactor [file] to [goal]. Same behavior. Don't add features."
"What dependencies did you add and why? Remove anything unnecessary."
"Walk me through this code. Explain assumptions and potential issues."
"Stop. The original issue was [X]. Let's start fresh with a minimal approach."
"Run all tests. If any fail, fix them without breaking other tests."
"Check this against the production checklist: [paste P0-P3 items]."
Built by AfrexAI — the team that ships AI agents, not just AI prompts.