Install
openclaw skills install afrexai-vibe-coding

The complete operating system for building software with AI — from first prompt to production deployment. Not tips. Not theory. A full operating methodology: prompting frameworks, architecture patterns, testing strategies, debugging playbooks, and production graduation checklists. Works with Claude Code, Cursor, Windsurf, Copilot, and any AI coding tool.
What is vibe coding? Programming where you describe what you want and let AI generate code. You evaluate by results, not by reading every line. Coined by Andrej Karpathy (Feb 2025).
Key distinction (Simon Willison): If you review, test, and explain the code — that's AI-assisted software development. Vibe coding means accepting AI output without fully understanding every function. This skill covers both modes and the spectrum between them.
Before starting, classify your project:
| Factor | Vibe ✅ | Don't Vibe ❌ |
|---|---|---|
| Stakes | Low (prototype, internal, learning) | High (payments, auth, compliance) |
| Timeline | Hours to days | Months+ |
| Team size | Solo or pair | Large team with standards |
| Domain knowledge | You understand the domain | Unfamiliar territory |
| Reversibility | Easy to rewrite | Hard to change later |
| Data sensitivity | Public/test data | PII, financial, health |
Scoring: count the ✅ checks. Mostly ✅ → vibe it. Mostly ❌ → stay in AI-assisted mode with full review.
| Level | Description | Who |
|---|---|---|
| L1 — Passenger | Copy-paste AI output, hope it works | Beginners |
| L2 — Navigator | Guide AI with context, catch obvious errors | Intermediate |
| L3 — Pilot | Architecture decisions, AI implements, you review | Experienced devs |
| L4 — Conductor | Orchestrate multiple AI sessions, parallel streams | Power users |
Target: L3 minimum for anything going to production.
| Tool | Best For | Context Window | Multi-file | Terminal | Cost |
|---|---|---|---|---|---|
| Claude Code | Full-stack, complex refactors, CLI | 200K | Excellent | Native | API usage |
| Cursor | Editor-integrated, rapid iteration | 128K | Good | Via terminal | $20/mo + API |
| Windsurf | Beginner-friendly, guided flows | 128K | Good | Limited | $10/mo + API |
| GitHub Copilot | Inline completions, small edits | 8-32K | Limited | No | $10-19/mo |
| Aider | Git-aware, open source, CLI | Varies | Good | Native | API only |
| Cline (VS Code) | VS Code native, plan mode | Varies | Good | Via terminal | API only |
Use tools in combination — for example, Claude Code for complex multi-file work plus GitHub Copilot for inline completions.
Rules files teach AI your conventions once. Without them, every session starts from zero.
# Project Rules
## Stack
- Language: [TypeScript/Python/Go/etc.]
- Framework: [Next.js/FastAPI/etc.]
- Database: [PostgreSQL/SQLite/etc.]
- Styling: [Tailwind/CSS Modules/etc.]
- Package manager: [pnpm/npm/poetry/etc.]
## Code Style
- Max function length: 50 lines
- Max file length: 300 lines
- One export per file (prefer)
- Use [const/let, never var] / [type hints always]
- Error handling: [explicit try/catch, never swallow errors]
- Naming: [camelCase functions, PascalCase components, UPPER_SNAKE constants]
## Architecture
- File structure: [describe or reference]
- API pattern: [REST/tRPC/GraphQL]
- State management: [Zustand/Redux/signals/etc.]
- Auth pattern: [JWT/session/OAuth provider]
## Testing
- Framework: [Vitest/Jest/Pytest/etc.]
- Minimum coverage: [80%/90%/etc.]
- Test file location: [co-located/__tests__/tests/]
- Run before committing: [command]
## Do NOT
- Do not use `any` type in TypeScript
- Do not install new dependencies without asking
- Do not modify database schema without migration
- Do not hardcode secrets, URLs, or config values
- Do not remove existing tests
## When Unsure
- Ask before making architectural decisions
- Show the plan before implementing changes >100 lines
- Flag security-adjacent code for manual review
| Tool | File | Notes |
|---|---|---|
| Claude Code | CLAUDE.md in repo root | Also reads .claude/ directory |
| Cursor | .cursor/rules/*.mdc | Supports conditional rules with globs |
| Windsurf | .windsurfrules in repo root | Single file |
| Aider | .aider.conf.yml + conventions in chat | YAML config + initial prompt |
| Generic | AGENTS.md or CONVENTIONS.md | Any tool can be told to read it |
Example Cursor rule file (.cursor/rules/*.mdc):

---
description: React component standards
globs: src/components/**/*.tsx
alwaysApply: false
---
# Component Rules
- Functional components only (no class components)
- Props interface above component, named [Component]Props
- Use forwardRef for components that accept ref
- Co-locate styles in [component].module.css
- Co-locate tests in [component].test.tsx
- Export component as named export, not default
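Followed, those rules produce components shaped like this — a sketch (TodoItem and its props are illustrative names, not part of the rules file):

```tsx
// src/components/TodoItem/index.tsx — hypothetical example following the rules above
import { forwardRef } from "react";
import styles from "./TodoItem.module.css"; // co-located styles

// Props interface above the component, named [Component]Props
interface TodoItemProps {
  title: string;
  completed: boolean;
  onToggle: () => void;
}

// Functional component; forwardRef because it accepts a ref; named export, not default
export const TodoItem = forwardRef<HTMLLIElement, TodoItemProps>(
  function TodoItem({ title, completed, onToggle }, ref) {
    return (
      <li ref={ref} className={completed ? styles.done : styles.item}>
        <input type="checkbox" checked={completed} onChange={onToggle} />
        <span>{title}</span>
      </li>
    );
  }
);
```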
Level 1 — Wish (bad)
"Build a todo app"
Level 2 — Request (okay)
"Build a todo app with React and Tailwind"
Level 3 — Specification (good)
"Build a todo app: React 18, TypeScript, Tailwind. Features: add/edit/delete/toggle todos. Store in localStorage. Responsive. Under 200 lines total."
Level 4 — Brief (great)
"Build a todo app. Here's the spec:
- Stack: React 18 + TS + Tailwind + Vite
- Features: CRUD todos, toggle complete, filter (all/active/done), persist to localStorage
- Constraints: Single component file, under 200 lines, no external deps beyond stack
- Done when: All features work, page refresh preserves state, mobile responsive
- Start with the data types, then build up."
Level 5 — Contract (production-grade)
task: Todo application
stack:
  runtime: React 18 + TypeScript strict
  styling: Tailwind CSS 3.x
  build: Vite 5
  test: Vitest + Testing Library
features:
  - CRUD operations on todos
  - Toggle completion status
  - Filter: all | active | completed
  - Bulk actions: complete all, clear completed
  - Persist to localStorage with versioned schema
constraints:
  - Max 3 component files
  - Max 200 lines per file
  - No external state management library
  - Keyboard accessible (tab, enter, escape)
  - Mobile responsive (min 320px)
acceptance:
  - All features functional
  - Page refresh preserves state
  - 90%+ test coverage
  - No TypeScript errors (strict mode)
  - Lighthouse accessibility score > 90
approach: Start with types/interfaces, then hooks, then components, then tests.
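For the "persist to localStorage with versioned schema" feature in this contract, the code the AI should produce looks roughly like this sketch (the storage key, version constant, and Todo shape are illustrative):

```ts
// Versioned localStorage persistence — a sketch, not the only valid shape.
const STORAGE_KEY = "todos";
const SCHEMA_VERSION = 1;

interface Todo {
  id: string;
  title: string;
  completed: boolean;
}

interface PersistedState {
  version: number;
  todos: Todo[];
}

export function saveTodos(todos: Todo[]): void {
  const state: PersistedState = { version: SCHEMA_VERSION, todos };
  localStorage.setItem(STORAGE_KEY, JSON.stringify(state));
}

export function loadTodos(): Todo[] {
  const raw = localStorage.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const state = JSON.parse(raw) as PersistedState;
    // A version bump is where a migration would run; this sketch discards old data.
    return state.version === SCHEMA_VERSION ? state.todos : [];
  } catch {
    return []; // Corrupt JSON: fall back to an empty list.
  }
}
```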
| Anti-Pattern | Why It Fails | Fix |
|---|---|---|
| "Build me an app" | Too vague, AI guesses everything | Use Level 4+ prompts |
| "Fix it" (no context) | AI doesn't know what "it" is | Paste error + expected behavior |
| "Rewrite everything" | Nukes working code, introduces regressions | Incremental refactors |
| "Make it better" | Subjective, AI changes random things | Specify what "better" means |
| "Use best practices" | AI's "best practices" may not match your stack | Specify the practices you want |
| Multiple unrelated asks | Context bleed, partial implementations | One task per prompt |
| Long conversation chains | Context degrades after 10+ turns | Start fresh sessions |
Research → Plan → Implement → Validate
"Read [files/docs/codebase]. Explain how [feature/module] works. Don't modify anything."
Purpose: Load context. Catch misunderstandings before they cascade. AI explains back to you — if the explanation is wrong, the implementation will be wrong too.
"Based on your understanding, write a plan:
- Which files you'll create/modify
- What changes in each file
- What order you'll implement
- What could go wrong"
Purpose: Review the approach before committing to it. 10x cheaper to fix a plan than debug cascading implementation errors.
Plan Review Checklist:
"Proceed with the plan. Implement step by step. Stop after each file for me to verify."
The 200-Line Rule: If any single implementation step is >200 lines of changes, break it down further. Large changes = large bugs.
Checkpoint System:
"Run the tests. Show me the output. If anything fails, explain why and fix it."
Then manually verify:
AI generates better code when your architecture is clear and consistent.
project/
├── CLAUDE.md (or .cursorrules) # AI rules
├── README.md # What this is
├── src/
│ ├── types/ # Shared types (AI reads these first)
│ │ ├── index.ts
│ │ └── [domain].ts
│ ├── lib/ # Pure utilities (no side effects)
│ │ ├── [utility].ts
│ │ └── [utility].test.ts
│ ├── services/ # External integrations (DB, API, etc.)
│ │ ├── [service].ts
│ │ └── [service].test.ts
│ ├── components/ (or routes/) # UI or route handlers
│ │ ├── [Component]/
│ │ │ ├── index.tsx
│ │ │ ├── [Component].test.tsx
│ │ │ └── [Component].module.css
│ └── app/ # App entry, layout, config
├── tests/ # Integration/E2E tests
├── scripts/ # Build/deploy/utility scripts
└── docs/ # Architecture decisions, API docs
Co-locate tests: thing.ts + thing.test.ts side by side. AI writes tests when they're right there.

// src/types/todo.ts — AI reads this and understands your domain
export interface Todo {
id: string; // UUID v4
title: string; // 1-200 chars, trimmed
completed: boolean; // default false
createdAt: Date;
updatedAt: Date;
}
export interface CreateTodoInput {
title: string; // Required, 1-200 chars
}
export interface UpdateTodoInput {
title?: string;
completed?: boolean;
}
// This is ALL AI needs to implement CRUD operations correctly.
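For example, from those type comments alone AI can derive a correct create operation — a sketch (function names are illustrative):

```ts
import type { Todo, CreateTodoInput, UpdateTodoInput } from "./types/todo";

export function createTodo(input: CreateTodoInput): Todo {
  const title = input.title.trim(); // "1-200 chars, trimmed"
  if (title.length < 1 || title.length > 200) {
    throw new Error("title must be 1-200 characters");
  }
  const now = new Date();
  return {
    id: crypto.randomUUID(), // "UUID v4"
    title,
    completed: false, // "default false"
    createdAt: now,
    updatedAt: now,
  };
}

export function updateTodo(todo: Todo, input: UpdateTodoInput): Todo {
  // Callers omit fields they don't change; updatedAt is always refreshed.
  return { ...todo, ...input, updatedAt: new Date() };
}
```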
      /   E2E    \      ← 10% (critical user flows only)
    / Integration  \    ← 30% (API endpoints, DB queries)
  /   Unit Tests     \  ← 60% (pure functions, utils, logic)
Prompt: "Write tests for a function that validates email addresses.
Requirements:
- Returns true for valid emails
- Returns false for empty string, missing @, missing domain
- Handles edge cases: plus addressing, subdomains, international domains
Write ONLY the tests. I'll implement after."
Then: "Now implement the function to make all tests pass."
This pattern produces better code because AI has clear acceptance criteria.
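Concretely, the two prompts above yield something like this sketch (assuming Vitest; validateEmail and the file names are illustrative):

```ts
// validateEmail.test.ts — step 1: AI writes ONLY the tests
import { describe, it, expect } from "vitest";
import { validateEmail } from "./validateEmail";

describe("validateEmail", () => {
  it("accepts valid emails", () => {
    expect(validateEmail("user@example.com")).toBe(true);
  });
  it("accepts plus addressing and subdomains", () => {
    expect(validateEmail("user+tag@mail.example.co.uk")).toBe(true);
  });
  it("rejects empty string, missing @, missing domain", () => {
    expect(validateEmail("")).toBe(false);
    expect(validateEmail("user.example.com")).toBe(false);
    expect(validateEmail("user@")).toBe(false);
  });
});

// validateEmail.ts — step 2: implemented to make all tests pass
export function validateEmail(email: string): boolean {
  // Non-whitespace local part, exactly one @, domain with at least one dot.
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}
```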
| Category | Test? | Why |
|---|---|---|
| Pure functions | Always | Easy, high value, catches logic bugs |
| Data transformations | Always | Wrong transforms corrupt data silently |
| API endpoints | Always | Contract verification |
| UI components | Sometimes | Test behavior, not implementation |
| Database queries | Sometimes | Test complex queries, skip simple CRUD |
| Config/env loading | Rarely | Test once, trust after |
| Third-party wrappers | Rarely | Test integration, not their code |
Signs of bad AI tests:
Fix: "These tests mock too much. Write tests that exercise real behavior. Only mock external services (DB, API calls). Use in-memory alternatives where possible."
What Karpathy does: copy the error, paste it with no comment — AI usually fixes it.
When it works: Clear error messages, stack traces, type errors, syntax errors.
When it doesn't (and what to do instead):
| Situation | Better Prompt |
|---|---|
| Vague runtime error | "When I [action], [behavior] happens. Expected [expected]. Here's the relevant code: [paste]" |
| Silent failure | "This function returns [wrong result] for input [input]. Expected [expected]. Walk me through the logic step by step." |
| Intermittent bug | "This works sometimes but fails with [condition]. I think it's a [race condition/state issue/timing problem]. Here's the code:" |
| Build/config error | Paste full error + your config files. "Don't guess — check the config values against the docs." |
| AI broke something while fixing | "Stop. Let's go back. The original issue was [X]. You introduced a new bug: [Y]. Let's fix the original issue without changing [Z]." |
If AI can't fix something in 3 attempts:
Spaghetti Code (AI made a mess)
1. git stash (save current mess)
2. git checkout [last good commit]
3. Start a NEW AI session
4. Paste only the requirements, not the broken code
5. "Implement this from scratch following these patterns: [your conventions]"
Recurring Bug (Fix breaks something else)
1. Write a failing test for the bug
2. Write regression tests for the things that keep breaking (see the sketch after this list)
3. "Make ALL these tests pass. Don't modify the tests."
Dependency Hell
1. Check `package.json` / `requirements.txt` — AI sometimes adds conflicting deps
2. "List all dependencies you added and why each is needed"
3. Remove anything that duplicates existing functionality
4. Lock versions: "Pin all dependencies to exact versions"
Context Exhaustion (AI forgot earlier instructions)
1. Start a new session
2. Load rules file + key files
3. Summarize what's done and what remains
4. Continue with fresh context
Before ANY vibe-coded project goes to production:
- Security scan passes (`npm audit` / `pip audit` — zero critical/high)

AI-Assisted Hardening Prompt:
"Review this codebase for production readiness. Check against this list: [paste checklist]. For each item, tell me: pass/fail/not applicable, and what to fix if fail. Be specific — file names and line numbers."
Run multiple AI sessions simultaneously:
Rules for parallel sessions:
Navigator-Driver (you navigate, AI drives)
You: "We need to add pagination. The API should accept page and limit query params. Return items, total count, and hasNextPage." AI: [implements] You: "Good. Now add cursor-based pagination as an alternative. The cursor should be the last item's ID." AI: [implements]
Ping-Pong (alternate implementing)
You: Write the test
AI: Write the implementation
You: Write the next test
AI: Write the next implementation
(TDD style — extremely effective)
Rubber Duck (AI explains, you catch issues)
"Walk me through this code line by line. Explain what each function does, what could go wrong, and what assumptions you're making." (AI explains → you catch bad assumptions before they become bugs)
| Strategy | When | How |
|---|---|---|
| Fresh start | Every 15-20 turns | New session, reload rules + key files |
| Summarize | Before complex task | "Summarize what we've done. Then let's tackle [next thing]." |
| File focus | Large codebase | "Only look at src/services/auth.ts. Ignore everything else." |
| Memory file | Multi-session project | Keep PROGRESS.md with what's done/remaining |
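A PROGRESS.md needs no special format — a sketch (all content illustrative):

```markdown
# PROGRESS.md
## Done
- [x] Data types + localStorage persistence
- [x] CRUD operations with unit tests
## In Progress
- [ ] Filter UI (all / active / done)
## Remaining
- [ ] Bulk actions, keyboard accessibility
## Decisions
- No external state library — useState + one custom hook
```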
# Before starting
git checkout -b feature/[name]
git status # clean working tree
# During (commit often!)
git add -A && git commit -m "feat: [what AI just implemented]"
# Every 2-3 AI turns, commit. Your safety net.
# If things go wrong
git diff # see what AI changed
git stash # save mess
git checkout . # nuclear option: discard all changes
# When done
git diff main..HEAD # review ALL changes before merging
| # | Mistake | Consequence | Prevention |
|---|---|---|---|
| 1 | No rules file | AI reinvents conventions each session | Write rules file before first prompt |
| 2 | Prompting implementation before plan | Cascading wrong assumptions | Always: Research → Plan → Implement |
| 3 | Never reading AI's code | Hidden bugs, security holes, debt | Review at least critical paths |
| 4 | One giant prompt | AI loses focus, partial implementation | One task per prompt, sequential |
| 5 | Not committing frequently | Can't rollback when AI breaks things | Commit every 2-3 turns |
| 6 | Ignoring test failures | "It works on my machine" | Tests pass = done. Not before. |
| 7 | Letting AI add dependencies freely | Bloated bundle, version conflicts | "Don't add deps without asking" in rules |
| 8 | No production checklist | Ship security holes | Phase 9 checklist before deploy |
| 9 | Marathon AI sessions | Context degrades, AI "forgets" | Fresh session every 15-20 turns |
| 10 | Vibe coding auth/payments | Critical bugs in critical paths | Manual review for all security code |
| 11 | No types/schema | AI guesses data shapes differently each time | Define types FIRST, always |
| 12 | Trusting AI's "it works" | AI confidently ships broken code | Verify yourself. Run it. Test it. |
| 13 | Same prompt after 3 failures | AI stuck in a loop | Reframe, simplify, or do it manually |
| 14 | Mixing concerns in one session | Context pollution | One feature per session |
| 15 | No architecture guidance | AI creates inconsistent patterns | Document patterns in rules file |
Track your vibe coding quality over time:
week_of: "YYYY-MM-DD"
sessions: [count]
features_shipped: [count]
bugs_introduced: [count] # found post-ship
bugs_caught_in_review: [count] # caught before ship
avg_prompts_per_feature: [count]
time_saved_estimate_hours: [number]
fresh_session_restarts: [count]
# Score yourself (1-5):
prompt_quality: [1-5] # Are you using Level 4+ prompts?
review_discipline: [1-5] # Are you reviewing critical code?
testing_rigor: [1-5] # Are you testing before shipping?
architecture: [1-5] # Is the codebase staying clean?
commit_frequency: [1-5] # Are you committing every 2-3 turns?
total_score: [5-25]
| Score | Rating | Action |
|---|---|---|
| 20-25 | Elite | You're a vibe coding conductor. Teach others. |
| 15-19 | Solid | Good habits. Focus on weakest dimension. |
| 10-14 | Learning | Review this guide weekly. Build the habits. |
| 5-9 | Risky | Slow down. More planning, more testing, more review. |
"Read [files] and explain the architecture. Don't change anything."
"Write a plan for [feature]. List files to create/modify and changes in each."
"Implement only [specific thing]. Don't touch other files."
"Write tests first for [requirements]. Then implement to pass them."
"Review this for [security/performance/readability]. Be specific."
"This error occurs when [action]. Expected [behavior]. Here's the code: [paste]"
"Refactor [file] to [goal]. Same behavior. Don't add features."
"What dependencies did you add and why? Remove anything unnecessary."
"Walk me through this code. Explain assumptions and potential issues."
"Stop. The original issue was [X]. Let's start fresh with a minimal approach."
"Run all tests. If any fail, fix them without breaking other tests."
"Check this against the production checklist: [paste P0-P3 items]."
Built by AfrexAI — the team that ships AI agents, not just AI prompts.