database-migrations
Safe, zero-downtime database migration strategies — schema evolution, rollback planning, data migration, tooling, and anti-pattern avoidance for production systems. Use when planning schema changes, writing migrations, or reviewing migration safety.
Like a lobster shell, security has layers — review code before you run it.
License
SKILL.md
Database Migration Patterns
Schema Evolution Strategies
| Strategy | Risk | Downtime | Best For |
|---|---|---|---|
| Additive-Only | Very Low | None | APIs with backward-compatibility guarantees |
| Expand-Contract | Low | None | Renaming, restructuring, type changes |
| Parallel Change | Low | None | High-risk changes on critical tables |
| Lazy Migration | Medium | None | Large tables where bulk migration is too slow |
| Big Bang | High | Yes | Dev/staging or small datasets only |
Default to Additive-Only. Escalate to Expand-Contract only when you must modify or remove existing structures.
Zero-Downtime Patterns
Every production migration must avoid locking tables or breaking running application code.
| Operation | Pattern | Key Constraint |
|---|---|---|
| Add column | Nullable first | Never add NOT NULL without default on large tables |
| Rename column | Expand-contract | Add new → dual-write → backfill → switch reads → drop old |
| Drop column | Deprecate first | Stop reading → stop writing → deploy → drop |
| Change type | Parallel column | Add new type → dual-write + cast → switch → drop old |
| Add index | Concurrent | CREATE INDEX CONCURRENTLY — don't wrap in transaction |
| Split table | Extract + FK | Create new → backfill → add FK → update queries → drop old columns |
| Change constraint | Two-phase | Add NOT VALID → VALIDATE CONSTRAINT separately |
| Add enum value | Append only | Never remove or rename existing values |
Migration Tools
| Tool | Ecosystem | Style | Key Strength |
|---|---|---|---|
| Prisma Migrate | TypeScript/Node | Declarative (schema diff) | ORM integration, shadow DB |
| Knex | JavaScript/Node | Imperative (up/down) | Lightweight, flexible |
| Drizzle Kit | TypeScript/Node | Declarative (schema diff) | Type-safe, SQL-like |
| Alembic | Python | Imperative (upgrade/downgrade) | Granular control, autogenerate |
| Django Migrations | Python/Django | Declarative (model diff) | Auto-detection |
| Flyway | JVM / CLI | SQL file versioning | Simple, wide DB support |
| golang-migrate | Go / CLI | SQL (up/down files) | Minimal, embeddable |
| Atlas | Go / CLI | Declarative (HCL/SQL diff) | Schema-as-code, linting, CI |
Match the tool to your ORM and deployment pipeline. Prefer declarative for simple schemas, imperative for fine-grained data manipulation.
Rollback Strategies
| Approach | When to Use |
|---|---|
| Reversible (up + down) | Schema-only changes, early-stage products |
| Forward-only (corrective migration) | Data-destructive changes, production at scale |
| Hybrid | Reversible for schema, forward-only for data |
Data Preservation
- Soft-delete columns — rename with
_deprecatedsuffix instead of dropping - Snapshot tables —
CREATE TABLE _backup_<table>_<date> AS SELECT * FROM <table> - Point-in-time recovery — ensure WAL archiving covers migration windows
- Logical backups —
pg_dumpof affected tables before migration
Blue-Green Database
1. Replicate primary → secondary (green)
2. Apply migration to green
3. Run validation suite against green
4. Switch traffic to green
5. Keep blue as rollback target (N hours)
6. Decommission blue after confidence window
Data Migration Patterns
Backfill Strategies
| Strategy | Best For |
|---|---|
| Inline backfill | Small tables (< 100K rows) |
| Batched backfill | Medium tables (100K–10M rows) |
| Background job | Large tables (10M+ rows) |
| Lazy backfill | When immediate consistency not required |
Batch Processing
DO $$
DECLARE
batch_size INT := 1000;
rows_updated INT;
BEGIN
LOOP
UPDATE my_table
SET new_col = compute_value(old_col)
WHERE id IN (
SELECT id FROM my_table
WHERE new_col IS NULL
LIMIT batch_size
FOR UPDATE SKIP LOCKED
);
GET DIAGNOSTICS rows_updated = ROW_COUNT;
EXIT WHEN rows_updated = 0;
PERFORM pg_sleep(0.1); -- throttle to reduce lock pressure
COMMIT;
END LOOP;
END $$;
Dual-Write Period
For expand-contract and parallel change:
- Dual-write — application writes to both old and new columns/tables
- Backfill — fill new structure with historical data
- Verify — assert consistency (row counts, checksums)
- Cut over — switch reads to new, stop writing to old
- Cleanup — drop old structure after cool-down period
Testing Migrations
Test Against Production-Like Data
- Never test against empty or synthetic data only
- Use anonymized production snapshots
- Match data volume — a migration working on 1K rows may lock on 10M
- Reproduce edge cases: NULLs, empty strings, max-length, unicode
Migration CI Pipeline
- name: Test migrations
steps:
- run: docker compose up -d db
- run: npm run migrate:up # apply all
- run: npm run migrate:down # rollback all
- run: npm run migrate:up # re-apply (idempotency)
- run: npm run test:integration # validate app
- run: npm run migrate:status # no pending
Every migration PR must pass: up → down → up → tests.
Migration Checklist
Pre-Migration
- Tested against production-like data volume
- Rollback written and tested
- Backup of affected tables created
- App code compatible with both old and new schema
- Execution time benchmarked on staging
- Lock impact analyzed
- Replication lag monitoring in place
During Migration
- Monitor lock waits and active queries
- Monitor replication lag
- Watch for error rate spikes
- Keep rollback command ready
Post-Migration
- Schema matches expected state
- Integration tests pass against migrated DB
- Data integrity validated (row counts, checksums)
- ORM schema / type definitions updated
- Deprecated structures cleaned up after cool-down
- Migration documented in team runbook
NEVER Do
- NEVER run untested migrations directly in production
- NEVER drop a column without first removing all application references and deploying
- NEVER add
NOT NULLto a large table without a default value in a single statement - NEVER mix schema DDL and data mutations in the same migration file
- NEVER skip the dual-write phase when renaming columns in a live system
- NEVER assume migrations are instantaneous — always benchmark on production-scale data
- NEVER disable foreign key checks to "speed up" migrations in production
- NEVER deploy application code that depends on a schema change before the migration has completed
Files
2 totalComments
Loading comments…
