database-migrations

Safe, zero-downtime database migration strategies — schema evolution, rollback planning, data migration, tooling, and anti-pattern avoidance for production systems. Use when planning schema changes, writing migrations, or reviewing migration safety.

wpank@wpank

Install

openclaw skills install @wpank/database-migrations

Database Migration Patterns

Schema Evolution Strategies

Strategy	Risk	Downtime	Best For
Additive-Only	Very Low	None	APIs with backward-compatibility guarantees
Expand-Contract	Low	None	Renaming, restructuring, type changes
Parallel Change	Low	None	High-risk changes on critical tables
Lazy Migration	Medium	None	Large tables where bulk migration is too slow
Big Bang	High	Yes	Dev/staging or small datasets only

Default to Additive-Only. Escalate to Expand-Contract only when you must modify or remove existing structures.

Zero-Downtime Patterns

Every production migration must avoid long-held table locks and must not break application code that is still running against the old schema.

Operation	Pattern	Key Constraint
Add column	Nullable first	Never add `NOT NULL` without a default on large tables
Rename column	Expand-contract	Add new → dual-write → backfill → switch reads → drop old
Drop column	Deprecate first	Stop reading → stop writing → deploy → drop
Change type	Parallel column	Add new type → dual-write + cast → switch → drop old
Add index	Concurrent	`CREATE INDEX CONCURRENTLY`; cannot run inside a transaction block
Split table	Extract + FK	Create new → backfill → add FK → update queries → drop old columns
Change constraint	Two-phase	Add `NOT VALID` → `VALIDATE CONSTRAINT` separately
Add enum value	Append only	Never remove or rename existing values
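The expand-contract rename in the table above can be sketched end to end. This is a minimal illustration, not a production implementation: it uses Python's built-in `sqlite3` as a stand-in for the real database, and the `users` table with its `name` and `full_name` columns is a hypothetical example. The final "drop old" step is left out because it should only happen after a cool-down period.

```python
import sqlite3


def expand(conn):
    """Expand: add the new column next to the old one (nullable, no default)."""
    conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")


def dual_write(conn, user_id, name):
    """Dual-write: during the transition the application writes both columns."""
    conn.execute(
        "UPDATE users SET name = ?, full_name = ? WHERE id = ?",
        (name, name, user_id),
    )


def backfill(conn):
    """Backfill: copy historical values into rows the dual-write has not touched.

    Returns the number of rows that were filled in.
    """
    cur = conn.execute(
        "UPDATE users SET full_name = name "
        "WHERE full_name IS NULL AND name IS NOT NULL"
    )
    return cur.rowcount


def read_name(conn, user_id):
    """Switch reads: once backfilled, the application reads only the new column."""
    row = conn.execute(
        "SELECT full_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None
```

Each function corresponds to one deployable step, so the application and the schema can move forward independently and either can be paused between steps.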

Migration Tools

Tool	Ecosystem	Style	Key Strength
Prisma Migrate	TypeScript/Node	Declarative (schema diff)	ORM integration, shadow DB
Knex	JavaScript/Node	Imperative (up/down)	Lightweight, flexible
Drizzle Kit	TypeScript/Node	Declarative (schema diff)	Type-safe, SQL-like
Alembic	Python	Imperative (upgrade/downgrade)	Granular control, autogenerate
Django Migrations	Python/Django	Declarative (model diff)	Auto-detection
Flyway	JVM / CLI	SQL file versioning	Simple, wide DB support
golang-migrate	Go / CLI	SQL (up/down files)	Minimal, embeddable
Atlas	Go / CLI	Declarative (HCL/SQL diff)	Schema-as-code, linting, CI

Match the tool to your ORM and deployment pipeline. Prefer declarative for simple schemas, imperative for fine-grained data manipulation.

Rollback Strategies

Approach	When to Use
Reversible (up + down)	Schema-only changes, early-stage products
Forward-only (corrective migration)	Data-destructive changes, production at scale
Hybrid	Reversible for schema, forward-only for data
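One way to picture the hybrid approach is a migration record whose `down` step is optional. The sketch below is an assumption-laden illustration rather than any tool's real API: `Migration`, `apply_migration`, `rollback`, and `IrreversibleMigration` are hypothetical names, and `sqlite3` stands in for the production database.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Migration:
    version: str
    up: str
    down: Optional[str] = None  # None marks the migration as forward-only


class IrreversibleMigration(Exception):
    """Raised when a rollback is requested for a forward-only migration."""


def apply_migration(conn, migration):
    """Run the forward step and commit it."""
    conn.execute(migration.up)
    conn.commit()


def rollback(conn, migration):
    """Run the down step, or refuse if the migration has none."""
    if migration.down is None:
        raise IrreversibleMigration(
            f"{migration.version} is forward-only; write a corrective migration"
        )
    conn.execute(migration.down)
    conn.commit()
```

A schema-only change such as adding an index would carry both `up` and `down`, while a data rewrite would leave `down` unset so that a rollback attempt fails loudly instead of silently losing data.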

Data Preservation

Soft-delete columns — rename with _deprecated suffix instead of dropping
Snapshot tables — CREATE TABLE _backup_<table>_<date> AS SELECT * FROM <table>
Point-in-time recovery — ensure WAL archiving covers migration windows
Logical backups — pg_dump of affected tables before migration
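The snapshot-table option can be wrapped in a small helper. This is a sketch under stated assumptions: `snapshot_table` is a hypothetical function, the date is formatted as `YYYYMMDD`, and `sqlite3` stands in for the production database. Note that `CREATE TABLE ... AS SELECT` copies rows only, not indexes or constraints.

```python
import re
import sqlite3
from datetime import date

# Table names are interpolated into SQL, so accept plain identifiers only.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def snapshot_table(conn, table, on=None):
    """Copy `table` into _backup_<table>_<YYYYMMDD> and return the snapshot name."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")
    stamp = (on or date.today()).strftime("%Y%m%d")
    snapshot = f"_backup_{table}_{stamp}"
    conn.execute(f"CREATE TABLE {snapshot} AS SELECT * FROM {table}")

    # Sanity check: the snapshot must hold as many rows as the source.
    source_rows = conn.execute(f"SELECT count(*) FROM {table}").fetchone()[0]
    copied_rows = conn.execute(f"SELECT count(*) FROM {snapshot}").fetchone()[0]
    if source_rows != copied_rows:
        raise RuntimeError(f"snapshot has {copied_rows} rows, expected {source_rows}")
    return snapshot
```

Calling `snapshot_table(conn, "orders")` immediately before a migration leaves a same-day copy that a corrective migration can read from.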

Blue-Green Database

text

1. Replicate primary → secondary (green)
2. Apply migration to green
3. Run validation suite against green
4. Switch traffic to green
5. Keep blue as rollback target (N hours)
6. Decommission blue after confidence window

Data Migration Patterns

Backfill Strategies

Strategy	Best For
Inline backfill	Small tables (< 100K rows)
Batched backfill	Medium tables (100K–10M rows)
Background job	Large tables (10M+ rows)
Lazy backfill	When immediate consistency is not required
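Lazy backfill is the least obvious of these strategies, so a short sketch may help. It assumes the same illustrative `my_table`, `old_col`, and `new_col` names this document uses in its SQL example; the body of `compute_value` and the `read_new_col` helper are hypothetical, and `sqlite3` stands in for the production database.

```python
import sqlite3


def compute_value(old):
    """Hypothetical transformation; stands in for whatever the migration derives."""
    return old.strip().lower()


def read_new_col(conn, row_id):
    """Return new_col, migrating the row on first access if it is still empty."""
    row = conn.execute(
        "SELECT old_col, new_col FROM my_table WHERE id = ?", (row_id,)
    ).fetchone()
    if row is None:
        return None
    old, new = row
    if new is None and old is not None:
        new = compute_value(old)
        # The extra "new_col IS NULL" guard avoids overwriting a concurrent write.
        conn.execute(
            "UPDATE my_table SET new_col = ? WHERE id = ? AND new_col IS NULL",
            (new, row_id),
        )
        conn.commit()
    return new
```

Rows that are never read stay unmigrated under this approach, so it is usually paired with a later batched sweep before the old column is dropped.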

Batch Processing

sql

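-- Requires PostgreSQL 11+ and must run outside an explicit transaction block,
-- because the loop commits after every batch.
DO $$
DECLARE
  batch_size INT := 1000;
  rows_updated INT;
BEGIN
  LOOP
    UPDATE my_table
    SET new_col = compute_value(old_col)
    WHERE id IN (
      SELECT id FROM my_table
      WHERE new_col IS NULL  -- compute_value must never return NULL, or the row is reselected forever
      LIMIT batch_size
      FOR UPDATE SKIP LOCKED
    );
    GET DIAGNOSTICS rows_updated = ROW_COUNT;
    COMMIT;  -- release row locks before pausing
    EXIT WHEN rows_updated = 0;
    PERFORM pg_sleep(0.1);  -- throttle between batches to reduce lock pressure
  END LOOP;
END $$;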
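The same loop can also be driven from application code, which makes it easier to add logging or a kill switch. The following is a sketch under assumptions, not a drop-in script: `backfill_in_batches` is a hypothetical helper, `upper(old_col)` stands in for `compute_value`, and `sqlite3` replaces PostgreSQL, so there is no `SKIP LOCKED`.

```python
import sqlite3
import time


def backfill_in_batches(conn, batch_size=1000, pause_seconds=0.1):
    """Fill new_col in small committed batches; return the total rows updated."""
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE my_table
            SET new_col = upper(old_col)
            WHERE id IN (
                SELECT id FROM my_table
                WHERE new_col IS NULL AND old_col IS NOT NULL
                ORDER BY id
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        updated = cur.rowcount
        conn.commit()  # end the transaction so locks are released before pausing
        if updated == 0:
            return total
        total += updated
        time.sleep(pause_seconds)  # throttle between batches
```

The `old_col IS NOT NULL` filter matters: a row whose computed value is NULL would otherwise match `new_col IS NULL` on every pass and the loop would never finish.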

Dual-Write Period

For expand-contract and parallel change:

Dual-write — application writes to both old and new columns/tables
Backfill — fill new structure with historical data
Verify — assert consistency (row counts, checksums)
Cut over — switch reads to new, stop writing to old
Cleanup — drop old structure after cool-down period
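The verify step can be as simple as comparing a row count and a checksum for the old and new columns. The sketch below assumes an integer `id` primary key and trusted table and column names; `column_checksum` and `columns_match` are hypothetical helpers, and `sqlite3` stands in for the production database. On large tables the hashing would more likely be pushed into the database than streamed to the client.

```python
import hashlib
import sqlite3


def column_checksum(conn, table, column):
    """Return (row_count, sha256 hex digest) over (id, value) pairs in id order."""
    digest = hashlib.sha256()
    count = 0
    # Identifiers are interpolated directly, so they must come from trusted code.
    for row_id, value in conn.execute(
        f"SELECT id, {column} FROM {table} ORDER BY id"
    ):
        digest.update(repr((row_id, value)).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def columns_match(conn, table, old_column, new_column):
    """True when both columns hold identical values for every row."""
    return column_checksum(conn, table, old_column) == column_checksum(
        conn, table, new_column
    )
```

Running `columns_match` before the cut-over gives a concrete gate: reads switch to the new structure only when it returns `True`.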

Testing Migrations

Test Against Production-Like Data

Never test only against empty or synthetic data
Use anonymized production snapshots
Match data volume: a migration that runs instantly on 1K rows may hold locks long enough to block traffic on 10M
Reproduce edge cases: NULLs, empty strings, max-length, unicode
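The edge cases listed above can be kept as a reusable fixture and run through any transformation a data migration applies. This is a minimal sketch: `EDGE_CASE_VALUES` and `find_failures` are hypothetical names, and the 255-character string assumes a `VARCHAR(255)`-style limit that should be adjusted to the real column.

```python
# Values that commonly break data migrations: NULL, empty and blank strings,
# a maximum-length string (255 is an assumed limit), unicode, and a quote.
EDGE_CASE_VALUES = [None, "", " ", "x" * 255, "Zoë 北京 🚀", "O'Brien"]


def find_failures(transform, values=EDGE_CASE_VALUES):
    """Run `transform` on every value; return (value, error name) for each crash."""
    failures = []
    for value in values:
        try:
            transform(value)
        except Exception as exc:
            failures.append((value, type(exc).__name__))
    return failures
```

For example, `find_failures(lambda v: v.strip())` reports the `None` case, which is the kind of gap that synthetic data tends to hide.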

Migration CI Pipeline

yaml

- name: Test migrations
  steps:
    - run: docker compose up -d db
    - run: npm run migrate:up        # apply all
    - run: npm run migrate:down      # rollback all
    - run: npm run migrate:up        # re-apply (confirms down left a clean state)
    - run: npm run test:integration  # validate app
    - run: npm run migrate:status    # no pending

Every migration PR must pass: up → down → up → tests.
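The up → down → up requirement can also be asserted directly by comparing schema snapshots. The sketch below is one possible check, assuming SQLite (via Python's `sqlite3`) as the test database; `schema_snapshot` and `check_round_trip` are hypothetical helpers, and other databases would read their own catalog tables instead of `sqlite_master`.

```python
import sqlite3


def schema_snapshot(conn):
    """Return the sorted (type, name, sql) entries that define the schema."""
    return sorted(
        conn.execute(
            "SELECT type, name, sql FROM sqlite_master "
            "WHERE name NOT LIKE 'sqlite_%'"
        ).fetchall()
    )


def check_round_trip(conn, up_sql, down_sql):
    """Apply up, down, up; True if down restored the schema and up is repeatable."""
    before = schema_snapshot(conn)
    conn.executescript(up_sql)
    after_first_up = schema_snapshot(conn)

    conn.executescript(down_sql)
    if schema_snapshot(conn) != before:
        return False  # down left something behind

    conn.executescript(up_sql)
    return schema_snapshot(conn) == after_first_up
```

A `down` script that forgets to drop a table fails the first comparison, which catches the most common rollback bug before it reaches production.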

Migration Checklist

Pre-Migration

Tested against production-like data volume
Rollback written and tested
Backup of affected tables created
App code compatible with both old and new schema
Execution time benchmarked on staging
Lock impact analyzed
Replication lag monitoring in place

During Migration

Monitor lock waits and active queries
Monitor replication lag
Watch for error rate spikes
Keep rollback command ready

Post-Migration

Schema matches expected state
Integration tests pass against migrated DB
Data integrity validated (row counts, checksums)
ORM schema / type definitions updated
Deprecated structures cleaned up after cool-down
Migration documented in team runbook

NEVER Do

NEVER run untested migrations directly in production
NEVER drop a column without first removing all application references and deploying
NEVER add a NOT NULL column to a large table in a single statement without a default value
NEVER mix schema DDL and data mutations in the same migration file
NEVER skip the dual-write phase when renaming columns in a live system
NEVER assume migrations are instantaneous — always benchmark on production-scale data
NEVER disable foreign key checks to "speed up" migrations in production
NEVER deploy application code that depends on a schema change before the migration has completed
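Some of these rules can be enforced mechanically in review tooling. As one illustration, the sketch below flags migration files that mix schema DDL with data mutations. It is deliberately naive: `mixes_ddl_and_dml` is a hypothetical helper that splits on semicolons and inspects the leading keyword, so string literals containing semicolons, leading comments, and dollar-quoted bodies are not handled.

```python
import re

# Leading keywords treated as schema changes and as data mutations.
DDL = re.compile(r"^\s*(CREATE|ALTER|DROP)\b", re.IGNORECASE)
DML = re.compile(r"^\s*(INSERT|UPDATE|DELETE)\b", re.IGNORECASE)


def mixes_ddl_and_dml(migration_sql):
    """Return True if the migration contains both DDL and DML statements."""
    statements = [s for s in migration_sql.split(";") if s.strip()]
    has_ddl = any(DDL.match(s) for s in statements)
    has_dml = any(DML.match(s) for s in statements)
    return has_ddl and has_dml
```

A check like this is a prompt for human review rather than a verdict, since a real SQL parser would be needed to classify statements reliably.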