Install
openclaw skills install phy-pr-size-splitterPull request size analyzer and split planner. Measures diff stats for any open or local PR using the gh CLI, classifies it as atomic / reviewable / oversized / unmergeable against configurable line thresholds, detects independent logical chunks (schema migrations, business logic, test additions, config changes, refactors), and proposes a named split plan with branch names and dependency order. Generates a ready-to-post PR comment with the split checklist. Works on open GitHub PRs, local git branches, or raw diff files. Zero external API — uses only the local gh CLI and git diff. Triggers on "PR too large", "split this PR", "pr size", "review this diff", "break up this PR", "atomic commits", "/pr-size".
openclaw skills install phy-pr-size-splitterA PR with 800 lines changed catches half as many bugs as four 200-line PRs — and takes four times as long to review.
This skill measures your PR, identifies independent logical units in the diff, and gives you a concrete split plan with branch names, PR titles, and dependency order. Copy the output directly into a PR comment.
Works on open GitHub PRs, local branches, or raw diff files. Zero external API.
# Option 1: Analyze an open GitHub PR
/pr-size 1234
/pr-size https://github.com/org/repo/pull/1234
# Option 2: Analyze current branch vs main
/pr-size
# Option 3: Analyze a specific branch
/pr-size feature/my-big-feature
# Option 4: Analyze against a specific base
/pr-size --base main --head feature/my-big-feature
# Option 5: Set custom size thresholds
/pr-size --atomic 150 --reviewable 400 --oversized 800
# Option 6: Output split plan as GitHub PR comment
/pr-size 1234 --post-comment
# Option 7: Just get the size classification (no split plan)
/pr-size --classify-only
# For a GitHub PR number
PR_NUMBER=1234
# Get diff stats via gh CLI
gh pr diff $PR_NUMBER --stat 2>/dev/null || \
gh pr view $PR_NUMBER --json files --jq '.files[] | "\(.additions) \(.deletions) \(.path)"'
# Get summary numbers
gh pr view $PR_NUMBER --json additions,deletions,changedFiles \
--jq '"Files: \(.changedFiles) | +\(.additions) -\(.deletions) | Total: \(.additions + .deletions)"'
# For a local branch (vs main)
git diff main...HEAD --stat
git diff main...HEAD --shortstat
For raw diff from stdin or file:
# Count lines from a saved diff
git diff main...HEAD | python3 -c "
import sys
added = deleted = files = 0
for line in sys.stdin:
if line.startswith('+++') or line.startswith('---'):
if line[4:] != '/dev/null\n': files += 1 if line.startswith('+++') else 0
elif line.startswith('+'): added += 1
elif line.startswith('-'): deleted += 1
print(f'Files: {files//2 or 1} +{added} -{deleted} Total: {added+deleted}')
"
def classify_pr_size(files: int, additions: int, deletions: int,
thresholds=None) -> dict:
"""Classify PR size and predict review quality impact."""
thresholds = thresholds or {
'atomic': {'lines': 150, 'files': 10},
'reviewable':{'lines': 400, 'files': 25},
'oversized': {'lines': 800, 'files': 50},
# above oversized = unmergeable
}
total = additions + deletions
if total <= thresholds['atomic']['lines'] and files <= thresholds['atomic']['files']:
tier = 'ATOMIC'
emoji = '✅'
review_time = '< 15 min'
bug_catch_rate = '~90%'
recommendation = 'Good to go — this PR is easy to review.'
elif total <= thresholds['reviewable']['lines'] and files <= thresholds['reviewable']['files']:
tier = 'REVIEWABLE'
emoji = '🟡'
review_time = '15–45 min'
bug_catch_rate = '~70%'
recommendation = 'Acceptable size. Consider splitting if changes are logically independent.'
elif total <= thresholds['oversized']['lines'] and files <= thresholds['oversized']['files']:
tier = 'OVERSIZED'
emoji = '🟠'
review_time = '1–2 hours'
bug_catch_rate = '~50%'
recommendation = 'Split recommended. Reviewers will miss issues at this size.'
else:
tier = 'UNMERGEABLE'
emoji = '🔴'
review_time = '4+ hours (often skipped)'
bug_catch_rate = '~30%'
recommendation = 'Split required. PRs this large rarely get meaningful review.'
return {
'tier': tier,
'emoji': emoji,
'total_lines': total,
'files': files,
'review_time': review_time,
'bug_catch_rate': bug_catch_rate,
'recommendation': recommendation,
}
import subprocess
import re
from collections import defaultdict
# File patterns → logical chunk types
CHUNK_PATTERNS = [
('SCHEMA_MIGRATION', [
r'migrations?/', r'db/migrate/', r'alembic/', r'\.sql$',
r'schema\.prisma$', r'flyway/',
]),
('TESTS', [
r'\.test\.[jt]sx?$', r'\.spec\.[jt]sx?$', r'_test\.py$',
r'test_.*\.py$', r'__tests__/', r'spec/', r'/tests?/',
]),
('DEPENDENCIES', [
r'package\.json$', r'package-lock\.json$', r'yarn\.lock$',
r'pnpm-lock\.yaml$', r'requirements.*\.txt$', r'Pipfile',
r'Cargo\.toml$', r'Cargo\.lock$', r'go\.mod$', r'go\.sum$',
]),
('CONFIG', [
r'\.(yaml|yml|toml|ini|env|conf)$', r'docker', r'Dockerfile',
r'\.github/', r'\.circleci/', r'\.gitlab-ci',
r'webpack\.', r'vite\.', r'tsconfig\.', r'eslint',
]),
('DOCS', [
r'\.md$', r'\.rst$', r'docs?/', r'README', r'CHANGELOG',
r'CONTRIBUTING', r'\.txt$',
]),
('REFACTOR', [
# Heuristic: many renames, moves, or files with high churn but same logic
]),
('FEATURE', []), # Catch-all for business logic
]
def classify_files_to_chunks(pr_number_or_branch: str) -> dict:
"""Group changed files into logical chunks."""
# Get changed files
if pr_number_or_branch.isdigit():
result = subprocess.run(
['gh', 'pr', 'diff', pr_number_or_branch, '--name-only'],
capture_output=True, text=True
)
else:
result = subprocess.run(
['git', 'diff', f'main...{pr_number_or_branch}', '--name-only'],
capture_output=True, text=True
)
files = [f.strip() for f in result.stdout.strip().split('\n') if f.strip()]
chunks = defaultdict(list)
for fpath in files:
assigned = False
for chunk_type, patterns in CHUNK_PATTERNS[:-1]: # all except FEATURE
if any(re.search(p, fpath, re.IGNORECASE) for p in patterns):
chunks[chunk_type].append(fpath)
assigned = True
break
if not assigned:
chunks['FEATURE'].append(fpath)
# Further split FEATURE files by top-level directory (module)
feature_by_module = defaultdict(list)
for fpath in chunks.get('FEATURE', []):
parts = fpath.split('/')
module = parts[0] if len(parts) > 1 else 'root'
feature_by_module[module].append(fpath)
# If multiple modules, split FEATURE by module
if len(feature_by_module) > 1:
del chunks['FEATURE']
for module, module_files in feature_by_module.items():
chunks[f'FEATURE:{module}'] = module_files
return dict(chunks)
def suggest_split_plan(chunks: dict, pr_title: str = '') -> list[dict]:
"""Generate a concrete PR split plan from chunks."""
plan = []
# Priority order for split sequencing
DEPENDENCY_ORDER = [
'SCHEMA_MIGRATION', 'DEPENDENCIES', 'CONFIG',
# FEATURE modules sorted alphabetically
'REFACTOR', 'DOCS', 'TESTS',
]
# Sort chunks by dependency order
def chunk_sort_key(chunk_name):
base = chunk_name.split(':')[0]
try:
return DEPENDENCY_ORDER.index(base)
except ValueError:
return 5 # FEATURE modules go in the middle
sorted_chunks = sorted(chunks.items(), key=lambda x: chunk_sort_key(x[0]))
for i, (chunk_type, files) in enumerate(sorted_chunks):
base_type = chunk_type.split(':')[0]
module = chunk_type.split(':')[1] if ':' in chunk_type else None
# Generate suggested branch name and PR title
if base_type == 'SCHEMA_MIGRATION':
suggested_branch = f'feat/db-migration'
suggested_title = f'[1/{len(sorted_chunks)}] Add database migration'
note = 'Merge first — code changes should be backward-compatible with old schema'
elif base_type == 'DEPENDENCIES':
suggested_branch = f'chore/deps-update'
suggested_title = f'[1/{len(sorted_chunks)}] Update dependencies'
note = 'Can be reviewed and merged independently'
elif base_type == 'CONFIG':
suggested_branch = f'chore/config-updates'
suggested_title = f'[{i+1}/{len(sorted_chunks)}] Update configuration'
note = 'Low risk, easy to review'
elif base_type == 'TESTS':
suggested_branch = f'test/add-tests'
suggested_title = f'[{i+1}/{len(sorted_chunks)}] Add tests'
note = 'Merge after the feature PRs it tests'
elif base_type == 'DOCS':
suggested_branch = f'docs/update-docs'
suggested_title = f'[{i+1}/{len(sorted_chunks)}] Update documentation'
note = 'Can be merged any time'
else:
slug = (module or 'feature').lower().replace('/', '-').replace('_', '-')
suggested_branch = f'feat/{slug}'
suggested_title = f'[{i+1}/{len(sorted_chunks)}] {module or "Feature"}: {pr_title[:40]}'
note = 'Core feature changes'
plan.append({
'order': i + 1,
'chunk_type': chunk_type,
'suggested_branch': suggested_branch,
'suggested_title': suggested_title,
'files': files,
'file_count': len(files),
'note': note,
})
return plan
## PR Size Analysis
PR #1234: "Add user preferences system and refactor auth middleware"
Branch: `feature/user-preferences` → `main`
---
### Size Classification: 🔴 UNMERGEABLE
| Metric | This PR | Atomic | Reviewable | Oversized |
|--------|---------|--------|------------|-----------|
| Lines changed | **1,247** | ≤150 | ≤400 | ≤800 |
| Files changed | **67** | ≤10 | ≤25 | ≤50 |
| Est. review time | **4+ hours** | <15 min | 15–45 min | 1–2 hrs |
| Est. bug catch rate | **~30%** | ~90% | ~70% | ~50% |
**This PR will not receive meaningful review.** Split into atomic units.
---
### Detected Logical Chunks (5 independent units)
SCHEMA_MIGRATION (2 files) db/migrate/20260319_add_preferences_table.rb db/migrate/20260319_add_user_settings_column.rb → Branch: feat/db-migration | Merge FIRST
FEATURE:auth (11 files) src/auth/middleware.ts src/auth/session.ts src/auth/token-validator.ts ... 8 more → Branch: feat/auth-refactor | Independent
FEATURE:preferences (24 files) src/preferences/model.ts src/preferences/api.ts src/preferences/components/PreferencesPage.tsx ... 21 more → Branch: feat/user-preferences | Depends on migration
CONFIG (5 files) .env.example docker-compose.yml src/config/app.ts tsconfig.json .eslintrc.js → Branch: chore/config-updates | Independent, merge any time
TESTS (25 files) src/auth/tests/middleware.test.ts src/preferences/tests/api.test.ts ... 23 more → Branch: test/add-tests | Merge LAST
---
### Suggested Split Plan
PR 1/5 — feat/db-migration (2 files, ~45 lines) Title: "[1/5] Add preferences table migration" Note: Merge first. New columns are nullable — safe for old app code. Depends on: nothing
PR 2/5 — feat/auth-refactor (11 files, ~280 lines) Title: "[2/5] Refactor auth middleware" Note: Pure refactor — no behavior change. Easy to review in isolation. Depends on: nothing
PR 3/5 — feat/user-preferences (24 files, ~580 lines) Title: "[3/5] Add user preferences system" Note: Core feature. Still large — consider splitting by layer (API / components / model). Depends on: PR 1/5 (migration)
PR 4/5 — chore/config-updates (5 files, ~90 lines) Title: "[4/5] Update config for preferences feature" Note: Trivial changes. Can be merged any time. Depends on: nothing
PR 5/5 — test/add-tests (25 files, ~252 lines) Title: "[5/5] Add tests for preferences and auth" Note: Merge after PRs 2 and 3 are in. Depends on: PR 2/5, PR 3/5
---
### How to Execute the Split
```bash
# Step 1: Create the migration branch
git checkout main
git checkout -b feat/db-migration
git checkout feature/user-preferences -- db/migrate/20260319_add_preferences_table.rb
git checkout feature/user-preferences -- db/migrate/20260319_add_user_settings_column.rb
git commit -m "Add preferences table migration"
gh pr create --title "[1/5] Add preferences table migration" --base main
# Step 2: Auth refactor branch
git checkout main
git checkout -b feat/auth-refactor
git checkout feature/user-preferences -- src/auth/
git commit -m "Refactor auth middleware"
gh pr create --title "[2/5] Refactor auth middleware" --base main
# Repeat for each chunk...
⚠️ **This PR is 1,247 lines across 67 files — too large for meaningful review.**
I've identified 5 independent logical units. Suggested split:
| # | Title | Size | Depends On |
|---|-------|------|------------|
| 1 | feat/db-migration — Add migration | ~45 lines | — |
| 2 | feat/auth-refactor — Refactor auth | ~280 lines | — |
| 3 | feat/user-preferences — Core feature | ~580 lines | #1 |
| 4 | chore/config-updates — Config | ~90 lines | — |
| 5 | test/add-tests — Tests | ~252 lines | #2, #3 |
Run `/pr-size 1234` for full split instructions with git commands.
PR #1234 Size: 🔴 UNMERGEABLE (1,247 lines, 67 files)
5 logical chunks detected:
1. SCHEMA_MIGRATION (2 files) — merge first
2. FEATURE:auth (11 files) — independent
3. FEATURE:preferences (24 files) — depends on migration
4. CONFIG (5 files) — independent
5. TESTS (25 files) — merge last
Run /pr-size 1234 --split for git commands to execute the split.
Canlah AI — Run performance marketing without breaking your brand.