# Security Scan

Four-grep secret-pattern check for commit bodies on PUBLIC GitHub repos. Run it before EVERY commit on a PUBLIC repo. A single match blocks the commit until the content is sanitized.

## When to use

- Before every commit on a PUBLIC repo (`gh repo view --json isPrivate -q '.isPrivate'` = `false`)
- After paste/dump of infrastructure documentation that may contain credentials
- After `git commit --amend` or an `edit` step in `git rebase --interactive` on a commit that may have included a secret
- After staging files that were generated by a script or tool you did not author (the tool may have included default credentials)

## Core rule: 4-grep pre-commit scan (HARD STOP)

**For PUBLIC repos, run all 4 greps on `git diff --cached`. All four must return zero matches before `git commit`. Even one match → block commit + sanitize + re-scan.**

### Don't / Do

| # | Don't | Do |
|---|-------|-----|
| 1 | Paste a credential (PAT, Vault token, API key) plaintext into infrastructure / operations docs and commit | `gh repo view --json isPrivate` first → if PUBLIC, store credentials in a secret store (Vault `secret/<name>`, AWS Secrets Manager, etc.) → docs reference the secret-store path only |
| 2 | "Internal reference only, plaintext is fine" reasoning | A PUBLIC repo commit, even unpushed, lives in local git history + Syncthing / Time Machine / backup mirrors. Sanitize at write time |
| 3 | Delegate push-time secret scanning to GitHub's server-side check | Server-side is a backstop. Some secret types (Vault tokens, internal API keys) are NOT covered. **Pre-commit self-check is the first line of defense** |
| 4 | Rewrite history only after GitHub secret scanning blocks the push | If your own pre-commit scan finds even one plaintext credential, act immediately: sanitize the staged content, and amend or rebase-edit any local commit that already contains it. Don't wait for the server to catch it |

## The 4 greps

Pipe `git diff --cached` through each grep. All 4 must print nothing (grep exits with status 1 when no line matches):

```bash
# 1. GitHub tokens (PAT / fine-grained / OAuth / server / refresh)
git diff --cached | grep -E '(ghp_|gho_|ghu_|ghs_|ghr_|github_pat_)[A-Za-z0-9_]{20,}'

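# 2. Vault token / Anthropic / OpenAI / Stripe / Slack
#    (sk-ant- and sk-proj- keys contain - and _; Stripe keys use underscores: sk_live_, pk_test_, rk_live_)
git diff --cached | grep -E '(hvs\.[A-Za-z0-9]+|sk-[A-Za-z0-9]{20,}|sk-(ant|proj)-[A-Za-z0-9_-]{20,}|pk-[A-Za-z0-9]{20,}|[spr]k_(live|test)_[A-Za-z0-9]{16,}|xox[baprs]-)'

# 3. API key / token / secret label + plaintext value
#    (double-quoted pattern with literal quote characters: grep -E does not expand \x27 / \x22 inside a bracket expression)
git diff --cached | grep -iE "(api[_-]?key|access[_-]?token|_TOKEN|_SECRET|password|passphrase)[[:space:]]*[:=][[:space:]]*[\"']?[A-Za-z0-9+/=_-]{16,}"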

# 4. Padded Base64 run of 32+ chars (encryption key candidate) on a line that also mentions key / secret / token
git diff --cached | grep -E '[A-Za-z0-9+/]{32,}=' | grep -iE '(key|secret|token)'
```
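The four checks can be bundled into one function so that a single exit status gates the commit. The following is a minimal sketch, not an existing tool: `scan_staged_diff` is a hypothetical name, the patterns are adapted from the greps above (pattern 3 is written with literal quote characters), and the labels printed to stderr are illustrative.

```bash
#!/usr/bin/env bash
# Sketch only: scan_staged_diff is a hypothetical helper name.
# Reads a diff on stdin; returns 0 when clean, 1 when any pattern matches.
scan_staged_diff() {
  local diff failed=0
  diff="$(cat)"   # read stdin once so every check sees the same input

  # 1. GitHub tokens
  if grep -qE '(ghp_|gho_|ghu_|ghs_|ghr_|github_pat_)[A-Za-z0-9_]{20,}' <<<"$diff"; then
    echo "match: GitHub token" >&2; failed=1
  fi

  # 2. Vault / Anthropic / OpenAI / Stripe / Slack prefixes
  if grep -qE '(hvs\.[A-Za-z0-9]+|sk-[A-Za-z0-9]{20,}|sk-(ant|proj)-[A-Za-z0-9_-]{20,}|pk-[A-Za-z0-9]{20,}|[spr]k_(live|test)_[A-Za-z0-9]{16,}|xox[baprs]-)' <<<"$diff"; then
    echo "match: vendor token prefix" >&2; failed=1
  fi

  # 3. Label followed by a plaintext value, optionally quoted
  if grep -qiE "(api[_-]?key|access[_-]?token|_TOKEN|_SECRET|password|passphrase)[[:space:]]*[:=][[:space:]]*[\"']?[A-Za-z0-9+/=_-]{16,}" <<<"$diff"; then
    echo "match: labelled secret value" >&2; failed=1
  fi

  # 4. Padded Base64 run on a line that also mentions key / secret / token
  if grep -E '[A-Za-z0-9+/]{32,}=' <<<"$diff" | grep -qiE '(key|secret|token)'; then
    echo "match: Base64 key candidate" >&2; failed=1
  fi

  return "$failed"
}
```

Usage, under the same assumptions: `git diff --cached | scan_staged_diff && git commit`. Reporting every matching category instead of stopping at the first one lets a single sanitize pass address all of them before the re-scan.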

## On match — sanitize procedure

1. Replace the offending line with a secret-store path reference (e.g., `` Vault `secret/<name>` key `token` ``)
2. For secrets that cannot live in a managed store (Vault Unseal Key, Vault Root Token), reference an external password manager by name only
3. Re-stage → re-run all 4 greps → all zero → commit

## Post-commit discovery (already committed)

1. **Rotate the exposed secret immediately** — local-only commits still leak via Syncthing / Time Machine / backup mirrors
2. `git rebase -i <offending-commit>^` + mark the offending commit `edit` → sanitize file → `git add <file>` → `git commit --amend --no-edit -S` → `git rebase --continue`
3. Push state:
   - Not yet pushed → push normally after sanitizing (the rewritten branch still fast-forwards the remote)
   - Already pushed → force-push gate applies (see `github-flow/push-guards.md` "Force push CI status check" guidance)
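
Deciding between the two push states can be sketched as a check on remote-tracking branches. This is an illustration under assumptions: `commit_is_pushed` is a hypothetical helper, and it only sees what the local remote-tracking refs know, so a `git fetch` beforehand is assumed.

```bash
#!/usr/bin/env bash
# Sketch only: commit_is_pushed is a hypothetical helper name.
# Succeeds when at least one remote-tracking branch contains the given commit.
commit_is_pushed() {
  local sha="$1"
  # `git branch -r --contains` lists remote-tracking branches that include the commit;
  # empty output means no known remote branch has it.
  [ -n "$(git branch -r --contains "$sha" 2>/dev/null)" ]
}
```

For example, `commit_is_pushed <offending-commit> && echo "force-push gate applies"`. A failing result means only that no fetched remote branch contains the commit. It says nothing about backup mirrors, so rotation in step 1 still applies.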

## Exceptions

- **PRIVATE repos**: this rule does NOT apply (separate company-info / private-secrets policies still apply)
- **Explicit `[secret-allowed]` marker** in the commit message — intentional dummy / example / test fixture
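
A check for the marker could look like the sketch below. `has_secret_allowed_marker` is a hypothetical name, and how the check is wired into the commit flow is left open here.

```bash
#!/usr/bin/env bash
# Sketch only: has_secret_allowed_marker is a hypothetical helper name.
# Reads a commit message on stdin; succeeds when it contains the literal marker.
has_secret_allowed_marker() {
  # -F treats the pattern as a fixed string, so the square brackets are not
  # interpreted as a bracket expression.
  grep -qF '[secret-allowed]'
}
```

For an existing commit, `git log -1 --format=%B | has_secret_allowed_marker` reads the message body of `HEAD`.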

## Self-check (PUBLIC repo, every commit)

1. `gh repo view --json isPrivate -q '.isPrivate'` — confirm `false` (or recall from earlier in session)
2. Run all 4 greps on `git diff --cached`
3. Any single match → block commit + sanitize + return to step 2
4. All four return zero → proceed to `git commit`
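
Step 1 can be sketched as a small gate that decides whether the scan is required. The helper names `repo_is_public` and `needs_secret_scan` are hypothetical. Treating a failed `gh` call as "scan anyway" is a fail-closed choice made for this sketch, not something the rule above prescribes.

```bash
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical.

# Returns 0 when the current repo is PUBLIC, 1 when PRIVATE, 2 when gh fails.
repo_is_public() {
  local is_private
  is_private="$(gh repo view --json isPrivate -q '.isPrivate')" || return 2
  [ "$is_private" = "false" ]
}

# Succeeds when the 4-grep scan should run before committing.
needs_secret_scan() {
  local rc=0
  repo_is_public || rc=$?
  case "$rc" in
    1) return 1 ;;   # confirmed PRIVATE: this rule does not apply
    *) return 0 ;;   # PUBLIC, or visibility unknown: scan anyway (assumption: fail closed)
  esac
}
```

A caller would run `needs_secret_scan` first and proceed to steps 2 to 4 only when it succeeds.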

## Failure pattern

See the user-local `~/.claude/skills/cleanup/data/failed-attempts.md` HOT entry "PUBLIC repo commit body 4-secret-type exposure" (this file is external to the repo — not checked into version control). An infrastructure markdown file slipped 4 different credential types into a commit (GitHub PAT + Vault Unseal Key + Vault Root Token + Plane API Token). GitHub's server-side scanner caught only 1 of the 4 on push. The pre-commit 4-grep self-check would have caught all 4.

## Escalation

Repeated violations → install the PreToolUse:Bash hook `block-secret-in-commit.sh` (registered via `~/.claude/settings.json`). The hook runs the same 4 greps against `git diff --cached` whenever a `git commit` invocation is detected and rejects the commit if any grep matches.
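
The contents of `block-secret-in-commit.sh` are not reproduced here. As an illustration of the detection step only, the sketch below shows one way a hook could recognize a `git commit` invocation, assuming the hook has already extracted the command string. `is_git_commit_command` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch only: is_git_commit_command is a hypothetical helper name.
# Succeeds when the command string invokes `git commit`.
is_git_commit_command() {
  local cmd="$1"
  # `git` at the start of the string or after a separator (; & | or whitespace),
  # followed by `commit` as a whole word, so `git commit-graph` does not match.
  local re='(^|[;&|[:space:]])git[[:space:]]+commit([[:space:]]|$)'
  [[ "$cmd" =~ $re ]]
}
```

This deliberately simple pattern misses forms such as `git -C <dir> commit` and git aliases. A real hook would need to decide whether to cover those.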

## Related topics

- `staging-discipline`: runs BEFORE this scan (the staged set must be intentional before secret scanning)
- `message-discipline`: enforces English commit messages on PUBLIC repos; runs AFTER this scan passes
