Repo Guardian — Dual-Model PR Review & Issue Triage
Automated repository maintenance with cross-model review consensus.
Scope & Boundaries
Repo Guardian handles PR review governance and repo maintenance automation:
reviewing PRs, enforcing quality via dual-model consensus, auto-merging when
approved, and triaging repository state.
It is not the issue-to-fix implementation pipeline. If the job is to fetch
issues, spawn coding agents, implement fixes, open PRs, and monitor review
feedback, use gh-issues instead.
It is also not a general-purpose GitHub CLI toolkit. For direct gh CLI
operations such as listing PRs, commenting, checking CI, or making ad hoc API
queries, use the github skill.
NOT For
- Implementing issue fixes end-to-end — fetching issues, spawning coding agents, writing code, and opening PRs belongs to the gh-issues skill
- General GitHub CLI operations — listing PRs, commenting, checking CI, or ad-hoc
gh queries belong to the github skill
- Code authoring or refactoring — Repo Guardian reviews and gates merges; it does not write new code
What It Does
Every 6 hours (configurable), Repo Guardian:
- Checks for open PRs on the target repo
- Reviews each PR with two independent models (Opus + GPT-5.4)
- Merges if both models approve
- Requests changes if either model finds issues
- Optionally prepares follow-up remediation for review-discovered issues
- Checks for open issues and triages them for the appropriate next step
Cron Setup
# Run the guardian script via OpenClaw cron
# Add to ~/.openclaw/cron/jobs.json:
{
"repo-guardian": {
"schedule": "0 */6 * * *",
"agent": "<your-agent-name>",
"message": "Run repo-guardian for your-org/your-repo",
"skill": "repo-guardian"
}
}
Or run manually:
bash <skill_dir>/scripts/guardian.sh your-org/your-repo
Review Process
PR Review (Dual-Model Consensus)
Open PR detected
│
├─→ Opus reviews (security, architecture, correctness)
├─→ Sonnet reviews (code quality, edge cases, tests)
│ (fallback: Haiku if Sonnet unavailable)
│
├─ Both APPROVE → auto-merge (squash)
├─ One APPROVE, one REQUEST_CHANGES → post review comments, do not merge
├─ Both REQUEST_CHANGES → post review comments, do not merge
└─ Either finds CRITICAL issue → post comments + label "needs-fix"
Issue Triage
Open issue detected
│
├─ Assess complexity and routing (ready for automation vs needs human)
├─ Ready for implementation: hand off to the issue-fix pipeline (gh-issues)
└─ Complex or unclear: add label "needs-human", post analysis comment
Review Criteria
Each model evaluates independently against:
- Correctness — Does the code do what the PR claims?
- Security — Any vulnerabilities, secret exposure, injection risks?
- Tests — Are changes tested? Do existing tests still pass?
- Scope — Does the PR stay within its stated purpose?
- Quality — Code style, error handling, edge cases, naming
Each model returns a structured verdict:
{
"verdict": "APPROVE|REQUEST_CHANGES|CRITICAL",
"summary": "One-line summary",
"findings": [
{"severity": "critical|major|minor", "file": "...", "line": 0, "issue": "...", "fix": "..."}
],
"confidence": "high|medium|low"
}
Configuration
Environment variables (set in shell or .env):
GH_TOKEN — GitHub token with repo access (required)
GUARDIAN_AGENT — OpenClaw agent name for Reviewer A (default: $OPENCLAW_AGENT or default)
GUARDIAN_REVIEWER_B_AGENT — OpenClaw agent name for Reviewer B (default: same as GUARDIAN_AGENT; set to a different agent for true cross-model review)
GUARDIAN_REPO — Default repo (e.g., your-org/your-repo)
GUARDIAN_AUTO_MERGE — Enable auto-merge on consensus (true/false, default: true)
GUARDIAN_AUTO_FIX — Enable auto-fix for issues (true/false, default: false)
GUARDIAN_MAX_PRS — Max PRs to review per run (default: 5)
GUARDIAN_MAX_ISSUES — Max issues to process per run (default: 3)
Data Flow & Privacy
Repo Guardian sends PR diffs and file listings to the configured OpenClaw agent models for review. This means:
- Repository code from open PRs is transmitted to your configured AI model providers (e.g., Anthropic, OpenAI) via the OpenClaw agent interface
- No data is sent to any third-party endpoint beyond your configured model providers
- Large diffs are truncated to 500 lines before transmission to limit exposure
- The
GH_TOKEN is used only for GitHub API calls and is never passed to AI model prompts
- All JSON payloads to the GitHub API are constructed via Python
json.dumps() to prevent injection
Recommended GH_TOKEN scopes: repo (read) for review-only mode; add repo (write) only if auto-merge is enabled. Use a fine-grained token scoped to the specific repository when possible.
Safety
- Never force-pushes or modifies protected branches
- Squash merges only — clean history
- Labels PRs with review status for audit trail
- Posts review comments with model attribution (which model said what)
- Requires dual consensus — single model cannot merge alone
- Skips PRs by org members marked with
skip-guardian label
- Dry-run mode available (
--dry-run flag)
- Credentials (GH_TOKEN) are user-configured via environment variables; Repo Guardian never stores, bundles, or transmits tokens
- Auto-merge requires explicit opt-in (GUARDIAN_AUTO_MERGE=true); disabled by default
- All review actions are logged with model attribution for full audit trail
- The skill operates only on the repository specified by the user; it does not discover or access other repos
Run Trace Logging
Each Repo Guardian run emits a structured trace for audit and performance tracking. After completing all PR reviews and issue triages, write a trace entry:
### [YYYY-MM-DD HH:MM] repo-guardian run
- **Repo:** [owner/repo]
- **PRs reviewed:** [count] (merged: N, changes requested: N, critical: N)
- **Issues triaged:** [count] (routed to gh-issues: N, labeled needs-human: N)
- **Model agreement rate:** [% of PRs where both models reached same verdict]
- **Duration:** [approx time]
- **Anomalies:** [any unexpected behavior, timeouts, model disagreements worth noting]
Write traces to memory/performance/skill-runs.md (following the standard skill execution logging protocol). The model agreement rate is a key health metric — if it drops below 70% over 5+ runs, the review criteria may need recalibration.
Models Used
| Role | Primary | Fallback |
|---|
| Reviewer A | anthropic/claude-opus-4-6 | anthropic/claude-sonnet-4-6 |
| Reviewer B | anthropic/claude-sonnet-4-6 | anthropic/claude-haiku-4-5 |
| Issue triage | anthropic/claude-sonnet-4-6 | anthropic/claude-haiku-4-5 |
Note: GPT-5.4 (openai-codex/gpt-5.4) can be used as Reviewer B if the OpenAI Codex agent is configured and available in your deployment. When using GPT, set Reviewer B primary to openai-codex/gpt-5.4 with fallback anthropic/claude-sonnet-4-6.
Requirements
Required environment variables:
GH_TOKEN — A GitHub Personal Access Token. Must be set explicitly (the script will not fall back to gh auth token to avoid inadvertent scope leakage). Use a fine-grained PAT scoped to the specific target repository with read/write permissions for pull requests and issues.
Required binaries (must be on PATH):
openclaw — OpenClaw CLI (dispatches review prompts to configured model agents)
python3 — JSON construction and data parsing
curl — GitHub API calls
Required OpenClaw configuration:
- At least one agent configured with access to Opus-tier and Sonnet/GPT-tier models for dual-model review