## Install

```shell
openclaw skills install codex-review
```

Three-tier code quality defense: L1 quick scan, L2 deep audit (via bug-audit), L3 cross-validation with adversarial testing.

codex-review is a unified orchestration layer: it picks audit depth based on trigger phrases. bug-audit is invoked as an independent skill — never modified.
## Configuration

The API key is read from the `CODEX_REVIEW_API_KEY` env var. Never hardcoded, never logged, never stored.

- `CODEX_REVIEW_API_BASE` (default: `https://api.openai.com/v1`)
- `CODEX_REVIEW_API_KEY`
- `CODEX_REVIEW_MODEL` (default: `gpt-4o`)

## Triggers

| User says | Level | What it does | Est. time |
|---|---|---|---|
| "review" / "quick scan" / "review下" / "检查下" | L1 | External model scan + agent deep pass | 5-10 min |
| "audit" / "deep audit" / "审计下" / "排查下" | L2 | Full bug-audit flow (or built-in fallback) | 30-60 min |
| "pre-deploy check" / "上线前检查" | L1→L2 | L1 scan → record hotspots → L2 audit → hotspot gap check | 40-70 min |
| "cross-validate" / "highest level" / "交叉验证" | L3 | Dual independent audits + compare + adversarial test | 60-90 min |
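The phrase-to-level routing above can be sketched as a small dispatcher. This is a minimal illustration, not the skill's actual implementation; the function name and exact pattern set are assumptions, with the more specific triggers checked first:

```shell
# Hypothetical router: maps a user trigger phrase to an audit level.
# Patterns mirror the trigger table; most specific cases come first so
# "pre-deploy check" is not swallowed by the generic L1 patterns.
pick_level() {
  case "$1" in
    *cross-validate*|*"highest level"*|*交叉验证*) echo "L3" ;;
    *pre-deploy*|*上线前检查*)                     echo "L1->L2" ;;
    *audit*|*审计*|*排查*)                         echo "L2" ;;
    *review*|*scan*|*检查*)                        echo "L1" ;;
    *)                                             echo "L1" ;;  # default to the cheapest pass
  esac
}
```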
Code can arrive via local read, `git clone <url>`, server `scp`, a user-pasted snippet, or a PR diff.

Round 1 sends the code to the external model:

```shell
# Body is passed via heredoc (-d @-) so the ${VAR:-default} expansions work;
# inside single quotes they would be sent to the API literally.
curl -s "${CODEX_REVIEW_API_BASE:-https://api.openai.com/v1}/chat/completions" \
  -H "Authorization: Bearer ${CODEX_REVIEW_API_KEY}" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "${CODEX_REVIEW_MODEL:-gpt-4o}",
  "messages": [
    {"role": "system", "content": "<REVIEW_SYSTEM_PROMPT>"},
    {"role": "user", "content": "<code content>"}
  ],
  "temperature": 0.2,
  "max_tokens": 6000
}
EOF
```
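When the code under review contains quotes or newlines, a hand-written request body breaks. A safer sketch builds the body with `jq`, which JSON-escapes arbitrary content; `SYSTEM_PROMPT` and `CODE` here are illustrative placeholders, assuming `jq` is available:

```shell
# Build the request body with jq so arbitrary code content is escaped correctly.
# SYSTEM_PROMPT and CODE stand in for the real prompt and file contents.
SYSTEM_PROMPT="You are an expert code reviewer."
CODE='if (x == "admin") { /* ... */ }'
BODY=$(jq -n --arg sys "$SYSTEM_PROMPT" --arg code "$CODE" '{
  model: (env.CODEX_REVIEW_MODEL // "gpt-4o"),
  messages: [
    {role: "system", content: $sys},
    {role: "user",   content: $code}
  ],
  temperature: 0.2,
  max_tokens: 6000
}')
printf '%s\n' "$BODY"
```

The resulting `$BODY` can then be sent with `curl ... -d "$BODY"`.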
Fallback: if the API call fails or times out (120s), skip Round 1 and complete with an agent-only audit.
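The fallback rule can be sketched as a wrapper, assuming coreutils `timeout`; the function name and the fallback placeholder are illustrative, not part of the skill:

```shell
# Run the external scan with a 120s ceiling; on failure or timeout, fall back
# to the agent-only path. "$@" stands in for the real curl invocation.
round1_or_fallback() {
  if out=$(timeout 120 "$@" 2>/dev/null); then
    printf '%s\n' "$out"
  else
    echo "FALLBACK: agent-only audit"   # placeholder for the built-in fallback
  fi
}
```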
You are an expert code reviewer. Find ALL bugs and security issues:
1. CRITICAL — Security vulnerabilities (XSS, injection, auth bypass), crash bugs
2. HIGH — Logic errors, race conditions, unhandled exceptions
3. MEDIUM — Missing validation, edge cases, performance issues
4. LOW — Code style, dead code, minor improvements
For each: Severity, File+line, Issue, Fix with code snippet.
Focus on real bugs, not style opinions. Output language: match the user's language.
- Node.js/Express:
- Python/Django/Flask:
- Frontend (React/Vue/vanilla):
- Other stacks: adapt checklist to detected technology.
After L1, write the issue summary to `${TMPDIR:-/tmp}/codex-review-hotspots.json`:

```json
{
  "project": "my-project",
  "timestamp": "2026-03-05T22:00:00",
  "hotspots": [
    {"file": "routes/admin.js", "severity": "CRITICAL", "brief": "Admin auth bypass via localhost"},
    {"file": "routes/game.js", "severity": "CRITICAL", "brief": "Score submission no server validation"}
  ]
}
```
This file is only used internally for L1→L2 handoff. bug-audit is unaware of it.
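For the hotspot gap check in the L1→L2 flow, the handoff file can be read back with `jq`. A minimal sketch, assuming `jq` is installed; `list_hotspots` is a hypothetical helper, not part of the skill:

```shell
# Print one tab-separated line per hotspot so the L2 audit can confirm
# each recorded L1 finding was covered.
list_hotspots() {
  jq -r '.hotspots[] | "\(.severity)\t\(.file)\t\(.brief)"' "$1"
}
# Usage: list_hotspots "${TMPDIR:-/tmp}/codex-review-hotspots.json"
```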
Step 1: External model independent audit
→ Full code to external API with detailed system prompt
→ Output: Report A
Step 2: Agent independent audit (bug-audit or fallback)
→ Full bug-audit flow (or built-in fallback)
→ Output: Report B
Step 3: Cross-compare
→ Both found → 🔴 Confirmed high-risk (high confidence)
→ Only external → 🟡 Agent verifies (possible false positive)
→ Only agent → 🟡 External verifies (possible deep logic bug)
→ Contradictory → ⚠️ Deep analysis, provide judgment
Step 4: Adversarial testing
→ Ask external model to bypass discovered fixes
→ Validate fix robustness
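Step 3's comparison can be sketched with `comm` over normalized issue keys (one key per line, e.g. `file:line:issue`, from each report). The function name and key format are assumptions for illustration:

```shell
# Classify issues from two independent reports. Inputs are plain-text files
# of issue keys; comm requires sorted input, so sort into temp files first.
cross_compare() {
  a=$(mktemp); b=$(mktemp)
  sort -u "$1" >"$a"; sort -u "$2" >"$b"
  echo "== confirmed (found by both) ==";       comm -12 "$a" "$b"
  echo "== external only (agent verifies) ==";  comm -23 "$a" "$b"
  echo "== agent only (external verifies) =="; comm -13 "$a" "$b"
  rm -f "$a" "$b"
}
```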
The adversarial prompt used in Step 4:

You are a security researcher. The following security fixes were applied to a project.
For each fix, analyze:
1. Can the fix be bypassed? How?
2. Does the fix introduce new vulnerabilities?
3. Are there edge cases the fix doesn't cover?
Be adversarial and thorough. Output language: match the user's language.
# 🔍 Code Audit Report — [Project Name]
## Audit Level: L1 / L2 / L3
## 📊 Overview
- Files scanned: X
- Issues found: X (🔴 Critical X | 🟠 High X | 🟡 Medium X | 🔵 Low X)
- [L3 only] Cross-validation: Both agreed X | External only X | Agent only X | Conflict X
## 🔴 Critical Issues
### 1. [Issue Title]
- **File**: `path/to/file.js:42-55`
- **Found by**: External model / Agent / Both
- **Description**: ...
- **Fix**:
(code snippet)
## ✅ Highlights
- [What's done well]
Users can customize behavior by saying: