Install
openclaw skills install hollis-code-audit
Use this skill only when the user explicitly asks for a code audit, security audit, risk-focused PR/diff review, repo or module audit, regression-risk review, project-intent drift check, or validation of another reviewer/agent's findings. Do not use for routine implementation, ordinary debugging, refactoring, test writing, docs proofreading, architecture brainstorming, frontend design review, or general "review" requests unless the user asks for audit, risk, security, regression, or evidence-backed findings. The skill reads README/AGENTS/docs first, inspects code/diffs/tests, and uses a user-specified or strongest available non-development reviewer model/subagent when policy allows; if none is available, it discloses that same-model audits can share blind spots.
Run a read-only, evidence-backed audit unless the user explicitly asks for fixes.
This skill must work for Codex, Claude-style agents, local CLI agents, subagents, and agents with limited tools.
<skill-dir> is the directory containing this SKILL.md. In this repo it is .codex/skills/code-audit.
This skill may be used to audit code produced by another agent or alongside another agent's audit skill. Keep the review artifact-focused and avoid tool/agent rivalry.
code-audit-packet.md, so prior agent artifacts are not overwritten.
Honor an explicit user mode. If none is provided, use standard.
- quick: current diff/status only; report high-risk findings fast.
- standard: intent docs + requested scope + adjacent contracts/tests.
- security: prioritize auth, permissions, paths, secrets, LLM/network, logs.
- deep: broader repo risk map, independent review, and test strategy.
- intent: focus on whether the change violates README/AGENTS/product purpose.
Optional config may live at <skill-dir>/config.json or a user-specified path. Supported keys are advisory, not required:
{
  "default_mode": "standard",
  "preferred_reviewers": ["gemini", "kimi", "ollama"],
  "forbidden_external_review": false,
  "skip_paths": ["data/", "dist/", "*.db", "*.pdf"],
  "always_read": ["AGENTS.md", "README.md"],
  "report_language": "match_user"
}
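Because every config key is advisory, a consumer has to tolerate a missing file and missing keys. The following is a minimal sketch of one way to do that, not the skill's actual loader: `load_config` and `resolve_mode` are hypothetical names, and the empty defaults for `preferred_reviewers` and `skip_paths` are assumptions. The mode names and the `standard` fallback come from the text above.

```python
import json
from pathlib import Path

# Assumed defaults; the key names mirror the example config, the empty lists are a guess.
DEFAULTS = {
    "default_mode": "standard",
    "preferred_reviewers": [],
    "forbidden_external_review": False,
    "skip_paths": [],
    "always_read": ["AGENTS.md", "README.md"],
    "report_language": "match_user",
}
MODES = ("quick", "standard", "security", "deep", "intent")


def load_config(path):
    """Return DEFAULTS overlaid with any known keys found in the file at `path`."""
    config = dict(DEFAULTS)
    file = Path(path)
    if file.is_file():
        loaded = json.loads(file.read_text(encoding="utf-8"))
        # Unknown keys are dropped rather than rejected, since the config is advisory.
        config.update({k: v for k, v in loaded.items() if k in DEFAULTS})
    return config


def resolve_mode(user_mode, config):
    """An explicit user mode wins, then the config default, then "standard"."""
    for candidate in (user_mode, config.get("default_mode")):
        if candidate in MODES:
            return candidate
    return "standard"
```

For example, `resolve_mode("security", load_config("config.json"))` returns `"security"` whether or not the file exists, because the explicit user mode is checked first.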
Honor user routing first. If the user specifies a reviewer model, tool, or subagent, use that route when it is available and allowed by repo policy/confidentiality. If it is unavailable or unsafe, say why and choose the closest permitted fallback.
Snapshot cheaply. From the repo root, run the snapshot helper when Python is available:
python <skill-dir>/scripts/audit_snapshot.py --root . --json
Use python3 instead of python when that is the local convention.
Use the snapshot to identify intent docs, changed files, dependency/test files, and high-risk paths without loading the whole repo.
Build a packet when useful. For external reviewers or subagents, generate a compact packet rather than manually pasting broad context:
python <skill-dir>/scripts/build_audit_packet.py --root . --mode standard --scope diff
Add --include-diff only after checking that the diff is safe to share.
When auditing another agent's work, add --producer-agent "<name>" and optionally --prior-review <path>.
Read intent sources first. Load only the relevant AGENTS.md, README.md, selected docs/, security/architecture notes, and test/dependency config. Derive 3-5 audit principles before inspecting implementation details.
Inspect by risk. Review the requested diff/files/modules first, then adjacent contracts, tests, and high-risk call paths.
Independent review. Use a non-development model/subagent when available and permitted. Validate its findings locally before reporting them.
Report findings first. Prioritize concrete bugs, security risks, regressions, missing tests, and contract breaks. Avoid style-only findings unless requested.
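An agent that drives the helpers programmatically might assemble the two command lines from the steps above as follows. This is a sketch only: the function names are hypothetical, and it uses just the script paths and flags quoted in this document. It builds argument lists and leaves running them to the caller.

```python
import shutil
import sys
from pathlib import Path


def python_executable():
    """Prefer whichever of python/python3 is on PATH, else the running interpreter."""
    for name in ("python", "python3"):
        found = shutil.which(name)
        if found:
            return found
    return sys.executable


def snapshot_command(skill_dir, root="."):
    """Build the argument list for the snapshot helper."""
    script = Path(skill_dir) / "scripts" / "audit_snapshot.py"
    return [python_executable(), str(script), "--root", root, "--json"]


def packet_command(skill_dir, root=".", mode="standard", scope="diff",
                   include_diff=False, producer_agent=None, prior_review=None):
    """Build the argument list for the packet helper."""
    script = Path(skill_dir) / "scripts" / "build_audit_packet.py"
    argv = [python_executable(), str(script),
            "--root", root, "--mode", mode, "--scope", scope]
    if include_diff:
        # Set this only after checking that the diff is safe to share.
        argv.append("--include-diff")
    if producer_agent:
        argv += ["--producer-agent", producer_agent]
    if prior_review:
        argv += ["--prior-review", str(prior_review)]
    return argv
```

Keeping `include_diff` off by default mirrors the instruction to add `--include-diff` only after a safety check.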
If the repo has .auditignore, helper scripts use it to skip generated, confidential, or noisy paths. When the snapshot is noisy, suggest adding .auditignore entries instead of loading more context.
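The `.auditignore` syntax is not specified here, so the sketch below assumes a simple format: one pattern per line, `#` for comments, a trailing `/` for directories, and shell-style globs otherwise (the same shapes as the `skip_paths` examples). `parse_ignore` and `is_ignored` are hypothetical names. The bundled helper scripts may behave differently.

```python
from fnmatch import fnmatch


def parse_ignore(text):
    """Keep non-empty, non-comment lines as patterns (assumed file format)."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path, patterns):
    """Return True if a repo-relative, slash-separated path matches any pattern."""
    name = path.rsplit("/", 1)[-1]
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match that directory name at any depth.
            if ("/" + pattern) in ("/" + path):
                return True
        elif fnmatch(path, pattern) or fnmatch(name, pattern):
            return True
    return False
```

With the patterns `data/` and `*.db`, this skips `data/raw.csv` and `var/cache.db` but keeps `src/app.py`.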
Use the user's specified reviewer model/subagent if provided. User preference beats the default "best available" selection, unless it violates project rules or data-safety constraints.
If the user did not specify a reviewer, inventory available routes without exposing secrets:
python <skill-dir>/scripts/detect_review_models.py --current-model "<development-model-if-known>"
For custom providers such as Hermes, DeepSeek, OpenRouter, or local routing layers, copy review_routes.example.json to <skill-dir>/review_routes.json or pass --config <path>.
Selection rules:
When using an external model/subagent, load references/independent-reviewer-prompt.md and adapt it.
If independent review is unavailable or blocked, include this exact status in the report:
Independent model review: not performed - <reason>. Because same-model audits can share blind spots, I recommend a second pass by a strong non-current-model reviewer or a human reviewer before relying on this audit for high-stakes decisions.
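The routing rules described in this section can be sketched as a small selection function. It is an illustration under stated assumptions, not the logic of detect_review_models.py: the function names, the reason strings, and the wording of the "performed" line are hypothetical, and a single `external_allowed` flag stands in for repo policy and data-safety checks. The "not performed" sentence is the exact status text quoted above.

```python
STATUS_TEMPLATE = (
    "Independent model review: not performed - {reason}. "
    "Because same-model audits can share blind spots, I recommend a second pass "
    "by a strong non-current-model reviewer or a human reviewer before relying "
    "on this audit for high-stakes decisions."
)


def choose_reviewer(available, user_choice=None, preferred=(),
                    current_model=None, external_allowed=True):
    """Return (reviewer, reason); reviewer is None when no route is permitted."""
    if not external_allowed:
        return None, "external review is not permitted for this repo"
    # A user-specified route wins whenever it is available.
    if user_choice and user_choice in available:
        return user_choice, "user-specified reviewer"
    # Fallbacks must differ from the model that did the development work.
    independent = [name for name in available if name != current_model]
    for name in list(preferred) + independent:
        if name in independent:
            return name, "closest permitted fallback" if user_choice else "best available"
    return None, "no independent reviewer route is available"


def independent_review_status(reviewer, reason):
    """Format the Independent Review entry of the report."""
    if reviewer is None:
        return STATUS_TEMPLATE.format(reason=reason)
    # Hypothetical wording; the source only fixes the "not performed" sentence.
    return f"Independent model review: performed by {reviewer} ({reason})."
```

When only the development model is available, `choose_reviewer` returns `None` and `independent_review_status` produces the required disclosure with the reason filled in.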
Create repo-specific principles from project docs before findings. Keep them short and operational:
If docs are missing or contradictory, say so and infer cautiously from code/tests.
For detailed prompts, load references/audit-checklist.md only when needed. At minimum, consider:
**Scope**
Reviewed <scope>. Intent sources: <files>. Verification: <commands or "not run">.
**Audit Principles**
- <principle tied to project purpose>
- <principle tied to trust boundary>
- <principle tied to user request>
**Independent Review**
<model/subagent used, not-used reason, or limitation disclosure>
**Findings**
- [P0/P1/P2/P3] <title> - <file:line>
Impact: <realistic failure mode>
Evidence: <code path, repro, test result, or reasoning>
Recommendation: <specific fix or mitigation>
Test gap: <missing or recommended test>
**Residual Risk**
<areas not covered, tests not run, uncertainty, or recommended second pass>
Severity:
- P0: exploitable critical security issue, data loss/corruption, or outage likely.
- P1: serious correctness/security/permission flaw with plausible real impact.
- P2: meaningful bug, edge-case regression, missing validation, or test gap.
- P3: maintainability, clarity, minor UX, or low-risk hardening.
If no issues are found, say that clearly and still disclose test gaps and residual risk.
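To keep the Findings section consistent, findings can be held as structured records, sorted by severity, and rendered in the template's field order. The sketch below is one possible shape: the `Finding` class, `render_findings`, and the "No issues found." wording are assumptions, while the field labels and severity levels follow the template and list above.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


@dataclass
class Finding:
    severity: str        # one of P0, P1, P2, P3
    title: str
    location: str        # "<file:line>"
    impact: str
    evidence: str
    recommendation: str
    test_gap: str


def render_findings(findings):
    """Render findings most severe first, using the report template's labels."""
    if not findings:
        # Residual risk and test gaps are still reported in their own section.
        return "**Findings**\nNo issues found."
    # sorted() is stable, so findings of equal severity keep their input order.
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    lines = ["**Findings**"]
    for f in ordered:
        lines += [
            f"- [{f.severity}] {f.title} - {f.location}",
            f"Impact: {f.impact}",
            f"Evidence: {f.evidence}",
            f"Recommendation: {f.recommendation}",
            f"Test gap: {f.test_gap}",
        ]
    return "\n".join(lines)
```

An unknown severity label raises `KeyError`, which surfaces mislabelled findings early instead of silently misordering them.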
Before answering:
When changing this skill, run the lightweight script tests and eval schema check if Python/pytest are available:
python <skill-dir>/scripts/run_tests.py
python <skill-dir>/scripts/run_evals.py