Install
openclaw skills install @merlinrabens/clawcheckPerforms a two-phase audit combining a fast deterministic scan and a deep LLM quality review of security, cron jobs, config, and skills.
openclaw skills install @merlinrabens/clawcheckTwo-phase audit: a fast deterministic scan catches structural issues, then you (the agent) do a deep quality evaluation on the flagged areas.
openclaw doctor says "ok"| This skill | openclaw doctor (built-in) |
|---|---|
| Secrets exposure + token hygiene | Config JSON schema validation |
| Cron ops health + prompt quality review | Plugin/skill eligibility |
| Config optimization + value assessment | Channel connectivity |
| Skill structural + content quality audit | State migrations, browser detection |
Run the script to get a structural baseline:
python3 {baseDir}/scripts/audit.py
Individual modules:
python3 {baseDir}/scripts/audit.py --security
python3 {baseDir}/scripts/audit.py --cron
python3 {baseDir}/scripts/audit.py --config
python3 {baseDir}/scripts/audit.py --skills
This produces JSON with scores, findings, and the bottom/top skill lists. Use this as your triage map for Phase 2.
After running the script, perform these evaluations. Budget your depth based on what the user asked for ("quick check" = Phase 1 only, "full audit" or "quality review" = both phases).
Read ~/.openclaw/openclaw.json and evaluate:
agents.defaults.heartbeat.prompt. Is it specific enough to catch real issues? Does it avoid heavy operations? A good heartbeat prompt is < 200 words, checks 2-3 things, and has clear escalation criteria.reserveTokens and keepRecentTokens reasonable for the context window size? Rule of thumb: reserve should be 15-20% of contextTokens.pruneAfter, maxEntries, rotateBytes set to values that match the usage pattern? Heavy cron usage needs more aggressive pruning.*/ in their schedule expression.Score each aspect 1-5. Report specific improvements.
Read ~/.openclaw/cron/jobs.json. Select the 5 most important enabled jobs using this heuristic:
frequency x timeoutSeconds (most resource-consuming)For each selected job evaluate:
payload.timeoutSeconds set and reasonable for scope?Score each job 1-5 on: purpose, prompt quality, safety, efficiency. Flag jobs scoring below 3.
Cross-reference: Check if any cron prompts reference skills that scored below 70 in Phase 1. A cron job is only as reliable as the skills it depends on.
From the Phase 1 results, pick:
bottom_5)For each selected skill, read its full SKILL.md and evaluate:
Scoring formula depends on skill type:
(accuracy*2 + completeness*1.5 + clarity + efficiency + voice) / 6.5(accuracy*2 + completeness*1.5 + clarity + efficiency) / 5.5For skills scoring below 4.0, write specific improvement recommendations with concrete examples.
Phase 1 now scans workspace files for common secret patterns (sk-, ghp_, AIzaSy, Bearer tokens, hex private keys, etc.). In Phase 2, go deeper:
scripts/ contain hardcoded credentials or API URLs with embedded tokens.env files exist inside skill directories{
"score": 82,
"score_type": "structural_hygiene",
"status": "healthy",
"sections": {
"security": {"score": 65, "finding_count": 3},
"cron": {"score": 95, "finding_count": 1},
"config": {"score": 88, "finding_count": 2},
"skills": {"score": 80, "finding_count": 1}
},
"findings": [...]
}
Present as a readable report to the user:
## ClawCheck Report
### Structural Baseline (Phase 1)
Overall: 82/100 (healthy)
Security: 65 | Cron: 95 | Config: 88 | Skills: 80
### Deep Quality Findings (Phase 2)
**Config:**
- Heartbeat prompt: 4/5 (clear but could add Telegram alert on critical)
- Model choices: 5/5 (opus primary, sonnet fallback, sonnet subagent)
- Compaction: 4/5 (reserveTokens=150k for 800k context = 19%, good)
**Cron (top concerns):**
- "Morning Brief" (3/5): prompt is 400 words but lacks output format spec
- "Bleeding Edge Scanner" (2/5): no safety guardrails, no error handling
**Skills (bottom 3):**
- marketing-automation: BROKEN (no SKILL.md)
- apple-notes (62/100 structural): [content evaluation]
- blucli (62/100 structural): [content evaluation]
### Recommended Actions (priority order)
1. [most impactful fix]
2. [next fix]
3. [next fix]
Security 30%, cron 25%, config 20%, skills 25%.
Skill structure formula: (structure*2 + completeness*1.5 + clarity + efficiency) / 5.5 * 20
For detailed fix patterns with real config examples, see {baseDir}/references/remediation.md.
Quick fixes for common findings:
"GAMMA_API_KEY": {"source": "exec", "provider": "op-gamma", "id": "value"}
"botToken": {"source": "exec", "provider": "op-telegram", "id": "value"}
"heartbeat": {"every": "1h", "model": "sonnet", "prompt": "HEARTBEAT: Quick check..."}
"schedule": {"kind": "cron", "expr": "0 9 * * *", "tz": "Europe/Madrid"}
openclaw.json is missing or invalid: script exits with error JSON.openclaw doctor schema validation or channel connectivity checks