Install
openclaw skills install skill-quality-check
Quality audit for AI Agent Skills. Use before installing or after writing any SKILL.md. Scores 5 dimensions with actionable improvements. Works for skills written for Claude, Cursor, Codex, and any AI agent. Keywords: audit, skill, quality, review, score, assess, best practices, vet.
Universal quality assessment framework for AI Agent Skills. Evaluates any SKILL.md file across 5 dimensions, outputting a quantified score and actionable improvement suggestions. Designed to work with skills built for Claude, Cursor, Codex, OpenClaw, or any AI agent.
Find the SKILL.md file:
# Path priority (in order):
1. User-specified path
2. <skills-dir>/<skill-name>/SKILL.md
# Common locations by platform:
# OpenClaw: ~/.openclaw/skills/<skill-name>/SKILL.md
# QClaw: ~/.qclaw/skills/<skill-name>/SKILL.md
# Claude Code: ~/.claude/skills/<skill-name>/SKILL.md
# Cursor: ~/.cursor/skills/<skill-name>/SKILL.md
# Codex: ~/.codex/skills/<skill-name>/SKILL.md
3. <repo>/skills/<skill-name>/SKILL.md
4. <repo>/<skill-name>/SKILL.md
# If installing from GitHub without a local copy, fetch via curl:
curl -s "https://raw.githubusercontent.com/<owner>/<repo>/main/skills/<skill>/SKILL.md"
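The priority order above can be sketched as a small lookup. This is only an illustration: the function name `find_skill_md` and its parameters are hypothetical, and the platform-specific directories are expected to be passed in as `skills_dir` rather than hard-coded.

```python
from pathlib import Path
from typing import Optional


def find_skill_md(
    skill_name: str,
    user_path: Optional[str] = None,
    skills_dir: Optional[str] = None,
    repo: Optional[str] = None,
) -> Optional[Path]:
    """Return the first existing SKILL.md, checked in the priority order listed above."""
    candidates = []
    if user_path:
        candidates.append(Path(user_path))  # 1. user-specified path
    if skills_dir:
        # 2. <skills-dir>/<skill-name>/SKILL.md (for example ~/.openclaw/skills)
        candidates.append(Path(skills_dir).expanduser() / skill_name / "SKILL.md")
    if repo:
        candidates.append(Path(repo) / "skills" / skill_name / "SKILL.md")  # 3.
        candidates.append(Path(repo) / skill_name / "SKILL.md")  # 4.
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # nothing local: fall back to fetching from GitHub
```

A call such as `find_skill_md("tdd-skill", skills_dir="~/.claude/skills")` returns `None` when no local copy exists, which is the cue to use the curl fallback.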
Then scan the directory for supporting files:
skill-name/
├── SKILL.md ✅ required
├── scripts/ ✅ optional (lazy-loaded)
├── references/ ✅ optional (lazy-loaded)
└── assets/ ✅ optional (lazy-loaded)
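As a rough sketch of the directory scan, the helper below reports which of the expected entries are present. The name `scan_skill_dir` and the shape of the returned dictionary are assumptions made for illustration.

```python
from pathlib import Path

# Optional, lazy-loaded directories from the layout above.
OPTIONAL_DIRS = ("scripts", "references", "assets")


def scan_skill_dir(skill_dir: str) -> dict:
    """Report which expected entries exist in a skill directory."""
    root = Path(skill_dir)
    report = {"SKILL.md": (root / "SKILL.md").is_file()}  # the only required file
    for name in OPTIONAL_DIRS:
        report[name + "/"] = (root / name).is_dir()
    return report
```

A `False` for `SKILL.md` means there is nothing to audit; a `False` for an optional directory is not a defect by itself.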
SKILL.md must have YAML frontmatter with only these fields:
---
name: <skill-name> ✅ required
description: > ✅ required
# Fields below are NOT recommended in frontmatter:
# ❌ version → package metadata
# ❌ author → non-standard
# ❌ license → non-essential
# ❌ compatibility → most Skills don't need it
# ❌ tags → non-standard
---
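The field rules above lend themselves to a mechanical check. The sketch below reads only top-level keys with a regular expression instead of a full YAML parser, so it is an approximation; `frontmatter_keys` and `check_frontmatter` are hypothetical names.

```python
import re

REQUIRED_FIELDS = {"name", "description"}


def frontmatter_keys(skill_md_text: str) -> list:
    """Return top-level keys found between the leading '---' delimiters."""
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Only unindented "key:" lines count; folded description text is indented.
        match = re.match(r"^([A-Za-z_][\w-]*):", line)
        if match:
            keys.append(match.group(1))
    return []  # no closing delimiter: treat the frontmatter as missing


def check_frontmatter(skill_md_text: str) -> dict:
    """List required fields that are missing and extra fields that are present."""
    keys = set(frontmatter_keys(skill_md_text))
    return {
        "missing": sorted(REQUIRED_FIELDS - keys),
        "not_recommended": sorted(keys - REQUIRED_FIELDS),
    }
```

Anything listed under `not_recommended` (for example `version` or `tags`) is a candidate for removal from the frontmatter.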
Review checklist:
- name and description exist?
- description under 150 characters (trigger-level content must be concise)?
- description include trigger keywords ("when to use")?
Description is Level 1 content: the AI uses it to decide whether to trigger the Skill. It is a trigger, not a manual.
✅ Good Description:
TDD test-driven development workflow. Use when writing new features,
adding tests, or debugging. Keywords: test-driven, TDD, red-green-refactor.
❌ Bad Description:
This is a comprehensive guide to Test-Driven Development using the
red-green-refactor cycle. First, write a failing test that describes
the behavior you want. Then write the minimum code to make it pass...
(Too long — contains Level 2 content that belongs in SKILL.md body)
Scoring rubric (each dimension 0-10):
| # | Dimension | Question |
|---|---|---|
| 1 | Trigger Accuracy | Does it clearly state when to use this Skill? |
| 2 | Conciseness | Under 150 chars? No explanatory filler? |
| 3 | Keyword Coverage | Does it include trigger keywords (e.g. TDD, debug, pdf)? |
| 4 | Non-Redundancy | Does it avoid restating what AI already knows? |
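Parts of this rubric can be pre-checked mechanically before a human or AI assigns the 0-10 scores. The sketch below is an assumption-heavy heuristic: the function name and the literal markers "use when" and "keywords:" are illustrative choices, and it does not produce rubric scores itself.

```python
MAX_DESCRIPTION_CHARS = 150  # conciseness limit used throughout this document


def description_signals(description: str) -> dict:
    """Collect simple, checkable signals about a description string."""
    text = " ".join(description.split())  # collapse folded YAML whitespace
    lowered = text.lower()
    return {
        "chars": len(text),
        "concise": len(text) < MAX_DESCRIPTION_CHARS,
        "states_when_to_use": "use when" in lowered,  # heuristic marker
        "lists_keywords": "keywords:" in lowered,  # heuristic marker
    }
```

Trigger Accuracy and Non-Redundancy still need judgment; these signals only flag the obvious failures.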
Five assessment dimensions (0-10 each):
Does it follow the three-layer loading principle?
| Layer | Content | When Loaded |
|---|---|---|
| Level 1 | name + description | Always in context |
| Level 2 | SKILL.md body | On skill trigger |
| Level 3 | scripts/ + references/ + assets/ | On execution, never in context |
Review checklist:
Does the Skill open with a clear role or context definition?
✅ Good example:
# PDF Processing Skill
You are a professional document preparation assistant specializing in
PDF creation and editing workflows...
Are there sufficient, relevant, and diverse examples?
Claude recommends 3-5 examples that are:
Review checklist:
Are instructions clear, actionable, and unambiguous?
Review checklist:
Are bundled resources used appropriately?
| Resource | When to Use | Review Question |
|---|---|---|
| scripts/ | Deterministic/repeated code execution | Is there repetitive code that should be a script? |
| references/ | Detailed docs, API specs, domain knowledge | Is there >10k chars of docs not in references/? |
| assets/ | Templates, images, fonts for output | Are there files that should be assets, not inline content? |
Review checklist:
Formula:
Level 1 cost ≈ len(description) / 4 tokens
(English: ~4 chars ≈ 1 token)
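The formula translates directly into code. In this sketch the function name is hypothetical, and rounding up is an added assumption so that the estimate errs on the high side.

```python
import math


def level1_cost_tokens(description: str) -> int:
    """Estimate Level 1 cost at roughly 4 characters per token (English text)."""
    return math.ceil(len(description) / 4)
```

A description at the 150-character limit therefore costs about 38 tokens that stay in context for every request.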
Benchmarks:
Review checklist:
High-risk signals:
Aggregate all dimension scores into the final report.
SKILL AUDIT REPORT
═══════════════════════════════════════════════════════════════
Skill: [skill-name]
Source: [local path / GitHub URL / ClawHub]
Audited: [date]
───────────────────────────────────────────────────────────────
I. YAML FRONTMATTER COMPLIANCE [X/10]
✅ [passed items]
❌ [issues]
II. DESCRIPTION QUALITY [X/40]
Trigger Accuracy [X/10]
Conciseness [X/10]
Keyword Coverage [X/10]
Non-Redundancy [X/10]
III. BODY QUALITY [X/40]
Progressive Disclosure [X/10]
Role Setting [X/10]
Examples [X/10]
Instruction Clarity [X/10]
IV. RESOURCE LAYERING [X/10]
scripts/ Usage [X/5]
references/ Usage [X/5]
V. PERFORMANCE IMPACT [-5 to +2]
Level 1 Cost [penalty/bonus]
Level 2 Volume [penalty/bonus]
Mis-trigger Risk [penalty/bonus]
───────────────────────────────────────────────────────────────
OVERALL SCORE: X / 100
───────────────────────────────────────────────────────────────
Grade:
🟢 Excellent (85-100) — Worth installing, top quality
🟡 Good (70-84) — Usable, has room for improvement
🔴 Acceptable (50-69) — Usable but needs optimization
⚫ Poor (<50) — Not recommended
───────────────────────────────────────────────────────────────
VI. IMPROVEMENT RECOMMENDATIONS (priority order)
🔴 P0 (must fix):
- [specific issue and fix]
🟡 P1 (strongly recommended):
- [specific issue and fix]
🟢 P2 (optional):
- [nice-to-have improvements]
═══════════════════════════════════════════════════════════════
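One way to aggregate the section scores from the report template is shown below. The function names are hypothetical, and clamping the total to 0-100 is an assumption, since the template does not say how a Performance Impact bonus behaves at the top of the scale.

```python
def overall_score(frontmatter, description, body, resources, performance):
    """Sum sections I-IV (max 10 + 40 + 40 + 10) and apply the section V adjustment."""
    performance = max(-5, min(2, performance))  # section V ranges from -5 to +2
    total = frontmatter + description + body + resources + performance
    return max(0, min(100, total))  # assumption: keep the result within 0-100


def grade(score):
    """Map an overall score to the grade bands used in the report."""
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Acceptable"
    return "Poor"
```

For example, section scores of 10, 36, 30 and 8 with a performance penalty of -2 give 82, which falls in the Good band.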
| Score | Grade | Meaning | Action |
|---|---|---|---|
| 85-100 | 🟢 Excellent | Meets all best practices | Install directly |
| 70-84 | 🟡 Good | Meets most standards, minor issues | Install, address P1 items |
| 50-69 | 🔴 Acceptable | Functional but has obvious defects | Fork and fix, or wait for update |
| <50 | ⚫ Poor | Fails best practices | Do not install, find alternatives |
| Symptom | Cause | Fix |
|---|---|---|
| Description too long | Description >150 characters | Move details to body, keep only trigger keywords |
| Body too long | SKILL.md >500 lines | Split into references/ |
| No examples | Text-only instructions | Add 3-5 XML-wrapped example pairs |
| Vague role | No clear Skill boundary | Add role-setting paragraph |
| AI-common-knowledge filler | Explaining what AI already knows | Delete, keep only project-specific context |
| Not layered | Docs in body | Move to references/ |
| Mis-triggers | Overlapping or vague keywords | Differentiate Descriptions |
| Dimension | Skill Vetter | Skill Quality Check |
|---|---|---|
| Goal | Security review | Quality review |
| Core question | Will this Skill harm me? | Is this Skill well-written? |
| Focus | Malicious code, permission abuse | Writing standards, performance |
| When | Before any install | When assessing quality |
| Output | Security report | Quality score + recommendations |
Use both in sequence: Vet for safety first, then audit for quality.
# Fetch SKILL.md from GitHub
curl -s "https://raw.githubusercontent.com/<owner>/<repo>/main/skills/<skill>/SKILL.md"
# Check frontmatter
grep -A 5 "^---" SKILL.md | head -10
# Estimate Level 2 volume (lines → ~10 tokens/line)
wc -l SKILL.md
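The same Level 2 estimate can be computed in Python when the file content is already in memory. This sketch assumes the ~10 tokens per line rule of thumb from the command above and the 500-line body limit used elsewhere in this document; the function name is hypothetical.

```python
TOKENS_PER_LINE = 10  # rule of thumb from the wc -l comment
MAX_BODY_LINES = 500  # body length limit used in this document


def level2_volume(skill_md_text: str) -> dict:
    """Count lines and estimate the token volume loaded when the skill triggers."""
    lines = len(skill_md_text.splitlines())
    return {
        "lines": lines,
        "approx_tokens": lines * TOKENS_PER_LINE,
        "too_long": lines > MAX_BODY_LINES,
    }
```

A 620-line SKILL.md comes out at roughly 6200 tokens and is flagged as too long.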
Every audit report must include:
Do not say "this Skill is pretty good" — deliver a specific score, specific issues, and specific fixes.
Good Skills deserve thorough auditing. Bad Skills deserve honest feedback. 🔍🦀
Input:
name: tdd-skill
description: >
TDD test-driven development workflow. Use when writing new features,
adding tests, or fixing bugs. Keywords: test-driven, TDD, red-green-refactor,
pytest, unit test.
Audit Result:
Input:
name: tdd-skill
description: >
This is a comprehensive guide to Test-Driven Development using
the red-green-refactor cycle. First, you write a failing test that
describes the behavior you want. Then write the minimum code to make
it pass. Then refactor while keeping tests green. This approach
ensures high test coverage and better code quality...
Audit Result:
P0 Recommendation:
Rewrite Description to be under 150 chars. Move the cycle explanation to SKILL.md body.
Input:
# PDF Processing Skill
You are a professional document preparation assistant specializing in
PDF creation, editing, and conversion workflows. You have deep knowledge
of PDF structure, reportlab, pypdf, and weasyprint.
Audit Result:
Minor improvement (P2): Could add one sentence about what this Skill does NOT cover (e.g. OCR, scanned PDFs).
Input:
# My Skill
This skill helps you get things done. Use it when you need help.
It provides instructions and guidelines for various tasks.
Audit Result:
P0 Recommendation:
Replace generic language with specific domain context. Define what the Skill does and does not cover.
Directory structure:
awesome-skill/
├── SKILL.md 80 lines (Level 2: execution flow only)
├── references/
│ ├── api-spec.md 450 lines (Level 3: detailed API docs)
│ └── troubleshooting.md 120 lines (Level 3: edge cases)
└── scripts/
└── validate.sh (Level 3: deterministic execution)
Audit Result:
Minor improvement (P2): Could add a brief Level 1 summary in Description listing which references/ files are most relevant.
Symptom: SKILL.md has 620 lines including a 300-line API reference pasted directly in the body.
Audit Result:
P0 Recommendation:
Move the API reference to references/api-spec.md. SKILL.md body should be execution flow only (under 500 lines).
Scenario: User has 12 Skills installed. Two of them have "debug" in their Description:
| Skill | Description trigger keyword |
|---|---|
| systematic-debugging | "debugging, error, bug" |
| general-helper | "debug, logs, errors, general assistance" |
Audit Result:
P1 Recommendation:
Differentiate: systematic-debugging should use "systematic-debugging, root-cause" (more specific); general-helper should remove "debug" entirely or move it lower in priority.
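Overlap of this kind can be detected with a simple pairwise comparison. The sketch below treats two keywords as overlapping when one is a prefix of the other (so "debug" collides with "debugging"); that matching rule and the function name are assumptions, not part of the audit method itself.

```python
from itertools import combinations


def keyword_overlaps(skills: dict) -> dict:
    """Map each pair of skill names to the keyword pairs that collide."""
    parsed = {
        name: [k.strip().lower() for k in text.split(",") if k.strip()]
        for name, text in skills.items()
    }
    overlaps = {}
    for a, b in combinations(sorted(parsed), 2):
        # Two keywords collide when one is a prefix of the other.
        pairs = [
            (x, y)
            for x in parsed[a]
            for y in parsed[b]
            if x.startswith(y) or y.startswith(x)
        ]
        if pairs:
            overlaps[(a, b)] = pairs
    return overlaps


installed = {
    "systematic-debugging": "debugging, error, bug",
    "general-helper": "debug, logs, errors, general assistance",
}
print(keyword_overlaps(installed))
```

For the two Skills in the scenario this reports the debug/debugging and errors/error collisions; after applying the recommendation the result is empty.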