PluginEval Core

v1.0.0

Self-contained PluginEval quality evaluation engine. Measures 6 dimensions, detects anti-patterns, assigns badges. No external dependencies.

by @donmeusi

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for donmeusi/plugineval-core.

Install the skill "PluginEval Core" (donmeusi/plugineval-core) from ClawHub.
Skill page: https://clawhub.ai/donmeusi/plugineval-core
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install plugineval-core

ClawHub CLI

npx clawhub@latest install plugineval-core

Security Scan

VirusTotal

Benign

OpenClaw

Benign

medium confidence

✓ Purpose & Capability

Name/description claim a self-contained quality evaluator and the repository contains an evaluation script and references that match that purpose. The skill declares no external env vars, binaries, or installs, which is proportionate to its stated function.

✓ Instruction Scope

SKILL.md instructs running the included Python evaluator against a skill directory; it documents read-only modes (--layer1, --anti-patterns) and an explicit --allow-write flag for modifications. The instructions do not direct reading of unrelated system files or exfiltration to external endpoints.

✓ Install Mechanism

No install spec (instruction-only) and included code uses only standard library imports shown. This is low-risk: nothing is downloaded from arbitrary URLs or installed automatically.

✓ Credentials

No required environment variables, credentials, or config paths are declared or used in the visible code. That aligns with the skill's stated static-analysis purpose.

✓ Persistence & Privilege

Skill is not always-included and has normal autonomy defaults. File-modification capabilities exist but require an explicit --allow-write flag; backups are created under the target skill directory. The skill does not request system-wide changes or other skills' credentials.

Assessment

This skill appears coherent and self-contained. Before running it: (1) run in read-only modes first (e.g., --layer1, --anti-patterns, or --auto-fix without --allow-write) to inspect outputs; (2) review the full scripts/eval.py to confirm there are no hidden network calls or LLM API invocations in the portions not shown (Layer 2 mentions an LLM judge — verify it doesn't require external API keys or make outbound requests); (3) run the provided tests in a sandbox; and (4) don't use --allow-write until you've inspected the auto-fix code and are comfortable with changes and backups (the script creates backups in the same skill directory). If you want higher assurance, ask for the remainder of eval.py (the truncated section) to confirm there are no unexpected network or credential accesses.

Like a lobster shell, security has layers — review code before you run it.

Tags: badges, evaluation, latest, quality, static-analysis

89 downloads

0 stars

1 version

Updated 2w ago

MIT-0

PluginEval Core 🔬

Self-contained quality evaluation for AI agent skills. Measures quality across 6 dimensions, detects anti-patterns, assigns quality badges.

Use When

Evaluating skill quality before installation
Checking installed skills for quality issues
Improving skills to meet quality standards
Publishing skills to ClawHub with quality badges

Input / Output

Input:

Skill directory containing SKILL.md
Optional: --layer1, --layer2, --anti-patterns flags

Output:

{
  "skill": "example-skill",
  "score": 87,
  "badge": "Gold",
  "grade": "B+",
  "anti_patterns": []
}
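
A report of this shape can be read with Python's standard library. The sketch below assumes the report is already available as a JSON string; how eval.py delivers it (stdout, file, or otherwise) is not stated here, and `summarize_report` is a hypothetical helper, not part of the skill.

```python
import json

# Sample report copied from the Output example above.
SAMPLE_REPORT = """
{
  "skill": "example-skill",
  "score": 87,
  "badge": "Gold",
  "grade": "B+",
  "anti_patterns": []
}
"""


def summarize_report(raw: str) -> str:
    """Return a one-line summary of a PluginEval-style report (hypothetical helper)."""
    report = json.loads(raw)
    issues = report.get("anti_patterns", [])
    status = "no anti-patterns" if not issues else f"{len(issues)} anti-pattern(s)"
    return (
        f"{report['skill']}: {report['score']} "
        f"({report['badge']}, {report['grade']}), {status}"
    )


summary = summarize_report(SAMPLE_REPORT)
print(summary)
```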

Usage

# Layer 1: Static Analysis
python3 ~/.openclaw/skills/plugineval-core/scripts/eval.py --layer1 <skill-dir>

# Anti-Pattern Detection
python3 ~/.openclaw/skills/plugineval-core/scripts/eval.py --anti-patterns <skill-dir>

# Full Evaluation
python3 ~/.openclaw/skills/plugineval-core/scripts/eval.py <skill-dir>
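
If the evaluator is driven from another Python script, the three invocations above can be assembled as argument lists. This is a minimal sketch: `build_eval_command` and the mode names are hypothetical, and only the script path and flags shown in this section are used.

```python
import os

# Script location taken from the usage lines above.
EVAL_SCRIPT = "~/.openclaw/skills/plugineval-core/scripts/eval.py"

# Only flags that appear in this section; "full" means no mode flag.
MODE_FLAGS = {
    "layer1": ["--layer1"],
    "anti-patterns": ["--anti-patterns"],
    "full": [],
}


def build_eval_command(skill_dir: str, mode: str = "full") -> list[str]:
    """Build the argument list for one evaluator run (hypothetical helper)."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    script = os.path.expanduser(EVAL_SCRIPT)  # resolve "~" to the home directory
    return ["python3", script, *MODE_FLAGS[mode], skill_dir]


command = build_eval_command("my-skill", "layer1")
print(command)
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the script's behavior has been reviewed.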

Quality Dimensions

Dimension	Weight	Measures
Frontmatter Quality	35%	Name, description, trigger
Orchestration Wiring	25%	Input/Output, examples
Progressive Disclosure	15%	Conciseness
Structural Completeness	10%	Headings, troubleshooting
Token Efficiency	6%	Directives, duplication
Ecosystem Coherence	2%	Cross-references
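
One plausible way to combine per-dimension scores is a weighted average. The sketch below is an assumption, not the formula eval.py is confirmed to use. The weights listed above add up to 93%, so the sketch divides by the sum of the weights to keep the result on a 0-100 scale; `weighted_score` is a hypothetical name.

```python
# Weights in percent, copied from the table above. They sum to 93, not 100.
WEIGHTS = {
    "Frontmatter Quality": 35,
    "Orchestration Wiring": 25,
    "Progressive Disclosure": 15,
    "Structural Completeness": 10,
    "Token Efficiency": 6,
    "Ecosystem Coherence": 2,
}


def weighted_score(dimension_scores: dict[str, float]) -> float:
    """Weighted average of 0-100 dimension scores (assumed aggregation rule)."""
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    # Normalizing by the total weight is an assumption made for this sketch.
    return weighted_sum / total_weight


perfect = weighted_score({name: 100.0 for name in WEIGHTS})
print(perfect)
```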

Quality Badges

Badge	Score
Platinum ★★★★★	≥90
Gold ★★★★	≥80
Silver ★★★	≥70
Bronze ★★	≥60
Needs Improvement ★	<60
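
The thresholds in this table map directly to a lookup. A minimal sketch, with `badge_for_score` as a hypothetical helper name:

```python
# (minimum score, badge) pairs from the table above, highest first.
BADGE_THRESHOLDS = [
    (90, "Platinum"),
    (80, "Gold"),
    (70, "Silver"),
    (60, "Bronze"),
]


def badge_for_score(score: float) -> str:
    """Return the badge whose minimum score is the highest one met."""
    for minimum, badge in BADGE_THRESHOLDS:
        if score >= minimum:
            return badge
    return "Needs Improvement"


print(badge_for_score(87))
```

A score of 87, as in the sample output, lands in the Gold band.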

Anti-Patterns

Pattern	Penalty
OVER_CONSTRAINED	10%
EMPTY_DESCRIPTION	10-50%
MISSING_TRIGGER	15%
BLOATED_SKILL	10%
ORPHAN_REFERENCE	5%
DEAD_CROSS_REF	5%
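
The table does not say how penalties combine or how they are applied to the score. The sketch below assumes two things that are not confirmed by this page: penalties from several patterns add up (capped at 100%), and the total is deducted as a percentage of the score. `penalty_range` and `apply_penalty` are hypothetical helpers.

```python
# (minimum, maximum) penalty in percent, from the table above.
PENALTIES = {
    "OVER_CONSTRAINED": (10, 10),
    "EMPTY_DESCRIPTION": (10, 50),
    "MISSING_TRIGGER": (15, 15),
    "BLOATED_SKILL": (10, 10),
    "ORPHAN_REFERENCE": (5, 5),
    "DEAD_CROSS_REF": (5, 5),
}


def penalty_range(detected: list[str]) -> tuple[int, int]:
    """Lowest and highest total penalty, assuming penalties simply add up."""
    low = sum(PENALTIES[name][0] for name in detected)
    high = sum(PENALTIES[name][1] for name in detected)
    return min(low, 100), min(high, 100)


def apply_penalty(score: float, penalty_percent: float) -> float:
    """Deduct a percentage of the score (assumed application rule)."""
    return score * (100 - penalty_percent) / 100


low, high = penalty_range(["MISSING_TRIGGER", "ORPHAN_REFERENCE"])
print(low, high, apply_penalty(80, low))
```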

Examples

# Evaluate skill
python3 scripts/eval.py --layer1 ~/.openclaw/skills/weather-pollen

# Output:
# [1/6] Frontmatter Quality: 100/100
# [2/6] Orchestration Wiring: 100/100
# ...
# Final: 87 | Badge: Gold ★★★★

Version: 1.0.0 | License: MIT
