Axioma Skill Evaluator Strict EN
Pass. Audited by ClawScan on May 10, 2026.
Overview
This appears to be a local skill-quality evaluator, but users should treat its 90% approval score as advisory and verify the local commands before running it.
Before installing: verify the complete scripts in your environment; run evaluations only on intended skill directories; be cautious with --improve; and treat the 90% threshold as a heuristic quality gate, not a guarantee that a skill is safe or production-ready.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Running the skill executes local Python code against a chosen skill directory and may affect that directory if the improvement mode writes changes.
The documented workflow runs local evaluator code and includes an --improve mode. This is aligned with the evaluator's purpose, but users should understand whether --improve edits files before using it.
python3 .../evaluator.py <skill-path> --verbose --improve
Run it only on a skill path you intentionally select, review or back up files before using --improve, and avoid broad modes unless you understand their scope.
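One way to follow that advice is to snapshot the skill directory before any --improve run, so changes can be diffed or reverted afterward. A minimal sketch, assuming a local Python workflow; the backup naming scheme and the evaluator invocation shown in the `__main__` block are illustrative, not part of the skill:

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(skill_path: str) -> Path:
    """Copy the skill directory aside so an --improve run can be reverted."""
    src = Path(skill_path).resolve()
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup = src.parent / f"{src.name}.bak-{stamp}"
    shutil.copytree(src, backup)  # full pre-run copy to diff or restore later
    return backup


if __name__ == "__main__":
    import subprocess
    import sys

    skill = sys.argv[1]  # a skill path you intentionally selected
    print("Backup written to", snapshot(skill))
    # Only after the backup exists, run the mode that may write changes.
    subprocess.run(
        [sys.executable, "evaluator.py", skill, "--verbose", "--improve"],
        check=True,
    )
```

After the run, a plain directory diff against the backup shows exactly what --improve changed.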
The tool may fail, scan an unexpected directory, or write reports to an unexpected local path if those hard-coded paths are used.
The evaluator contains hard-coded, user-specific local paths. This is not malicious by itself, but it is a portability and scope issue for users outside that environment.
SKILL_DIR = Path("/media/ezekiel/Morgana/skills")
REPORTS_DIR = Path("/media/ezekiel/Morgana/skills/SKILL_EVALUATOR/reports")
Prefer explicit skill paths, inspect the full script before broad runs, and adjust hard-coded paths for your own environment.
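If you adapt the script to your own machine, one common pattern is to let environment variables override the hard-coded defaults rather than editing paths in place. A hedged sketch: the `SKILL_DIR`/`REPORTS_DIR` defaults are taken from the script, but the `SKILL_EVAL_SKILLS` and `SKILL_EVAL_REPORTS` variable names are hypothetical:

```python
import os
from pathlib import Path


def skills_dir() -> Path:
    """Resolve the skills root, preferring an environment override."""
    # SKILL_EVAL_SKILLS is a hypothetical override; the literal path
    # is the script's original hard-coded default.
    return Path(os.environ.get("SKILL_EVAL_SKILLS", "/media/ezekiel/Morgana/skills"))


def reports_dir() -> Path:
    """Resolve the reports directory relative to the skills root by default."""
    default = skills_dir() / "SKILL_EVALUATOR" / "reports"
    return Path(os.environ.get("SKILL_EVAL_REPORTS", str(default)))
```

This keeps the original behavior when no variables are set, while letting other users point the evaluator at their own directories without touching the source.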
A user or agent could over-trust a numeric score and treat it as final approval.
The skill uses strong approval and production-readiness language for a deterministic score. That can be useful as a quality gate, but it should not be mistaken for a full security or human review.
IF score >= 90%: → APPROVED ✅ — Ready for production ... NO PUBLISH until 90%+ achieved
Use the score as one input, not as a replacement for security review, functional testing, and human approval.
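To illustrate that principle, a release decision can combine the numeric score with independent checks so that a high score alone never approves a skill. The 90% threshold comes from the skill's own rule; the other gate names here are hypothetical:

```python
def release_ready(
    score: float,
    security_review_passed: bool,
    tests_passed: bool,
    human_approved: bool,
) -> bool:
    """Treat the evaluator score as necessary but not sufficient:
    every independent gate must also pass before release."""
    return (
        score >= 90.0
        and security_review_passed
        and tests_passed
        and human_approved
    )
```

Under this scheme a 95% score with a failed security review still blocks release, which is the intended reading of the score as one input rather than final approval.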
