Skill Test

Test skills before using or publishing. Trial, compare, evaluate in isolation without affecting your environment.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
by Iván (@ivangdavila)
Security Scan
VirusTotal
Benign
OpenClaw
Benign (high confidence)
Purpose & Capability
Name and description match the contents: the SKILL.md and auxiliary docs describe workflows for testing, comparing, and evaluating other skills. No unrelated env vars, binaries, or install steps are requested.
Instruction Scope
Instructions focus on spawning isolated sub-agents and reviewing SKILL.md and auxiliary files. This stays in scope, but the doc advises loading skill content into sub-agents and optionally asking for test credentials; that is expected for a tester, but could expose secrets if the tester or platform isn't careful.
Install Mechanism
No install spec or code files are provided, so nothing is written to disk by the skill itself. The docs mention using npx clawhub install as a testing convenience; that is an external CLI operation under the user's control, not part of the skill payload.
Credentials
The skill requests no environment variables or credentials. It does advise asking the user for test credentials if a skill requires them — this is reasonable for testing, but users should prefer ephemeral/test-only credentials and avoid pasting production secrets into a test run.
Persistence & Privilege
Flags are default (always: false, model invocation allowed). The skill does not request persistent presence or attempt to modify other skills or system-wide settings.
Assessment
This skill is a tester/checklist and appears coherent: it doesn't request credentials or install code itself. Before using:

  1. Verify your platform truly isolates sub-agents, so tests can't access your real files or env.
  2. Never paste production credentials into a test; use ephemeral/test accounts.
  3. Inspect auxiliary files (compare.md, evaluate.md, sandbox.md) yourself before running automated tests.
  4. If you run commands like 'npx clawhub install', ensure you trust the source of that tool.

If you need higher assurance, run tests in a separate VM or container and remove the test folder after completion.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0


SKILL.md

Test Skills Safely

Two use cases:

  1. Try before commit — Test drive skills before installing
  2. Evaluate before publish — Verify quality before publishing

Key principle: Test in isolation. Never affect the user's environment.

References:

  • Read sandbox.md — Isolated testing environment
  • Read compare.md — A/B comparison between skills
  • Read evaluate.md — Multi-agent quality evaluation

Quick Start

Trial a skill:

sessions_spawn(
  task="Test skill X: Load ONLY its SKILL.md, run [sample task], report quality",
  model="anthropic/claude-haiku"
)

Compare two skills:

  1. Run same task through each (separate sub-agents)
  2. Present outputs side-by-side
  3. Ask: "Which works better? Why?"
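The comparison flow above can be sketched as a small driver. `sessions_spawn` is a hypothetical platform API (its real signature and return value are assumptions); it is stubbed here so the flow runs standalone:

```python
# Hypothetical stand-in for the platform's sessions_spawn API;
# the real call runs an isolated sub-agent and returns its report.
def sessions_spawn(task, model="anthropic/claude-haiku"):
    return f"[{model}] result for: {task}"

def compare_skills(skill_a, skill_b, sample_task):
    """Run the same task through each skill in a separate sub-agent."""
    outputs = {}
    for name in (skill_a, skill_b):
        outputs[name] = sessions_spawn(
            task=f"You have ONE skill loaded: {name}. Test by doing: {sample_task}"
        )
    # Present outputs side-by-side for the user to judge.
    for name, out in outputs.items():
        print(f"--- {name} ---\n{out}\n")
    return outputs

results = compare_skills("skill-a", "skill-b", "summarize a README")
```

The judgment itself stays with the user: the driver only collects and displays the two reports.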

Test Modes

Trial Mode — Before installing

  • Spawn sub-agent with ONLY the test skill
  • Run 2-3 representative tasks
  • Evaluate: Does it help? Clear instructions?
  • Decision: keep, pass, or try another

Evaluation Mode — Before publishing

  • Spawn specialized reviewers (see evaluate.md)
  • Check structure, safety, usefulness
  • Synthesize findings
  • Recommend improvements
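The evaluation-mode steps can be sketched as one reviewer per angle plus a synthesis step. Again, `sessions_spawn` and its dict-shaped return value are assumptions, stubbed for illustration:

```python
# Sketch of multi-reviewer evaluation; sessions_spawn is a hypothetical
# platform API, stubbed here so the synthesis logic is runnable.
def sessions_spawn(task, model="anthropic/claude-haiku"):
    return {"verdict": "pass", "notes": f"reviewed: {task}"}

REVIEW_ANGLES = ("structure", "safety", "usefulness")

def evaluate_skill(skill_md):
    """Spawn one reviewer per angle, then synthesize a publish decision."""
    findings = {
        angle: sessions_spawn(task=f"Review this SKILL.md for {angle}: {skill_md}")
        for angle in REVIEW_ANGLES
    }
    # Recommend publishing only if every reviewer passes.
    publish = all(f["verdict"] == "pass" for f in findings.values())
    return {"publish": publish, "findings": findings}
```

Separate reviewers per angle keep each sub-agent's context small, which also keeps Haiku-based testing cheap.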

Sandbox Isolation

⚠️ Never load test skill into your main context.

Sub-agent approach (recommended):

sessions_spawn(
  task="You have ONE skill loaded: [skill content]. Test by doing: [task]",
  model="anthropic/claude-haiku"
)
  • Complete isolation — main session unaffected
  • Natural cleanup — sub-agent terminates, done
  • Cheap testing — use Haiku

What to check:

  • Does it activate correctly?
  • Are instructions clear?
  • Token cost reasonable?
  • Output quality acceptable?
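The four checks above fold naturally into the trial-mode keep/pass decision. A minimal illustrative helper (not part of the skill; names are invented for the sketch):

```python
# Fold the four trial checks into a keep/pass decision,
# reporting which checks failed so the user can decide what to retry.
def trial_verdict(activates, instructions_clear, token_cost_ok, output_ok):
    checks = {
        "activates": activates,
        "instructions_clear": instructions_clear,
        "token_cost_ok": token_cost_ok,
        "output_ok": output_ok,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return ("keep", []) if not failed else ("pass", failed)
```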

Edge Cases

Skill requires credentials: Ask user for test credentials or skip auth-dependent features.

Skill not found: Verify slug with npx clawhub info <slug> before testing.

Test fails mid-way: Sub-agent terminates cleanly. Review logs, adjust test task, retry.

Skill has many auxiliary files: Load SKILL.md first, reference others only if needed during test.


Test thoroughly. Install only after explicit user approval.

Files

4 total
