Skill v1.0.0
ClawScan security
Llm Eval Harness · ClawHub's context-aware review of the artifact, metadata, and declared behavior.
Scanner verdict
Benign · Apr 30, 2026, 12:42 AM
- Verdict: benign
- Confidence: high
- Model: gpt-5-mini
- Summary: The skill's instructions, scope, and requirements are consistent with an LLM evaluation harness. It asks for no unexpected credentials or installs, but two operational risks (executing generated code and using external judge models) require caution when you run it.
- Guidance: This skill appears to do what it claims, but take these precautions before using it:
  - Only evaluate non-sensitive or scrubbed data (model outputs can contain secrets, and judge models may see them).
  - If you enable the 'code execution' evaluation, run generated code inside a controlled sandbox with resource limits and network restrictions.
  - Decide which judge model(s) you'll use and supply API keys securely; avoid giving the evaluator access to keys that allow broad admin actions.
  - Set and document thresholds (e.g., embedding similarity), and use multiple judges for contentious cases to reduce single-model bias.
  - Review generated reports for leaked content before sharing externally.
  If you want, provide details about your runtime environment (which LLM APIs, whether you have a sandbox) and I can point out exactly what credentials/permissions to grant and how to lock them down.
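
The sandboxing precaution above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: it assumes a POSIX host and that the generated code is Python, and the names `run_generated_code`, `_apply_limits`, and `_ENV_ALLOWLIST` are hypothetical. It caps CPU time and address space with the standard `resource` module, enforces a wall-clock timeout, and passes only an allowlisted environment so API keys are not inherited. It does not restrict network access; that still needs a container, VM, or network namespace.

```python
import os
import resource  # POSIX only
import subprocess
import sys
import tempfile

# Hypothetical allowlist: only these variables reach the child process,
# so judge/API keys in the parent environment are not inherited.
_ENV_ALLOWLIST = ("PATH", "LD_LIBRARY_PATH", "LANG")


def _apply_limits(cpu_seconds, memory_bytes):
    """Runs in the child just before exec: lower CPU and memory limits."""
    def cap(kind, value):
        _, hard = resource.getrlimit(kind)
        if hard != resource.RLIM_INFINITY:
            value = min(value, hard)  # a hard limit can only be lowered
        resource.setrlimit(kind, (value, value))

    cap(resource.RLIMIT_CPU, cpu_seconds)
    cap(resource.RLIMIT_AS, memory_bytes)


def run_generated_code(source, cpu_seconds=2, memory_bytes=1 << 30,
                       wall_timeout=5.0):
    """Run untrusted Python source in a limited child process."""
    env = {k: os.environ[k] for k in _ENV_ALLOWLIST if k in os.environ}
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", source],  # -I: isolated mode
                cwd=workdir,            # throwaway working directory
                env=env,
                capture_output=True,
                text=True,
                timeout=wall_timeout,   # wall-clock cap; child is killed
                preexec_fn=lambda: _apply_limits(cpu_seconds, memory_bytes),
            )
        except subprocess.TimeoutExpired:
            return {"status": "timeout", "stdout": "", "stderr": ""}
    status = "ok" if proc.returncode == 0 else "error"
    return {"status": status, "stdout": proc.stdout, "stderr": proc.stderr}
```

Calling `run_generated_code("print(1 + 1)")` returns a dict whose `status` is `"ok"` when the child exits cleanly; a runaway loop ends as `"error"` or `"timeout"` depending on which limit fires first. Process limits alone are a weak boundary, so treat this as one layer inside a proper sandbox rather than a replacement for one.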
Review Dimensions
- Purpose & Capability (ok): The name and description match the instructions: dataset management, multi-dimension scoring, model comparison, regression detection, and report generation are all coherent with an 'LLM Eval Harness'. It does not request unrelated binaries, env vars, or config paths.
- Instruction Scope (note): Instructions stay within evaluation tasks, but they call for executing generated code and using a stronger model as a judge. Both are reasonable for an eval harness but carry operational risks (running untrusted code, potential data exposure to judge models). The SKILL.md does not instruct reading local system files or hidden credentials.
- Install Mechanism (ok): This is instruction-only with no install spec or downloads, which is the lowest install risk.
- Credentials (note): The skill declares no required env vars or credentials, which is consistent with a generic instruction-only harness. In practice it implicitly expects access to LLM APIs and possibly an execution/sandbox environment; those runtime credentials and permissions are not specified here and should be provisioned thoughtfully.
- Persistence & Privilege (ok): The skill sets always:false and requests no install actions or persistent system modifications. It does not request elevated or permanent platform privileges.
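
The data-exposure risk noted under Instruction Scope can be reduced by scrubbing model outputs before they are sent to a judge model or written into a report. The sketch below is illustrative only: the function name `scrub` and the pattern list are assumptions, the regular expressions cover just a few common secret shapes, and a real deployment would want a dedicated secret scanner plus manual review.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive list).
_SECRET_PATTERNS = [
    ("api_key", re.compile(r"\bsk-[A-Za-z0-9_-]{16,}\b")),
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("bearer_token", re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]{16,}=*")),
    ("email", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
]


def scrub(text):
    """Replace secret-like substrings with placeholders.

    Returns the scrubbed text and a dict counting redactions per label,
    so the caller can log that something was removed without logging it.
    """
    counts = {}
    for label, pattern in _SECRET_PATTERNS:
        text, n = pattern.subn(f"[REDACTED:{label}]", text)
        if n:
            counts[label] = n
    return text, counts


if __name__ == "__main__":
    sample = "key sk-abcdefghijklmnopqrstuv and mail bob@example.com"
    cleaned, found = scrub(sample)
    print(cleaned)  # key [REDACTED:api_key] and mail [REDACTED:email]
    print(found)    # {'api_key': 1, 'email': 1}
```

Returning the counts alongside the text lets the harness flag samples that contained secret-like content, which is a useful signal when deciding whether a dataset is safe to share with an external judge at all.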
