Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

ExpertPack Eval

Measure ExpertPack EK (Esoteric Knowledge) ratio and run automated quality evals. Use when: (1) Measuring what percentage of a pack's content frontier LLMs c...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 25 · 0 current installs · 0 all-time installs
by Brian Hearn (@brianhearn)
Security Scan
VirusTotal
Benign
View report →
OpenClaw
Suspicious
high confidence
Purpose & Capability
The skill's functionality (blind-probing models and LLM-as-judge scoring) legitimately requires an OpenRouter API key and network access to openrouter.ai. However, the registry metadata declares no required environment variables or config paths, while the SKILL.md and scripts explicitly require or resolve an OPENROUTER_API_KEY and read OpenClaw auth/config files. This omission in the metadata is an inconsistency.
Instruction Scope
The runtime instructions and bundled scripts perform network calls to openrouter.ai and will probe arbitrary LLM endpoints and user-provided agent endpoints. The scripts also attempt to read local OpenClaw configuration files (~/.openclaw/.../auth-profiles.json and models.json) to auto-resolve API keys. Reading those local config files is not documented in the metadata (config paths list is empty) and touches files that may contain other credentials — this should have been declared and explained.
Install Mechanism
There is no install spec (instruction-only at registry level), so nothing is written to disk by the registry installer. However the package includes Python scripts that import requests, PyYAML, httpx, and websockets; those Python dependencies are not declared in metadata. Users will need to install these packages before running the scripts, which is a practical omission but not itself malicious.
Credentials
The code requires an OpenRouter API key (OPENROUTER_API_KEY) and will attempt to auto-resolve it from local OpenClaw auth files. The registry metadata lists no required environment variables or config paths — this is a mismatch. Automatically reading local agent auth files to locate keys increases the chance of accessing other stored credentials; the skill should declare and justify this behavior and prefer explicit env var usage.
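The preferred lookup order described above can be sketched as a small resolver. This is an illustrative re-implementation based on the scan notes, not the skill's actual code: the file path matches the auth-profiles.json location the scripts reportedly probe, but the JSON field names (`"openrouter"`, `"api_key"`) are assumptions.

```python
import json
import os
from pathlib import Path
from typing import Optional

# Path the skill's scripts reportedly read; JSON layout below is hypothetical.
AUTH_PROFILE = Path.home() / ".openclaw" / "agents" / "main" / "agent" / "auth-profiles.json"

def resolve_openrouter_key() -> Optional[str]:
    # 1. Explicit env var wins -- the recommended, auditable path.
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:
        return key
    # 2. Fallback: auto-resolve from the local OpenClaw auth profile.
    #    Note this file may hold other stored credentials, which is exactly
    #    why the behavior should be declared in the registry metadata.
    if AUTH_PROFILE.exists():
        profiles = json.loads(AUTH_PROFILE.read_text())
        return profiles.get("openrouter", {}).get("api_key")
    return None
```

Setting the env var explicitly short-circuits the config-file read entirely, which is the safer default.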
Persistence & Privilege
The skill does not request permanent presence (always:false) and does not modify other skills or agent-wide settings. It does not install background services. Autonomous invocation is allowed (platform default) but not combined here with any elevated persistence.
What to consider before installing
This skill appears to do what it says (blind-probe models and run LLM-as-judge evals), but there are important mismatches to consider before installing or running it:

- The scripts require an OpenRouter API key (OPENROUTER_API_KEY), though the registry metadata lists no required env vars. The code will also attempt to read OpenClaw config files (~/.openclaw/agents/main/agent/auth-profiles.json and models.json) to auto-resolve keys; review those files to ensure you're comfortable with that access.
- The Python scripts depend on third-party packages (requests, pyyaml, httpx, websockets) that are not declared in the metadata; install them in a controlled virtualenv before running.
- The skill makes outbound network calls to https://openrouter.ai and to any agent endpoints you point it at (ws:// or http(s) endpoints). Do not run it against sensitive internal services without reviewing and sandboxing.

Recommendations:

- Prefer setting OPENROUTER_API_KEY in the environment rather than relying on auto-detection of OpenClaw config files.
- Inspect the two included scripts (scripts/eval-ek.py and scripts/run-eval.py) yourself to confirm there are no unwanted endpoints or behaviors; they are short and human-readable.
- Run the scripts in an isolated environment (container or VM) and with a scoped API key.

If you need absolute assurance, ask the publisher to add explicit metadata for required env vars and config paths and to declare the Python dependencies. Given the clear mismatches between declared metadata and actual behavior (config file access and missing env var declaration), treat this skill as suspicious until those omissions are addressed.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0
latest · vk977t5kdjxvhbpppyn5ftwwq6n831dq8

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Runtime requirements

Bins: python3

SKILL.md

ExpertPack Eval

Measure and evaluate ExpertPack quality. Companion to the core expertpack skill.

Note: This skill makes external API calls to OpenRouter for blind probing and LLM-as-judge scoring. Requires an API key.

1. Measure EK Ratio

Blind-probe frontier models to measure what percentage of a pack's propositions they cannot answer without the pack loaded:

python3 {skill_dir}/scripts/eval-ek.py <pack-path> [--models model1,model2] [--sample N] [--output FILE]
  • Default models: GPT-4.1-mini, Claude Sonnet 4.6, Gemini 2.0 Flash (via OpenRouter)
  • API key: Auto-resolves from OpenClaw auth profiles or OPENROUTER_API_KEY env var
  • Judge model: Claude Sonnet (GPT-4.1-mini is unreliable as judge — defaults to "partial")
  • Output: YAML with per-proposition scores and aggregate ratio
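The aggregate ratio can be understood as a weighted failure rate over the blind probes. The real scoring lives in scripts/eval-ek.py; the sketch below is only a plausible aggregation under assumed judge labels ("correct", "partial", "wrong"), which are not confirmed to be the script's actual vocabulary.

```python
# Hypothetical aggregation: a proposition counts toward the EK ratio when
# the probed models fail it blind. "wrong" contributes 1.0, "partial" a
# half-credit by default, "correct" nothing.
def ek_ratio(judgments: list, partial_credit: float = 0.5) -> float:
    if not judgments:
        return 0.0
    weight = {"wrong": 1.0, "partial": partial_credit, "correct": 0.0}
    return sum(weight[j] for j in judgments) / len(judgments)
```

For example, two wrong answers, one correct, and one partial across four propositions would yield a ratio of 0.625, landing in the "Strong" band below.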

Interpretation:

EK Ratio     Meaning
0.80+        Exceptional — almost entirely esoteric
0.60–0.79    Strong — majority esoteric
0.40–0.59    Mixed — significant GK padding
0.20–0.39    Weak — most content already in weights
< 0.20       Minimal value-add

Add measured ratio to manifest.yaml:

ek_ratio:
  value: 0.72
  measured: "2026-03-12"
  models: ["gpt-4.1-mini", "claude-sonnet-4-6", "gemini-2.0-flash"]
  propositions_tested: 142

2. Run Quality Eval

Automated eval against a pack-powered agent endpoint:

python3 {skill_dir}/scripts/run-eval.py \
  --questions <eval-set.yaml> \
  --endpoint <ws://host:port/path> \
  --output <results.yaml> \
  --label "baseline"
  • Build eval set: 30+ questions (basic, intermediate, advanced, out-of-scope)
  • Fix one dimension at a time: structure → agent training → model
  • Re-run after each change to verify improvement
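The eval loop implied by the flags above can be sketched transport-agnostically. This is not the run-eval.py implementation (which reportedly speaks ws:// via the websockets package); here the `ask` callable and the question/result field names are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List

def run_eval(questions: List[Dict], ask: Callable[[str], str], label: str) -> Dict:
    """Ask each eval question through an injected transport and collect answers.

    `ask` stands in for whatever sends a prompt to the agent endpoint
    (ws:// or http(s)) and returns its reply as text.
    """
    results = []
    for q in questions:
        answer = ask(q["prompt"])
        results.append({
            "id": q["id"],
            "tier": q.get("tier", "basic"),  # basic / intermediate / advanced / out-of-scope
            "answer": answer,
        })
    return {"label": label, "results": results}
```

Injecting the transport keeps the loop testable offline; the real script would wire `ask` to a websocket or HTTP client and dump the returned dict as the results YAML.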

Learn more: expertpack.ai · GitHub

Files

3 total
