Skill flagged — suspicious patterns detected
ClawHub Security flagged this skill as suspicious. Review the scan results before using.
Eval Driven Development
v0.1.11
Add instrumentation, build golden datasets, write eval-based tests, run them, root-cause failures, and iterate. Ensure your Python LLM application works cor...
⭐ 0 · 205 · 1 current · 1 all-time
by @yiouli
MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description (eval-driven development for Python LLM apps) matches the actual behavior: reading code, instrumenting, building datasets, running tests, and optionally iterating. The included check_version helper and pixie API reference are relevant to the stated purpose.
Instruction Scope
SKILL.md instructs the agent to read and edit project files, run package manager commands (uv/poetry/pip), create a pixie_qa/ directory, and run tests. This scope is broad (file edits and package upgrades) but appropriate for a tool that 'sets up evals'. The skill also documents hard gates (it stops when API keys are missing) and requires confirmation before applying fixes.
Install Mechanism
There is no install spec (instruction-only). The only code file (check_version.py) fetches a SKILL.md from raw.githubusercontent.com to compare versions — a reasonable version-checking behavior. No downloads from obscure or shortened URLs or archive extraction are present.
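As an illustration of the kind of version check described above, here is a minimal sketch. It is not the code in check_version.py, which is not reproduced on this page. It assumes, hypothetically, that SKILL.md carries a `version:` line; the helper names `parse_version`, `is_outdated`, and `fetch_text` are invented for this example, and the caller supplies the URL.

```python
import re
import urllib.request


def parse_version(skill_md_text):
    """Return the version as a tuple of ints, or None if no version line exists.

    Assumes a line such as 'version: 0.1.11' (hypothetical format).
    """
    match = re.search(r"^version:\s*v?(\d+(?:\.\d+)*)\s*$", skill_md_text, re.MULTILINE)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_outdated(local_text, remote_text):
    """True only when both versions parse and the remote one is newer."""
    local = parse_version(local_text)
    remote = parse_version(remote_text)
    if local is None or remote is None:
        return False  # cannot tell, so do not claim an update is needed
    return local < remote


def fetch_text(url, timeout=10):
    """Download a text file over HTTP(S); the caller supplies the URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Comparing tuples of integers rather than raw strings matters here: as strings, "0.1.9" sorts after "0.1.11", which would hide a newer release.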
Credentials
The skill declares no required environment variables or credentials. The SKILL.md explicitly notes that certain evaluators need LLM API keys (e.g., OPENAI_API_KEY) and instructs the agent to stop and request them rather than guessing. There are no unrelated credential requests.
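The "stop and request" behavior can be pictured as a hard gate that fails loudly instead of guessing a value. The sketch below is an assumption about how such a gate might look, not code taken from the skill; `MissingCredentialError` and `require_env` are hypothetical names.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required credential is absent (hypothetical helper)."""


def require_env(name, environ=None):
    """Return the value of an environment variable, or stop with a clear error.

    `environ` defaults to os.environ; passing a dict makes the gate easy to test.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise MissingCredentialError(
            f"{name} is not set; stop and ask the user to provide it."
        )
    return value
```

A caller would invoke `require_env("OPENAI_API_KEY")` before running any evaluator that needs the key, and surface the error message to the user rather than continuing.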
Persistence & Privilege
always:false and no automatic installation are appropriate. The skill will modify the user's project (create pixie_qa/, add tests, run installs) when invoked — this is expected but requires explicit user consent and care (it may change lockfiles or installed packages).
Assessment
This skill appears to do what it says: it will read and edit your Python project, attempt to upgrade the 'pixie-qa' package, create a pixie_qa/ directory with datasets/tests, and run tests. Before installing or running it: (1) run it in a development environment or branch (not production), (2) back up or commit your repo so changes to files/lockfiles can be reviewed, (3) expect it to require network access for pip/poetry and possibly LLM API keys (it will stop and ask if keys like OPENAI_API_KEY are missing), and (4) review any proposed code edits before allowing iterative fixes; the skill's workflow requires explicit confirmation before making fixes beyond setup. The included version check fetches SKILL.md from raw.githubusercontent.com, which is standard for GitHub-hosted version checks.
Like a lobster shell, security has layers: review code before you run it.
