Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Improvement Evaluator

v1.1.1

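Use when you need to verify whether a Skill improvement actually improves AI execution results. Runs AI tasks from a predefined task suite (YAML), judges each as pass/fail, and outputs execution_pass_rate. Not for document-structure scoring (use improvement-learner) or candidate scoring (use improvement-discriminator).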

by_silhouette@lanyasheng
MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The skill's name and description align with the included code: it runs task suites, invokes an LLM client, and applies multiple judge types. However, the package fails to declare a key runtime dependency: the code requires the 'claude' CLI (used for LLM-driven evaluation), but the registry metadata lists no required binaries or environment. That missing declaration is a coherence issue.
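Because the registry metadata does not declare the binary, a caller may want to check for it before a real (non-mock) run. The following is a minimal sketch, not code from the package; the helper name `missing_binaries` is hypothetical.

```python
import shutil


def missing_binaries(required):
    """Return the names in `required` that cannot be found on PATH."""
    return [name for name in required if shutil.which(name) is None]


if __name__ == "__main__":
    # 'claude' is the CLI the scan says the evaluator calls for real runs.
    missing = missing_binaries(["claude"])
    if missing:
        print("missing binaries:", ", ".join(missing), "(consider --mock)")
    else:
        print("all required binaries found")
```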
Instruction Scope
Runtime instructions and code prepend SKILL.md text to prompts and invoke the external 'claude' CLI, and the PytestJudge runs pytest on packaged test files. Pytest is executed with the process environment inherited (os.environ) and AI output passed via AI_OUTPUT_FILE. That means test code executed by the evaluator runs as a subprocess with access to the agent's environment and file system; malicious or poorly written tests could read environment secrets or perform I/O. The skill does include path-traversal guards (test_file must start with 'fixtures/' and resolution is constrained to tests/fixtures), but it does not limit what a fixture test can do once executed.
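The path-traversal guard described above could look roughly like the sketch below. This is an illustration of the two stated checks (a `fixtures/` prefix and resolution constrained to tests/fixtures), not the package's actual code; the function name `resolve_fixture` and the `skill_root` parameter are assumptions.

```python
from pathlib import Path


def resolve_fixture(skill_root, test_file):
    """Resolve test_file under <skill_root>/tests/fixtures, or raise ValueError."""
    # Check 1: the declared path must use the expected prefix.
    if not test_file.startswith("fixtures/"):
        raise ValueError(f"test_file must start with 'fixtures/': {test_file!r}")

    fixtures_dir = (Path(skill_root) / "tests" / "fixtures").resolve()
    candidate = (Path(skill_root) / "tests" / test_file).resolve()

    # Check 2: after resolving '..' and symlinks, the file must still sit
    # somewhere below the fixtures directory.
    if fixtures_dir not in candidate.parents:
        raise ValueError(f"test_file escapes tests/fixtures: {test_file!r}")
    return candidate
```

A guard like this only decides which file gets run. It places no limit on what the test code in that file does once pytest executes it, which is the gap the scan points out.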
Install Mechanism
There is no install spec (instruction-only in the registry), so nothing is automatically downloaded or executed during install. The package includes Python scripts and tests that will run at runtime; that is fine, but it means runtime behavior depends on the local environment (in particular, the presence of the 'claude' CLI).
Credentials
The skill declares no required env vars or credentials, yet at runtime it inherits and forwards the full process environment to subprocesses (the pytest invocation merges os.environ into the child env). Because tests and judges run as subprocesses, any environment variables (including secrets present in the agent environment) will be visible to them. In addition, the code calls the external 'claude' CLI (unless run with --mock), which may itself use credentials or config from the environment or local config files. The skill does not request or document these needs in its metadata.
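One way to gauge the exposure before running an evaluation is to list which variable names in the current environment look like secrets. The sketch below is a rough heuristic under stated assumptions: the name pattern is illustrative and will miss secrets with unusual names, and `secret_like_names` is a hypothetical helper, not part of the skill.

```python
import os
import re

# Illustrative pattern only; extend it for your own naming conventions.
SECRET_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL)", re.IGNORECASE)


def secret_like_names(env):
    """Return the sorted variable names in `env` that look like secrets."""
    return sorted(name for name in env if SECRET_NAME.search(name))


if __name__ == "__main__":
    # Print names only; values are never printed.
    for name in secret_like_names(os.environ):
        print(name)
```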
Persistence & Privilege
The skill does not request permanent/always-on inclusion, does not modify other skills' configs, and has no install-time hooks. always:false and normal autonomous invocation are used. No elevated persistence privileges are requested.
Scan Findings in Context
[subprocess_run_claude] expected: The skill invokes the external 'claude' CLI to evaluate outputs and to act as an LLM judge; this is expected for an execution-based evaluator but should have been declared as a required binary.
[subprocess_run_pytest] expected: The PytestJudge launches pytest to validate structured outputs. Running tests is expected, but executing test code in a child process gives that test code access to environment variables and the filesystem.
[uses_tempfile_and_writes_ai_output] expected: Temporary files are used to pass AI output into pytest; this is normal for testing but means any test code that reads AI_OUTPUT_FILE or the tempdir can access AI outputs.
[forwards_os_environ_to_subprocess] unexpected: The evaluator forwards the full os.environ to the pytest subprocess. While convenient, it can expose credentials in the parent environment to test code; a more conservative design would pass a minimal sanitized env.
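The "minimal sanitized env" suggested in the last finding can be sketched as an allowlist instead of a merge with os.environ. This is an assumption-laden illustration rather than the evaluator's code: the allowlist contents and the names `minimal_env` and `run_isolated` are hypothetical, and a plain Python child stands in for the pytest invocation. Only `AI_OUTPUT_FILE` comes from the scan report.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical allowlist: just enough for an interpreter to start.
ALLOWED = ("PATH", "HOME", "LANG", "SYSTEMROOT", "TMPDIR")


def minimal_env(ai_output_file, parent=None):
    """Build a child env from allowlisted variables plus AI_OUTPUT_FILE."""
    parent = os.environ if parent is None else parent
    env = {key: parent[key] for key in ALLOWED if key in parent}
    env["AI_OUTPUT_FILE"] = ai_output_file
    return env


def run_isolated(ai_output, child_code):
    """Write ai_output to a temp file and run child_code with the minimal env."""
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "ai_output.txt")
        with open(out_path, "w", encoding="utf-8") as fh:
            fh.write(ai_output)
        # env= replaces the child's environment entirely; nothing is inherited.
        return subprocess.run(
            [sys.executable, "-c", child_code],
            env=minimal_env(out_path),
            capture_output=True,
            text=True,
            check=False,
        )
```

An allowlist fails closed: a variable added to the parent environment later stays invisible to test code unless someone adds it deliberately. It does not restrict filesystem or network access, so inspecting the fixtures is still necessary.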
What to consider before installing
This skill appears to implement the described evaluator, but it has a few things to check before use:
- Dependency: The code expects the 'claude' CLI for real LLM evaluations (scripts call 'claude -p'). The registry metadata did not declare this required binary. If you don't have 'claude' installed or don't want to use it, run with --mock to avoid external CLI calls.
- Review packaged tests/fixtures: The PytestJudge runs pytest on test files under the skill's tests/fixtures directory. Those tests run as subprocesses with your agent's environment (os.environ) available to them. Inspect any fixture test code before running evaluations to make sure it doesn't read or exfiltrate environment variables, read local files you care about, or perform unexpected network calls.
- Secrets exposure: Because child processes inherit environment variables, any secrets present in your agent/process environment could be visible to tests. Consider running evaluations in an isolated environment (no sensitive env vars), or use --mock mode when developing or when you cannot guarantee test file safety.
- If you intend to run this in production, ask the skill author to (a) declare 'claude' as a required binary in metadata, and (b) minimize env propagation to subprocesses or explicitly document which env vars are required. Prefer running with a dedicated service account and a sanitized environment.
Overall: the package is functional and coherent with its purpose, but the missing declared dependency and the potential for test code to access inherited environment variables are real operational/security concerns. Use caution, inspect fixtures, and prefer --mock for exploratory runs.

Like a lobster shell, security has layers — review code before you run it.

latest vk978gcw745pw9dds0ngc1e13gd84c0w4

