Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

LLM Evaluator Pro

v1.0.0

LLM-as-a-Judge evaluator via Langfuse. Scores traces on relevance, accuracy, hallucination, and helpfulness using GPT-5-nano as judge. Supports single trace...

1 · 671 · 1 current · 1 all-time
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
Name/description match the code: it uses OpenRouter (GPT judge) and Langfuse to score traces. Requesting OPENROUTER_API_KEY and Langfuse keys is consistent with the described function. However, the code contains hardcoded Langfuse keys and host values, which undermines the declared requirement model: the skill claims to require env vars but will fall back to embedded credentials.
! Instruction Scope
SKILL.md instructs running the included Python script. The script, however, attempts to read ~/.openclaw/workspace/.env for the OpenRouter key (a config path not declared in metadata) and uses hardcoded Langfuse credentials/host to call the Langfuse API. Reading an undeclared workspace .env can expose other secrets stored there, and always posting scores to a hardcoded Langfuse endpoint (with embedded keys) could transmit data to an unexpected third-party account.
Install Mechanism
There is no install spec. The skill includes a Python script but does not declare its Python package dependencies (requests, openai, langfuse). That is a coherence/usability issue (the script may fail). The lack of an install step is not itself malicious, but it lowers installation auditability and increases risk, because it is unclear what packages users will install to run the script.
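Since the dependencies are undeclared, one way to see what is missing before running anything is to probe the import path. The following is a minimal sketch, not part of the skill; the helper name `missing_packages` is hypothetical, and the package list is taken from the scan finding above.

```python
import importlib.util

# Packages the scan report says the script needs but does not declare.
REQUIRED_PACKAGES = ("requests", "openai", "langfuse")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the package names that cannot be found on the import path.

    find_spec() only locates the package; it does not import or execute it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_packages()` in the environment where you plan to run the script returns a list of the names you would still need to install, which you can then review and pin explicitly.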
! Credentials
Declared env vars (OPENROUTER_API_KEY, LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY) are appropriate for the stated purpose. However, the script: (1) sets default LANGFUSE keys in code, (2) hardcodes LF_AUTH and LF_API values rather than reading the environment, and (3) attempts to parse ~/.openclaw/workspace/.env if OPENROUTER_API_KEY is not set. These behaviors mean the skill can use embedded credentials and read an undeclared local .env file, which is disproportionate and suspicious.
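For contrast, a configuration loader that avoids all three behaviors might look like the sketch below. It is an illustration under stated assumptions, not the skill's code: the function name `load_config` is hypothetical, and `LANGFUSE_HOST` is an assumed variable name that the skill metadata does not declare.

```python
import os

# The three variables declared in the skill metadata.
REQUIRED_ENV = ("OPENROUTER_API_KEY", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")
# Assumed name for the Langfuse endpoint; not declared by the skill.
HOST_ENV = "LANGFUSE_HOST"


def load_config(environ=None):
    """Read credentials and host from the environment only.

    No in-code defaults and no .env file parsing: a missing value
    raises instead of silently falling back to something embedded.
    """
    env = os.environ if environ is None else environ
    names = REQUIRED_ENV + (HOST_ENV,)
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}
```

Failing fast on a missing variable makes it obvious which account and endpoint will receive trace data, because nothing can be sent until the operator has supplied every value.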
Persistence & Privilege
The skill is not force-included (always=false) and does not request persistent platform privileges. It does not attempt to modify other skills or global agent configuration. Autonomy is enabled by default but is not an additional red flag here.
What to consider before installing
This skill largely does what its README says, but there are several red flags you should resolve before running it in a production environment:

1) The script contains hardcoded Langfuse API keys and a hardcoded Langfuse host and uses those values directly. That could send your trace data to a third-party account (or allow the script to act using somebody else's account). Treat those embedded keys as suspicious and do not rely on them.
2) The script will attempt to read ~/.openclaw/workspace/.env for an OPENROUTER_API_KEY if you don't set one in the environment; that file may contain unrelated secrets. The skill metadata did not declare that config path.
3) Dependencies (requests, openai, langfuse) are not declared; running without knowing what will be installed is fragile.

Recommended actions before installing/using:

- Inspect the evaluator.py file fully (remove or rotate any embedded keys).
- Replace hardcoded LF_AUTH/LF_API with explicit env-based configuration and ensure the host points to a Langfuse instance you control.
- Avoid running the script as-is on systems with sensitive ~/.openclaw/workspace/.env files; run it in an isolated test environment or container first.
- If you need to trust this skill, ask the publisher to provide a version that reads credentials only from declared env vars (no defaults), documents required Python packages, and documents exactly which endpoints will receive data.

If the publisher confirms the embedded keys are inert placeholders and the code is changed to respect environment values only, the concerns would be reduced.
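As a starting point for inspecting evaluator.py, a small scan for key-like string literals can help locate embedded credentials. This is a rough sketch: the helper name `find_embedded_keys` is hypothetical, and the prefixes in the pattern (`pk-lf-`, `sk-lf-`, `sk-or-`) are assumptions about how Langfuse and OpenRouter keys commonly look, so a clean result does not prove the file is free of secrets.

```python
import re

# Assumed key prefixes; adjust to the formats your providers actually issue.
KEY_PATTERN = re.compile(r"\b(?:pk-lf-|sk-lf-|sk-or-)[A-Za-z0-9_-]{8,}")


def find_embedded_keys(source_text):
    """Return (line_number, matched_text) pairs for key-like literals."""
    hits = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        for match in KEY_PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

To use it, read the script's contents into a string (for example with `open("evaluator.py", encoding="utf-8").read()`) and pass that string to `find_embedded_keys`. It complements, rather than replaces, reading the file in full.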

Like a lobster shell, security has layers — review code before you run it.

evaluation vk9760sebw2sa2007zts5bnm45s8178hg · latest vk9760sebw2sa2007zts5bnm45s8178hg · quality vk9760sebw2sa2007zts5bnm45s8178hg

Runtime requirements

Bins: python3
Env: OPENROUTER_API_KEY, LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY
