Skill flagged — suspicious patterns detected
ClawHub Security flagged this skill as suspicious. Review the scan results before using.
modelshow
v1.0.1 · Blind multi-model comparison with architecturally guaranteed de-anonymization. Trigger with "mdls" or "modelshow" for double-blind evaluation of AI model res...
⭐ 1 · 420 · 2 current · 2 all-time
by Sky Sloane (@schbz)
MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Suspicious
Medium confidence

Purpose & Capability
The name/description (double-blind multi-model evaluation) matches the code and instructions: scripts anonymize responses, call a judge, de-anonymize outputs, and save results. The details (config.json listing model aliases, judge model, timeouts, outputDir) are consistent with that purpose. The skill reads OpenClaw agent config (~/.openclaw/openclaw.json) to resolve model aliases and writes results to a user-writable output directory — this is expected for producing human-friendly output, but it does mean the skill touches user config and home-directory storage which is beyond pure in-memory evaluation.
Instruction Scope
SKILL.md instructs the orchestrator to: fetch external content referenced by prompts (URLs, files, preferences) and prepend it to model tasks; read and write config.json under the skill baseDir; and spawn judge sub-agents, instructing the judge to run local commands (piping JSON into judge_pipeline.py). Those instructions permit reading user files and fetching external URLs, operations that go beyond simply sending prompts to models and could surface private data. The documentation and scripts also claim an 'architectural guarantee' that the orchestrator never sees placeholder labels, but the anonymize operation (judge_pipeline.py / blind_judge_manager.py) returns an explicit anonymization_map/reverse_map to the caller, which would allow the orchestrator to deanonymize. That is a functional contradiction between the stated guarantee and the actual code and instructions.
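A minimal sketch of why this is a contradiction, using hypothetical function and variable names (the skill's actual scripts may differ): any caller of an anonymize step that returns its reverse map holds the de-anonymization key.

```python
import secrets

def anonymize(responses):
    """Assign each model response a blind placeholder label.

    The flaw flagged above: reverse_map is returned to the caller,
    so the orchestrator holds the key needed to de-anonymize.
    """
    # Shuffle assignment so labels carry no positional information.
    models = sorted(responses, key=lambda _: secrets.randbelow(10**9))
    anonymized = {}
    reverse_map = {}
    for i, model in enumerate(models):
        label = f"Model-{chr(65 + i)}"
        anonymized[label] = responses[model]
        reverse_map[label] = model
    return anonymized, reverse_map

blind, key = anonymize({"model-x": "answer 1", "model-y": "answer 2"})
# Any caller can now map a judged label straight back to the model:
assert key["Model-A"] in ("model-x", "model-y")
```

Whether the real scripts expose the map under the name anonymization_map or reverse_map, the effect is the same: blinding only holds against parties who never receive that return value.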
Install Mechanism
The skill bundle contains no remote install steps or downloads: the install spec is instruction-only, with local Python scripts included. There are no brew/npm downloads or network-install commands. The code shipped with the skill is installed locally when the skill is added; no remote code is fetched at install time.
Credentials
The skill declares no required environment variables or external credentials, which is appropriate. However, runtime behavior reads ~/.openclaw/openclaw.json (to resolve model aliases) and writes results to a home directory outputDir by default. Those file accesses are proportionate to the described features (alias resolution, saving reports), but they are also access to user configuration and filesystem that users should be aware of. Importantly, the pipeline returns anonymization_map data to the orchestrator during the anonymize phase, undermining the claimed 'orchestrator never sees placeholders' guarantee — that exposes mappings that can deanonymize results if retained.
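The alias-resolution step described above might look like the following sketch. The path is the one named in the scan, but the JSON layout ("models"/"aliases" keys) is an assumption for illustration, not the real openclaw.json schema.

```python
import json
from pathlib import Path

# Path named in the scan report; the schema below is hypothetical.
CONFIG_PATH = Path.home() / ".openclaw" / "openclaw.json"

def resolve_alias(alias: str, config_path: Path = CONFIG_PATH) -> str:
    """Map a model alias to a concrete model id.

    This reads user configuration from the home directory, which is
    exactly the filesystem access users should be aware of.
    """
    config = json.loads(config_path.read_text())
    aliases = config.get("models", {}).get("aliases", {})
    return aliases.get(alias, alias)  # fall back to the literal name
```

The point of the sketch is the access pattern, not the schema: anything else stored in that config file is readable by the same call.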
Persistence & Privilege
The skill does not request always:true or other elevated installation privileges. It writes results to an outputDir under the user's home by design, and offers an optional utility to copy JSON/MD to a web directory. Those behaviors involve filesystem persistence but are within the scope of expected functionality (saving reports). There is no code that modifies other skills or system-wide agent settings.
What to consider before installing
The skill appears to implement the described multi-model blind-judging workflow, but take these precautions before installing or running it:

1. Understand the anonymity tradeoff: despite wording claiming the orchestrator never sees placeholders, the anonymize step returns an anonymization_map (placeholder→model) to the orchestrator. Keep that map secret, or modify the workflow if you require stronger guarantees.
2. Review and edit config.json: remove any model aliases you don't want queried and set outputDir to a safe location.
3. Be aware the skill reads your OpenClaw config (~/.openclaw/openclaw.json) to resolve model aliases and writes reports to your home directory; if that file contains sensitive info, inspect save_results.py and the blind_judge scripts first.
4. External-content behavior: SKILL.md says the agent will fetch referenced URLs or files and prepend them to prompts. If you want to avoid exposing local or remote data, disable this behavior or run the skill only on non-sensitive prompts.
5. If you require the stronger property that nobody except the judge can deanonymize, either remove the return of anonymization_map from the anonymize phase, or ensure anonymize and finalize execute inside a trusted atomic environment that never exposes the map to the orchestrator.
6. If you are unsure, run the skill on harmless test prompts first and manually inspect the outputs and saved files.

Like a lobster shell, security has layers — review code before you run it.
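The stronger property in point 5 can be sketched as follows, with hypothetical names rather than the skill's actual API: anonymize, judge, and de-anonymize inside one trusted scope, returning only the final verdict, so the map never crosses the boundary to the orchestrator.

```python
import secrets

def blind_judge(responses, judge):
    """Anonymize, judge, and de-anonymize in one trusted scope.

    `responses` maps model name -> response text; `judge` is a callable
    that sees only placeholder labels and returns the winning label.
    The placeholder->model map is a local variable and is never
    returned, so the caller only ever sees the final verdict.
    """
    models = sorted(responses, key=lambda _: secrets.randbelow(10**9))
    labels = {f"Model-{chr(65 + i)}": m for i, m in enumerate(models)}
    blinded = {label: responses[model] for label, model in labels.items()}
    winning_label = judge(blinded)   # judge sees labels only
    return labels[winning_label]     # map used here, then discarded

# Usage: a toy judge that prefers the longer answer.
verdict = blind_judge(
    {"model-x": "short", "model-y": "a much longer answer"},
    judge=lambda blinded: max(blinded, key=lambda k: len(blinded[k])),
)
print(verdict)  # the real model name of the winner: "model-y"
```

This is the "atomic environment" shape: the map lives and dies inside one function call, so retaining it would require modifying the code rather than merely reading a return value.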
latest: vk97c89y4h1t37dmgsmqqsn4qax82hfpy
Runtime requirements
🕶️ Clawdis
