AI Intelligence Hub - Real-time Model Capability Tracking

Pass. Audited by VirusTotal on May 11, 2026.

Overview

Type: OpenClaw Skill
Name: model-benchmarks
Version: 1.0.0

The OpenClaw AgentSkills skill bundle 'model-benchmarks' is classified as benign. The code and documentation consistently align with its stated purpose of tracking AI model capabilities and optimizing costs. While the `scripts/run.py` file imports `urllib.request` and defines external API URLs, the current implementation of data fetching functions (`fetch_lmsys_arena`, `fetch_bigcode_leaderboard`, `fetch_current_prices`) explicitly uses `mock_data` and does not make actual external network requests. File system operations are limited to the skill's own directory or legitimate OpenClaw internal configuration paths (`~/.openclaw/workspace/skills/compute-router/dynamic_config.json`).

The markdown files (`SKILL.md`, `README.md`, `examples/integration-examples.md`) contain clear instructions and examples for users, including shell commands and Python snippets, but these do not contain any prompt injection attempts, unauthorized commands, or instructions for data exfiltration. The `curl` command in an example is for user-configured Slack alerts, not malicious exfiltration. No obfuscation or persistence mechanisms are present within the skill's core logic.

Findings (0)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Users may trust recommendations as current leaderboard intelligence when they are based on hardcoded sample data, potentially changing model routing or spending based on inaccurate information.

Why it was flagged

The main data-fetching implementation says the real HuggingFace parsing is still TODO and returns mock data, while the skill is described as real-time benchmark tracking.

Skill content
# TODO: the actual implementation needs to parse the HuggingFace Space data ... # provide mock data here for now ... mock_data = {
Recommendation

Label the skill as using sample/offline data until real fetching is implemented, include verifiable source timestamps, and avoid unsupported cost-saving claims.
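
One way to apply this labeling is sketched below in Python, assuming the fetch functions return a plain dictionary. The helper names `label_dataset` and `describe` and the fields `source`, `is_live`, and `retrieved_at` are hypothetical; none of them come from the skill's code.

```python
from datetime import datetime, timezone


def label_dataset(data, source, is_live, retrieved_at=None):
    """Wrap benchmark data with provenance so callers can tell mock from live.

    All field names here are illustrative assumptions, not the skill's schema.
    """
    if retrieved_at is None:
        retrieved_at = datetime.now(timezone.utc)
    return {
        "source": source,
        "is_live": is_live,
        "retrieved_at": retrieved_at.isoformat(),
        "data": data,
    }


def describe(labeled):
    """Return a one-line banner to print above any recommendation."""
    kind = "live" if labeled["is_live"] else "SAMPLE/OFFLINE"
    return f"[{kind}] source={labeled['source']} retrieved_at={labeled['retrieved_at']}"
```

Printing the banner next to every recommendation makes it hard to mistake hardcoded sample data for current leaderboard results.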

Concern (Medium Confidence)
ASI02: Tool Misuse and Exploitation
What this means

Future OpenClaw runs could use an unintended, lower-quality, or invalid model setting until the user notices and reverts it.

Why it was flagged

The example automatically changes the default OpenClaw model based on command output, without a confirmation, validation, or rollback step.

Skill content
EFFICIENT_MODEL=$(python3 skills/model-benchmarks/scripts/run.py recommend --task general --sort efficiency | head -1) ... openclaw config set agents.defaults.model.primary "$EFFICIENT_MODEL"
Recommendation

Require explicit approval before changing global model configuration, parse structured JSON output, validate the chosen model, show old/new values, and document rollback commands.
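
The steps in this recommendation could look roughly like the Python sketch below. It assumes the `recommend` command can emit a JSON object with a `model` key, which the skill's documentation does not confirm. The function names and the allow-list are hypothetical. The `openclaw config set agents.defaults.model.primary` command is taken from the skill's own example and is only assembled here, never executed.

```python
import json


def plan_model_change(raw_output, current_model, allowed_models):
    """Parse a JSON recommendation and return a reviewable change plan.

    Raises ValueError rather than guessing when the output is malformed
    or names a model outside the allow-list.
    """
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"recommendation is not valid JSON: {exc}") from exc

    # Assumed output shape: {"model": "<name>"}; adjust to the real schema.
    model = parsed.get("model") if isinstance(parsed, dict) else None
    if not isinstance(model, str) or model not in allowed_models:
        raise ValueError(f"unrecognized model: {model!r}")

    key = "agents.defaults.model.primary"
    return {
        "old": current_model,
        "new": model,
        "changed": model != current_model,
        # Argument lists avoid shell interpolation of the model name.
        "apply": ["openclaw", "config", "set", key, model],
        "rollback": ["openclaw", "config", "set", key, current_model],
    }


def confirm(plan, answer):
    """Return True only on an explicit 'yes' for a plan that changes something."""
    return plan["changed"] and answer.strip().lower() == "yes"
```

Showing `old`, `new`, and the `rollback` command before asking for confirmation gives the user what they need to revert a bad choice, which the `head -1` pipeline in the original example does not.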

Note (High Confidence)
ASI10: Rogue Agents
What this means

If installed as a cron job, the skill will keep running on schedule and writing logs until the user removes it.

Why it was flagged

The skill suggests user-created scheduled execution for daily updates; it is disclosed, but it is persistent automation.

Skill content
# Add this to your crontab to automatically optimize model selection ... python3 "$SKILL_DIR/scripts/run.py" fetch >> "$LOG_FILE" 2>&1
Recommendation

Only add the cron job if you want recurring execution, review the schedule, and set up log rotation or removal instructions.
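
For the log-growth part of this recommendation, one option is to let the script rotate its own log instead of appending through a shell redirect. The sketch below uses the standard library's `RotatingFileHandler`. The function name, logger name, size limit, and backup count are illustrative assumptions and are not taken from `scripts/run.py`.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_rotating_logger(log_path, max_bytes=1_000_000, backup_count=3):
    """Return a logger whose file is capped in size and number of backups."""
    logger = logging.getLogger(f"model-benchmarks:{log_path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output out of the root logger
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(
            log_path,
            maxBytes=max_bytes,
            backupCount=backup_count,
            encoding="utf-8",
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With these settings the log directory holds at most `backup_count + 1` files, so a forgotten cron job cannot fill the disk.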

What this means

It is harder to verify provenance or expected runtime requirements before running the included script.

Why it was flagged

The skill includes runnable code but lacks a declared source repository, homepage, install spec, or required Python binary declaration.

Skill content
Source: unknown; Homepage: none; No install spec — this is an instruction-only skill; Code file presence: scripts/run.py
Recommendation

Publish a source repository/homepage, declare Python as a runtime requirement, and document exactly which optional tools such as jq, curl, or bash are needed for examples.
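
Until such declarations exist, a user can check which optional tools are present before running the examples. The Python sketch below relies on the standard `shutil.which`; the function name `missing_tools` is hypothetical, and the tool list simply repeats the ones named in the recommendation above.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that are not found on PATH.

    `which` is injectable so the check can be exercised without
    depending on the local machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    absent = missing_tools(["jq", "curl", "bash"])
    if absent:
        print("Missing optional tools:", ", ".join(absent))
    else:
        print("All optional tools found.")
```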

What this means

Model cost or routing information may be sent to Slack, and a leaked webhook URL could allow unauthorized posting to that Slack channel.

Why it was flagged

An optional example posts model cost-change alerts to a Slack webhook using a user-provided webhook URL.

Skill content
curl -X POST -H 'Content-type: application/json' --data "{\"text\":\"🚨 AI Model Cost Alert: $COST_CHANGES\"}" "$SLACK_WEBHOOK_URL"
Recommendation

Store webhook URLs securely, limit what is included in alerts, and avoid posting sensitive usage or spend details unless the Slack workspace is appropriate for that data.
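
A minimal Python sketch of these precautions follows. It reads the webhook URL from the `SLACK_WEBHOOK_URL` environment variable used in the skill's example, caps the alert length, and builds the JSON body with `json.dumps` so that quotes or newlines in the alert text cannot break the payload, which the shell interpolation in the `curl` example does not guard against. The function names and the 300-character cap are assumptions, and the sketch only builds the request body without sending it.

```python
import json
import os

MAX_ALERT_CHARS = 300  # hypothetical cap on what an alert may contain


def get_webhook_url(env=None):
    """Read the webhook URL from the environment, never from source code."""
    env = os.environ if env is None else env
    url = env.get("SLACK_WEBHOOK_URL", "")
    if not url.startswith("https://"):
        raise ValueError("SLACK_WEBHOOK_URL is missing or not an https URL")
    return url


def build_alert_payload(cost_changes):
    """Return a size-limited, correctly escaped JSON body as bytes."""
    text = " ".join(str(cost_changes).split())  # collapse newlines and runs of spaces
    if len(text) > MAX_ALERT_CHARS:
        text = text[: MAX_ALERT_CHARS - 3] + "..."
    return json.dumps({"text": f"AI Model Cost Alert: {text}"}).encode("utf-8")
```

The returned bytes can be passed as the request body to any HTTP client; keeping the URL lookup separate makes it easier to avoid logging the webhook by accident.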