Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results below before using it.

MetriLLM

Find the best local LLM for your machine. Tests speed, quality and RAM fit, then tells you if a model is worth running on your hardware.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 211 · 0 current installs · 0 all-time installs
Security Scan
VirusTotal
Suspicious
OpenClaw
Benign
high confidence
Purpose & Capability
The name and description match the instructions: the skill tells you how to install the metrillm CLI, requires Node 20+ and a local LLM server (Ollama or LM Studio), and runs benchmarking commands. No unrelated credentials, binaries, or config paths are requested.
Instruction Scope
Instructions stay within the benchmarking scope (run metrillm bench, view local ~/.metrillm/results/). One caution: the optional --share command uploads results (model name, scores, hardware specs) to metrillm.dev; the SKILL.md states no personal data is sent, but that claim cannot be verified from instructions alone. The skill does not instruct access to unrelated files or env vars.
Install Mechanism
Installation is via npm (npm install -g metrillm) which is a standard delivery for a Node CLI. npm installs are moderate-risk because they execute third-party code on your system; this is proportionate to the stated purpose but you should inspect the package or source repository before global installation.
Credentials
No environment variables, credentials, or config paths are required. The only data potentially exported is from the explicit --share action (model, scores, hardware specs), which is reasonable for a community leaderboard.
Persistence & Privilege
The skill is not always-enabled and is user-invocable. It does not request persistent elevated privileges or modify other skills. Autonomous invocation is permitted by default (normal), but nothing in the skill attempts to gain extra persistence.
Assessment
This skill is coherent for benchmarking local LLMs, but take two precautions before installing: (1) review the npm package / GitHub repo (https://github.com/MetriLLM/metrillm) or audit the package contents before running npm install -g, since global npm installs execute third-party code on your machine; (2) only use --share if you consent to publishing model names, scores and hardware details (the README says no personal data is sent, but verify what the package actually uploads). Also ensure you have Node 20+ and run Ollama or LM Studio locally as instructed.

Like a lobster shell, security has layers — review code before you run it.

Current version: v0.2.11
latest: vk97f9yq4m2ek6caayshejwn2ms82a5ey

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

MetriLLM — Find the Best LLM for Your Hardware

Test any local model and get a clear verdict: is it worth running on your machine?

Prerequisites

  1. Node.js 20+ — check with node -v
  2. Ollama or LM Studio installed and running
  3. MetriLLM CLI — install globally:
npm install -g metrillm

Usage

List available models

ollama list

Run a full benchmark

metrillm bench --model $ARGUMENTS --json

This measures:

  • Performance: tokens/second, time to first token, memory usage
  • Quality: reasoning, math, coding, instruction following, structured output, multilingual
  • Fitness verdict: EXCELLENT / GOOD / MARGINAL / NOT RECOMMENDED

Performance-only benchmark (faster)

metrillm bench --model $ARGUMENTS --perf-only --json

Skips quality evaluation — measures speed and memory only.

View previous results

ls ~/.metrillm/results/

Read any JSON file to see full benchmark details.
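A minimal Python sketch of reading the newest result file. The JSON field names used here (model, verdict, tokensPerSecond, ttft, memoryUsedGB) are assumptions based on the metrics this SKILL.md mentions, not a documented schema; check a real results file for the actual keys.

```python
# Sketch: load the most recent MetriLLM benchmark result and print a
# one-line summary. Field names are assumed from this SKILL.md, not a
# published schema -- inspect a real file in ~/.metrillm/results/ first.
import json
from pathlib import Path

def load_latest(results_dir: str) -> dict:
    """Return the most recently written result JSON, or {} if none exist."""
    files = sorted(Path(results_dir).glob("*.json"),
                   key=lambda p: p.stat().st_mtime)
    if not files:
        return {}
    return json.loads(files[-1].read_text())

def summarize(result: dict) -> str:
    """One-line summary of the metrics this SKILL.md highlights."""
    return (f"{result.get('model', '?')}: "
            f"{result.get('tokensPerSecond', 0):.1f} tok/s, "
            f"ttft {result.get('ttft', 0)} ms, "
            f"{result.get('memoryUsedGB', 0)} GB used, "
            f"verdict {result.get('verdict', 'UNKNOWN')}")

if __name__ == "__main__":
    result = load_latest(str(Path.home() / ".metrillm" / "results"))
    print(summarize(result) if result else "no results yet")
```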

Share to the public leaderboard

metrillm bench --model $ARGUMENTS --share

Uploads your result to the MetriLLM community leaderboard — an open, community-driven ranking of local LLM performance across real hardware. Compare your results with others and help the community find the best models for every setup. Shared data includes: model name, scores, hardware specs (CPU, RAM, GPU). No personal data is sent.

Interpreting Results

| Verdict | Score | Meaning |
| --- | --- | --- |
| EXCELLENT | >= 80 | Fast and accurate — great fit |
| GOOD | >= 60 | Solid — suitable for most tasks |
| MARGINAL | >= 40 | Usable but with tradeoffs |
| NOT RECOMMENDED | < 40 | Too slow or inaccurate |
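The score bands above map to verdicts mechanically. A minimal Python sketch (thresholds taken from the table; the function name is my own, not part of the metrillm CLI):

```python
def verdict(score: float) -> str:
    """Map a 0-100 benchmark score to the verdict bands in the table above."""
    if score >= 80:
        return "EXCELLENT"
    if score >= 60:
        return "GOOD"
    if score >= 40:
        return "MARGINAL"
    return "NOT RECOMMENDED"
```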

Key metrics to highlight:

  • tokensPerSecond > 30 = good for interactive use
  • ttft < 500ms = responsive
  • memoryUsedGB vs available RAM = will it fit?
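The three rules of thumb above can be applied in one pass. A sketch under the assumption that you have already extracted the raw numbers from a results file (the function and note strings are my own):

```python
def fit_notes(tokens_per_second: float, ttft_ms: float,
              memory_used_gb: float, available_ram_gb: float) -> list[str]:
    """Apply the rule-of-thumb thresholds from the list above:
    > 30 tok/s for interactive use, < 500 ms ttft for responsiveness,
    and memory used must fit in available RAM."""
    notes = []
    notes.append("interactive-speed" if tokens_per_second > 30 else "slow for chat")
    notes.append("responsive" if ttft_ms < 500 else "laggy first token")
    notes.append("fits in RAM" if memory_used_gb <= available_ram_gb else "exceeds RAM")
    return notes
```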

Tips

  • Use --perf-only for quick tests
  • Close GPU-intensive apps before benchmarking
  • Benchmark duration varies depending on model speed and response length

Open Source

MetriLLM is free and open source (Apache 2.0). Contributions, issues, and feedback are welcome: github.com/MetriLLM/metrillm

