Model Cost Advisor

v1.0.0

Analyze any task and recommend the most cost-effective LLM — with live pricing data from 30+ models, tier analysis, token estimation, and projected cost. Per...

by Maya Tao (@minirr890112-byte)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for minirr890112-byte/model-cost-advisor.

Prompt Preview: Install & Setup
Install the skill "Model Cost Advisor" (minirr890112-byte/model-cost-advisor) from ClawHub.
Skill page: https://clawhub.ai/minirr890112-byte/model-cost-advisor
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install model-cost-advisor

ClawHub CLI


npx clawhub@latest install model-cost-advisor
Security Scan
Capability signals
Crypto: Can make purchases
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the implementation: the two scripts fetch pricing from litellm's JSON and perform task analysis, token estimation, and cost computation. There are no unrelated credentials, binaries, or services requested.
Instruction Scope
SKILL.md instructs only to run fetch_pricing.py and advise.py with a task string; the runtime instructions do not direct the agent to read or exfiltrate unrelated files or environment variables. The only optional integration is comparing against HERMES_CURRENT_MODEL (an expected, optional env var).
Install Mechanism
No install spec; scripts are instruction-only. fetch_pricing.py downloads a JSON from raw.githubusercontent.com (a well-known release host) — reasonable for live pricing. It writes a cache file to ~/.hermes/model_pricing.json, which is expected behavior but worth noting because it creates files in the user's home directory.
Credentials
The skill declares no required environment variables or credentials. The only env referenced in docs is HERMES_CURRENT_MODEL for optional comparison; no secrets (API keys, tokens) are requested or used by the provided code.
Persistence & Privilege
always:false (normal). The skill persists a cache to ~/.hermes/model_pricing.json and creates the directory if needed — reasonable for caching but it does create a persistent file in the home directory. It does not alter other skills or system-wide agent settings.
Assessment
This skill appears to do what it claims: it downloads a community-maintained pricing JSON from GitHub and caches it in ~/.hermes, then analyzes task text locally to recommend models. Before installing/run: (1) verify you trust the litellm source (the script fetches raw.githubusercontent.com/BerriAI/litellm/...); (2) review the cached file (~/.hermes/model_pricing.json) if you want to inspect the data before use; (3) if you prefer no persistent files, run the scripts in a disposable environment or delete the ~/.hermes directory after use; (4) be aware it may optionally read HERMES_CURRENT_MODEL for comparison — set that only if you intend to share your current model. No credentials are required by the skill.

Like a lobster shell, security has layers — review code before you run it.

latest vk970sjff7xe2ct2dxgs3jdme5x85j5b5
51 downloads · 0 stars · 1 version
Updated 2d ago · v1.0.0 · MIT-0

🤖 Model Cost Advisor

Pick the most cost-effective LLM for any task — before you start spending.

Why pay Claude Opus prices for a task DeepSeek can handle? This skill analyzes your task, maps it to a capability tier, and finds the cheapest model that gets the job done well.


Quick Start

# 1. Fetch live pricing (one-time, auto-cached for 48h)
python scripts/fetch_pricing.py

# 2. Get a recommendation
echo "Write a REST API with FastAPI, handle auth and rate limiting" | python scripts/advise.py

# 3. Or pass task directly
python scripts/advise.py --task "Refactor a 2000-line Python class into smaller modules"

# 4. Compare all models side-by-side
python scripts/advise.py --compare

# 5. JSON output for scripting
python scripts/advise.py --task "Debug a race condition" --json

What It Does

  1. Analyzes your task description for complexity signals (reasoning depth, code needs, context length, agentic loops, domain expertise)
  2. Maps to one of 4 capability tiers: Budget → Standard → Advanced → Premium
  3. Estimates token usage based on task complexity
  4. Scores 30+ models using live pricing from litellm's community DB
  5. Recommends the top 3 models with projected cost, rationale, and pitfalls
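
The signal-detection part of step 1 can be sketched roughly as follows. The keyword lists here are illustrative placeholders, not the actual signal definitions in scripts/advise.py:

```python
import re

# Hypothetical keyword lists; the real signal set lives in scripts/advise.py.
SIGNALS = {
    "multi_step_logic": ["refactor", "debug", "design", "architecture"],
    "complex_code": ["api", "concurrency", "rate limiting", "full-stack"],
    "long_context": ["2000-line", "research paper", "entire codebase"],
}

def detect_signals(task: str) -> list[str]:
    """Return the complexity signals whose keywords appear in the task text."""
    text = task.lower()
    found = []
    for signal, keywords in SIGNALS.items():
        if any(re.search(r"\b" + re.escape(kw) + r"\b", text) for kw in keywords):
            found.append(signal)
    return found

print(detect_signals("Refactor a 2000-line Python class into smaller modules"))
# → ['multi_step_logic', 'long_context']
```

Matching on word boundaries (rather than raw substrings) keeps short keywords from firing inside longer words.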

The Four Tiers

| Tier | When to Use | Example Tasks | Typical Cost |
| --- | --- | --- | --- |
| 💰 Budget | Simple Q&A, classification, formatting, basic scripts | "Summarize this text", "Format JSON" | <$0.01 |
| 📦 Standard | Multi-step reasoning, medium code, structured output | "Write a web scraper", "Explain a concept" | $0.01–$0.10 |
| 🚀 Advanced | Complex code, architecture design, agentic loops | "Build a full-stack app", "Debug concurrency" | $0.10–$1.00 |
| 👑 Premium | Frontier reasoning, research, >128K context | "Research paper analysis", "Safety-critical code" | $1.00+ |

Models Tracked

30+ models across 6 providers, updated from litellm's community DB:

| Provider | Models |
| --- | --- |
| Anthropic | Claude Opus 4 / 4.1 / 4.5 / 4.6 / 4.7, Sonnet 4 / 4.5 / 4.6, Haiku 3.5 |
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4.1 / 4.1-mini / 4.1-nano, o3 / o3-mini / o4-mini |
| Google | Gemini 2.0 Flash, 2.5 Flash / Pro |
| DeepSeek | V3 / V3.1 / V3.2, R1 (with reasoning token warning) |
| Alibaba | Qwen Turbo / Plus / Max / Coder-Plus / 3-235B |
| Mistral | Ministral 3B / 8B / 14B |

Example Output

╔══════════════════════════════════════════════════╗
║        🤖 Model Cost Advisor                      ║
╚══════════════════════════════════════════════════╝

🎯 Task Analysis
   Complexity Tier: 3 (Advanced)
   Est. Input:  ~24K tokens
   Est. Output: ~10K tokens
   Signals: multi_step_logic, complex_code, multi_turn_tools

💰 Top Recommendations
   Rank  Model                  Cost     Input $/M Output $/M
   ───── ────────────────────── ────────  ──────── ─────────
   🥇    deepseek-v3            $0.0175     0.28     0.42
   🥈    deepseek-v3.1          $0.0216     0.27     1.10
   🥉    gemini-2.5-flash       $0.0322     0.30     2.50

📋 Why deepseek-v3?
   Tier 3 task → best value in tier 1
   Estimated total cost: $0.0175
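
The cost column follows from the per-million prices in the table. A minimal sketch of the arithmetic, using deepseek-v3's listed rates; note the script itself reports $0.0175 for this task, so its internal estimate evidently includes more than this bare formula (an assumption on our part):

```python
def projected_cost(in_tokens: int, out_tokens: int,
                   in_price_per_m: float, out_price_per_m: float) -> float:
    """Projected spend: tokens / 1M times the per-million price, input plus output."""
    return in_tokens / 1e6 * in_price_per_m + out_tokens / 1e6 * out_price_per_m

# deepseek-v3 at $0.28/M input and $0.42/M output, for ~24K in / ~10K out:
cost = projected_cost(24_000, 10_000, 0.28, 0.42)
print(f"${cost:.4f}")  # → $0.0109
```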

How the Agent Uses This Skill

When loaded by Hermes, the agent follows these steps:

Step 1: Analyze Task Requirements

Classify the task along these dimensions to determine the minimum capability tier needed:

| Dimension | Weight | What to Assess |
| --- | --- | --- |
| Reasoning Depth | High | Simple lookup → multi-step logic → deep chain-of-thought |
| Code Generation | Medium | None → simple scripts → multi-file complex → architecture design |
| Context Length | Medium | <4K → 4K–32K → 32K–128K → >128K tokens |
| Tool Use / Agentic | High | Single shot → multi-turn tools → autonomous agent loop |
| Domain Expertise | Low | General → specialized (math, legal, medical, Chinese content) |
| Output Quality | Medium | Draft OK → production → customer-facing critical |
| Latency | Low | Batch OK → real-time interactive |
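
A minimal sketch of how such a weighted rubric could collapse to a single tier. The numeric weights, the 0–3 score scale, and the rounding rule are all assumptions, not the actual logic in scripts/advise.py:

```python
# Hypothetical weights (High=3, Medium=2, Low=1) matching the table above.
WEIGHTS = {"reasoning": 3, "code": 2, "context": 2, "agentic": 3,
           "domain": 1, "quality": 2, "latency": 1}

def capability_tier(scores: dict[str, int]) -> int:
    """Map per-dimension scores (0-3 each) to a capability tier 1-4."""
    total = sum(WEIGHTS[d] * s for d, s in scores.items())
    max_total = sum(3 * w for w in WEIGHTS.values())
    # Scale the 0..1 weighted ratio onto tiers 1..4.
    return 1 + round(3 * total / max_total)

tier = capability_tier({"reasoning": 2, "code": 3, "context": 1, "agentic": 2,
                        "domain": 0, "quality": 2, "latency": 0})
print(tier)  # → 3
```

With these example scores the rubric lands on tier 3 (Advanced), the same tier shown in the example output above.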

Step 2: Estimate Token Usage

| Task Complexity | Input Tokens | Output Tokens |
| --- | --- | --- |
| Trivial (single Q&A) | 500 – 2K | 200 – 1K |
| Simple (few exchanges) | 2K – 8K | 1K – 4K |
| Medium (multi-turn agent, 5–10 tools) | 8K – 40K | 4K – 16K |
| Complex (deep agent, 10–30 tools) | 40K – 150K | 16K – 50K |
| Heavy (autonomous loop, 30+ tools) | 150K – 500K+ | 50K – 200K+ |
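
Assuming the estimator simply takes band midpoints (the medium band's midpoints, 24K in / 10K out, do match the figures in the example output above), this step reduces to a table lookup:

```python
# Midpoint estimates per complexity band (input_tokens, output_tokens),
# derived from the ranges in the table above; the real script may differ.
TOKEN_BANDS = {
    "trivial": (1_250, 600),
    "simple": (5_000, 2_500),
    "medium": (24_000, 10_000),
    "complex": (95_000, 33_000),
    "heavy": (325_000, 125_000),
}

def estimate_tokens(complexity: str) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) midpoints for a complexity band."""
    return TOKEN_BANDS[complexity]

print(estimate_tokens("medium"))  # → (24000, 10000)
```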

Step 3: Run Scripts

# Ensure pricing is fresh
python scripts/fetch_pricing.py

# Get recommendation
python scripts/advise.py --task "<user's task description>"

Step 4: Present Recommendation

Format the output with:

  1. Task complexity analysis
  2. Top 3 model picks with cost
  3. Comparison vs user's current model (if known)
  4. Any pitfalls (R1 reasoning tokens, context window limits, etc.)

Pitfalls to Warn Users About

Script internals (for maintenance):

  • Tier keys in pricing JSON are strings, not ints: the pricing_cache dict uses "1", not 1. The advise script casts them internally, but direct lookups must match.
  • Keyword matching order matters — put longer, more specific keywords (e.g., 'production') before shorter, ambiguous ones ('pr') to avoid substring false positives. Better still, split on word boundaries.
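
Both maintenance notes can be sketched together; pricing_cache and the tokenizing regex below are illustrative, not the script's actual code:

```python
import re

# Tier keys in the cached pricing JSON are strings, so cast before lookup:
#   models = pricing_cache[str(tier)]   # "1", not 1

def match_keywords(text: str, keywords: list[str]) -> set[str]:
    """Match on whole words so a short keyword like 'pr' cannot fire inside 'production'."""
    words = set(re.findall(r"[a-z0-9'+-]+", text.lower()))
    return {kw for kw in keywords if kw in words}

print(match_keywords("Deploy to production", ["production", "pr"]))  # → {'production'}
```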

User-facing pitfalls:

  1. R1/o3 reasoning tokens are hidden: The sticker price hides the model's chain-of-thought output, which bills as output tokens. Real cost can run 3–5× higher for reasoning models.
  2. Context is not free: Long-context models (e.g. Gemini's 1M window) bill every token you put in the prompt, whether the model needs it or not.
  3. Tool calls compound cost: Every agentic round-trip re-sends the system prompt, tool definitions, and results. An agentic task can easily cost 5× the naive estimate.
  4. Cached prefixes save money: System prompts and cached prefixes bill at 10–25% of the normal input rate — factor this in for repetitive tasks.
  5. Chinese-language tasks: DeepSeek and Qwen outperform their price tier on Chinese content. Western models cost more for equivalent quality.
  6. Pricing changes frequently: Run fetch_pricing.py before important decisions. Cache TTL is 48 hours.
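
Pitfall 1 in numbers, using purely illustrative figures (the 3× reasoning overhead and the $8/M output price are assumptions, not any model's real rates):

```python
# Illustrative only: visible answer tokens vs hidden reasoning tokens.
visible_out = 2_000       # output tokens you actually see
reasoning_out = 6_000     # hidden chain-of-thought, billed as output (assumed 3x)
out_price_per_m = 8.00    # hypothetical $/M output price for a reasoning model

naive = visible_out / 1e6 * out_price_per_m
real = (visible_out + reasoning_out) / 1e6 * out_price_per_m
print(f"naive ${naive:.3f} vs real ${real:.3f} ({real / naive:.0f}x)")
# → naive $0.016 vs real $0.064 (4x)
```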

Scripts

  • scripts/fetch_pricing.py — Fetches live pricing from litellm DB, normalizes to canonical model names, caches for 48h.
  • scripts/advise.py — Task complexity analysis + model recommendation engine with colorized terminal output.
