Agent Cost Strategy

Tiered model selection and cost optimization strategy for multi-agent AI workflows. Use when orchestrating sub-agents, choosing which model to use for a task...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name/description (tiered model selection and cost optimization for multi-agent workflows) matches the SKILL.md and reference doc. There are no unrelated required binaries, env vars, or config paths; all guidance relates to model choice, caching, heartbeats, and delegation rules.
Instruction Scope
Runtime instructions are limited to operational guidance: selecting model tiers, prompt/caching patterns, heartbeat timing, and delegation patterns for sub-agents. They do not instruct reading arbitrary files, exfiltrating data, or contacting unknown endpoints. The only provider-specific action is a suggested 55-minute heartbeat for Anthropic cache behavior, which is advisory and within scope.
Install Mechanism
There is no install spec and there are no code files. Because the skill is instruction-only, nothing is written to disk or downloaded; this is the lowest-risk install profile.
Credentials
The skill references provider-specific behavior (Anthropic, OpenAI, Google) and mentions Anthropic API keys in the heartbeat guidance, but it does not require or request any environment variables or credentials. This is reasonable for an advice-only skill, but any implementation that automates heartbeats or spawns sub-agents will need provider credentials — users should ensure those are supplied only where appropriate and not leaked.
Persistence & Privilege
The `always` flag is false and the skill is user-invocable. It does not request elevated persistence or attempt to modify other skills or system-wide settings. Autonomous invocation is allowed (the platform default), which is acceptable for this type of instruction skill.
Assessment
This skill is an advice document for picking model tiers and optimizing prompt caching — it is internally consistent and doesn't request credentials or install code. Before using it you should: (1) map the example model names to the actual SKUs and pricing of your provider, (2) verify provider cache TTLs and pricing (the 55-minute heartbeat is specific to Anthropic and may be wrong or costly for your setup), (3) avoid putting secrets into cached/shared prompts or sub-agent base prompts, and (4) if you implement automated heartbeats or spawned sub-agents, ensure those processes are audited and given only the minimum provider credentials necessary. If this guidance is converted into code or automation, review that code for credential handling and network endpoints before deploying.


Current version: v1.1.0 (latest: vk97a3wzjf9egx112mz50tg4mjh830qsn)

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

Agent Cost Strategy

A tiered model selection framework for multi-agent workflows. Use the cheapest model that can reliably do the job.

The Tiers

| Tier | Use For | Examples |
| --- | --- | --- |
| Fast/Cheap | Sub-agents, background workers, iterative fixes, well-defined single-step tasks | Claude Haiku, GPT-4o-mini, Gemini Flash |
| Mid-tier | Main dialogue, day-to-day assistance, moderate complexity tasks | Claude Sonnet, GPT-4o, Gemini Pro |
| Powerful | Architecture decisions, deep code reviews, hard problems, when cheaper models fail twice | Claude Opus, GPT-4.5, Gemini Ultra |

Decision Rules

Use Fast/Cheap when:

  • Task is well-scoped and single-step
  • Input/output is straightforward (fix this test, summarize this, run this command)
  • It's a background/automated task with no user interaction
  • You're running many parallel sub-agents

Use Mid-tier when:

  • Conversational context matters
  • Task requires moderate reasoning or multi-step thinking
  • This is the default for your main assistant session

Use Powerful when:

  • Cheaper models have failed 2+ times on the same problem
  • Making high-stakes architectural decisions
  • Deep code review or security audit
  • The cost of a wrong answer exceeds the cost of the model
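The decision rules above can be sketched as a small routing function. This is an illustrative sketch: the tier names, flags, and the two-failure threshold are taken from the rules in this document, not from any provider API.

```python
from enum import Enum

class Tier(str, Enum):
    FAST = "fast-cheap"
    MID = "mid-tier"
    POWERFUL = "powerful"

def choose_tier(*, single_step: bool, background: bool,
                high_stakes: bool, failures: int) -> Tier:
    """Pick the cheapest tier the task allows, per the decision rules."""
    # Escalate for high-stakes work or after 2+ failures on the same problem.
    if high_stakes or failures >= 2:
        return Tier.POWERFUL
    # Well-scoped single-step tasks and background/automated work go cheap.
    if single_step or background:
        return Tier.FAST
    # Default for interactive, moderately complex work.
    return Tier.MID
```

Note this jumps straight to Powerful after two failures, matching the "Use Powerful when" rule; the delegation table below instead escalates one tier at a time, which is the gentler variant.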

Sub-Agent Pattern

When delegating to a sub-agent, default to the cheapest model that fits the task:

Task type               → Model tier
─────────────────────────────────────
Fix failing tests       → Fast/Cheap
Write boilerplate       → Fast/Cheap
Research/search         → Fast/Cheap
Cron/scheduled tasks    → Fast/Cheap (always)
Short replies (hi/ok)   → Fast/Cheap (always)
Build new feature       → Mid-tier
Review PR               → Mid-tier
Architecture            → Powerful
Stuck after 2 tries     → Escalate up one tier
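One way to encode the delegation table is a plain lookup with one-tier escalation. The task labels and function names here are made up for illustration; only the tier assignments come from the table.

```python
TIERS = ["fast-cheap", "mid-tier", "powerful"]

TASK_TIER = {
    "fix-tests":    "fast-cheap",
    "boilerplate":  "fast-cheap",
    "research":     "fast-cheap",
    "cron":         "fast-cheap",
    "short-reply":  "fast-cheap",
    "feature":      "mid-tier",
    "pr-review":    "mid-tier",
    "architecture": "powerful",
}

PINNED = {"cron", "short-reply"}  # the "(always)" rows never escalate

def tier_for(task: str, failed_attempts: int = 0) -> str:
    """Default tier for a task, escalating one tier per two failed attempts."""
    base = TASK_TIER.get(task, "mid-tier")  # unknown tasks default to mid-tier
    if task in PINNED:
        return base
    idx = TIERS.index(base) + failed_attempts // 2
    return TIERS[min(idx, len(TIERS) - 1)]
```

For example, `tier_for("fix-tests", failed_attempts=2)` escalates from Fast/Cheap to Mid-tier, while cron tasks stay Fast/Cheap no matter what.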

Heartbeat Interval

Set heartbeat to 55 minutes (not 30) when using Anthropic API keys. This keeps the prompt cache warm just under the 1-hour TTL — every heartbeat pays cheap cache-read rates instead of re-writing the full cache.

```json
"heartbeat": { "every": "55m" }
```
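Rough arithmetic for why the interval matters. The 0.1x read and 2x write multipliers are assumptions based on Anthropic's published prompt-caching prices at the time of writing (relative to the uncached input-token rate); verify current rates for your provider before relying on them.

```python
CACHE_READ = 0.1    # assumed multiplier for a cache hit
CACHE_WRITE = 2.0   # assumed multiplier for a 1-hour cache write

def heartbeat_cost(interval_min: int, ttl_min: int) -> float:
    """Relative token cost of one heartbeat: a cheap cache read if the
    cache is still warm, a full re-write if it has expired."""
    return CACHE_READ if interval_min <= ttl_min else CACHE_WRITE

# 55-minute heartbeat under a 1-hour TTL: cheap cache read.
# 30-minute heartbeat under the 5-minute default TTL: full re-write each time.
```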

Communication Pattern Rule

Short conversational messages (hi, thanks, ok, sure, got it, yes, no) should always use Fast/Cheap models. Never burn Sonnet or Powerful on one-word acknowledgments.

Cache Optimization

Prompt caching can cut costs by 80-90% on repeated context. See references/cache-optimization.md for patterns.
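A back-of-envelope for the 80-90% figure. The token count and 0.1x hit multiplier are illustrative assumptions, and the cache-write surcharge on the first turn is ignored for simplicity.

```python
def repeated_context_cost(context_tokens: int, turns: int,
                          hit_multiplier: float = 0.1) -> tuple[float, float]:
    """Total input cost of re-sending a fixed context every turn,
    uncached vs. cached (first turn pays full price, later turns read)."""
    uncached = float(context_tokens * turns)
    cached = context_tokens + context_tokens * hit_multiplier * (turns - 1)
    return uncached, cached

uncached, cached = repeated_context_cost(10_000, turns=20)
savings = 1 - cached / uncached  # 0.855 with these assumptions, i.e. ~85%
```

Longer conversations push the savings closer to the 90% end, since the one full-price turn is amortized over more cache reads.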

Tracking

Monitor spend by checking your provider's usage dashboard regularly. Signs you're over-spending:

  • Running Powerful models on tasks Fast/Cheap can handle
  • No caching on repeated system prompts
  • Spawning sub-agents without a model tier strategy
  • Heartbeat set to 30min (re-writes cache every time)
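The warning signs above can be turned into a quick self-audit. The function and field names are made up for illustration; the checks themselves mirror the list.

```python
def overspend_warnings(*, powerful_on_simple_tasks: bool,
                       caching_enabled: bool,
                       has_tier_policy: bool,
                       heartbeat_min: int,
                       cache_ttl_min: int) -> list[str]:
    """Return the over-spending signs that apply to a given setup."""
    warnings = []
    if powerful_on_simple_tasks:
        warnings.append("Powerful models on tasks Fast/Cheap can handle")
    if not caching_enabled:
        warnings.append("No caching on repeated system prompts")
    if not has_tier_policy:
        warnings.append("Sub-agents spawned without a model tier strategy")
    if heartbeat_min > cache_ttl_min:
        warnings.append("Heartbeat exceeds cache TTL (re-writes cache every time)")
    return warnings
```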

Files

2 total
