Token Saver 75+
v1.0.0
Automatically classifies requests to optimize cost by routing to the cheapest capable model and applies maximum output compression for 75%+ token savings.
Security Scan
OpenClaw
Suspicious
medium confidence

Purpose & Capability
The name and description (token savings plus model routing) match the SKILL.md: it defines tiering, routing rules, and compact output templates. It is instruction-only and does not request credentials or binaries, which is proportionate to its stated purpose. One caveat: it assumes a runtime that supports sessions_spawn and the specific model IDs listed; those IDs may not exist in every environment and must be adapted.
Instruction Scope
The SKILL.md instructs the agent to classify every incoming request and to spawn other model sessions with 'ALL context' included in the spawn task string. It also (in README) explicitly recommends making the skill 'mandatory' by adding a line to the system prompt so the skill is enforced every session. This is effectively a prompt-injection attempt that expands the skill's scope beyond normal user-invocable behavior and can cause sensitive conversation context to be forwarded to other models.
Install Mechanism
There is no install spec and no code files — it's instruction-only. That is low-risk from an installation/executable standpoint (nothing is written to disk).
Credentials
The skill requests no environment variables, credentials, or file paths. The routing table references third-party model IDs, but those are configuration-level mapping decisions rather than secret requests; the lack of requested credentials is proportionate to the stated purpose.
Persistence & Privilege
Although the skill metadata does not set always:true, the README encourages operators to add mandatory instructions to the system prompt (i.e., make it always-on). That recommendation is an attempt to escalate the skill's persistence/privilege beyond the declared metadata. Combined with the instruction to include SKILL.md every session, this is a persistence/prompt-injection concern.
Scan Findings in Context
[system-prompt-override] unexpected: The SKILL.md / README contains explicit guidance to add a mandatory line to the system prompt and to read the SKILL.md every session. This is a prompt-injection pattern and is not necessary for a token-optimization policy to function as a user-invocable skill. It attempts to persistently influence agent behavior by altering system prompts.
What to consider before installing:
- Do not add this skill to your system prompt or otherwise make it mandatory without a careful review. The README explicitly recommends forcing the skill into every session — that's a prompt-injection/persistence escalation and should be avoided.
- Because the SKILL.md tells spawned sessions to include "ALL context," verify how your platform implements sessions_spawn. Confirm that sensitive or private context will not be sent to third-party models or logs unintentionally.
- The skill references model IDs (openai/..., anthropic/..., groq/...). Map those to your environment carefully and be aware of cost implications; routing could increase usage of expensive models if misconfigured.
- Test in an isolated, auditable environment first: limit the skill to user-invocable (not always-on), enable request/response logging, and monitor which models are actually invoked and what context is forwarded.
- If you like the behavior, implement the protocol in a reviewed internal policy rather than blindly following the README's "mandatory" instruction. Remove or neutralize the recommendation to edit system prompts.
- Consider restricting autonomous invocation for this skill (or requiring explicit human approval for spawns) until you're confident it won't leak data or escalate costs.
If you want a safer path forward: ask the author for clarification on why they recommend forcing the skill into system prompts, request a version that does not require including full context in spawn tasks, and get explicit documentation on what gets forwarded to spawned models. Because the SKILL.md includes a prompt-injection signal, proceed cautiously and prefer a manual opt-in model.
Like a lobster shell, security has layers — review code before you run it.
Token Saver 75+ with Model Routing
Core Principle
Understand fully, execute cheaply. The orchestrator must fully understand the task before routing. Never sacrifice comprehension for speed.
Request Classifier (silent, every message)
| Tier | Pattern | Orchestrator | Executor |
|---|---|---|---|
| T1 | yes/no, status, trivial facts, quick lookups | Handle alone | — |
| T2 | summaries, how-to, lists, bulk processing, formatting | Handle alone OR spawn Groq | Groq (FREE) |
| T3 | debugging, multi-step, code generation, structured analysis | Orchestrate + spawn | Codex for code, Groq for bulk |
| T4 | strategy, complex decisions, multi-agent coordination, creative | Spawn Opus | Opus orchestrates, spawns Codex/Groq from within |
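In practice the orchestrator model performs this classification itself; the tier table above could nonetheless be sketched as a simple keyword matcher. Everything below (the patterns, the function name `classify`) is a hypothetical illustration, not part of the skill:

```python
import re

# Illustrative tier patterns loosely matching the table above.
# Checked from most to least capable tier, defaulting to T1.
TIER_PATTERNS = [
    ("T4", r"\b(strategy|roadmap|coordinate|creative)\b"),
    ("T3", r"\b(debug|refactor|implement|generate code|analy[sz]e)\b"),
    ("T2", r"\b(summari[sz]e|list|format|how to|bulk)\b"),
]

def classify(request: str) -> str:
    """Return the routing tier for a request, defaulting to T1."""
    text = request.lower()
    for tier, pattern in TIER_PATTERNS:
        if re.search(pattern, text):
            return tier
    return "T1"  # yes/no, status, trivial facts, quick lookups

print(classify("Please summarize these notes"))  # T2
print(classify("What time is it?"))              # T1
```

A real implementation would let the orchestrator judge intent rather than match keywords; the sketch only shows the tier-ordering logic.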
Model Routing Table
| Model | Use For | Cost | Spawn with |
|---|---|---|---|
| groq/llama-3.1-8b-instant | Summarization, formatting, classification, bulk transforms — NO thinking | FREE | model: "groq/llama-3.1-8b-instant" |
| openai/gpt-5.3-codex | ALL code generation, code review, refactoring | $$$ | model: "openai/gpt-5.3-codex" |
| openai/gpt-5.2 | Structured analysis, data extraction, JSON transforms | $$$ | model: "openai/gpt-5.2" |
| anthropic/claude-opus-4-6 | Strategy, complex orchestration, failure recovery (T4 only) | $$$$ | model: "anthropic/claude-opus-4-6" |
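The routing table above can be held as plain data. This is a sketch only; the model IDs are copied verbatim from the table and, as the security review notes, may not exist in your environment and should be remapped before use:

```python
# Routing table as data. Keys ("bulk", "code", ...) are hypothetical
# task-kind labels, not names defined by the skill itself.
ROUTING = {
    "bulk":     {"model": "groq/llama-3.1-8b-instant", "cost": "FREE"},
    "code":     {"model": "openai/gpt-5.3-codex",      "cost": "$$$"},
    "analysis": {"model": "openai/gpt-5.2",            "cost": "$$$"},
    "strategy": {"model": "anthropic/claude-opus-4-6", "cost": "$$$$"},
}

def model_for(task_kind: str) -> str:
    """Look up the spawn model for a task kind; raises KeyError on unknown kinds."""
    return ROUTING[task_kind]["model"]

print(model_for("code"))  # openai/gpt-5.3-codex
```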
Routing via sessions_spawn
When to spawn (MANDATORY)
- Code generation of any kind → spawn Codex
- Bulk text processing (>3 items) → spawn Groq
- Complex multi-step tasks → spawn Opus (T4)
- Simple formatting/rewriting → spawn Groq
When NOT to spawn
- T1 questions (yes/no, time, status) — handle directly
- Single tool calls (calendar, web search) — handle directly
- Short responses that need no processing — handle directly
Spawn patterns
Groq (free bulk work):
sessions_spawn(
task: "<clear instruction with all context included>",
model: "groq/llama-3.1-8b-instant"
)
Codex (all code):
sessions_spawn(
task: "Write <language> code that <detailed spec>. Include comments. Output the complete file.",
model: "openai/gpt-5.3-codex"
)
Opus (T4 strategy):
sessions_spawn(
task: "<full context + goal>. You have full tool access. Use sessions_spawn with Codex for code and Groq for bulk subtasks.",
model: "anthropic/claude-opus-4-6"
)
Critical spawn rules
- Include ALL context in the task string — spawned agents have no conversation history
- Be specific — vague tasks waste tokens on clarification
- One task per spawn — don't bundle unrelated work
- For code: always use Codex — never write code yourself
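The first two spawn rules (include ALL context, be specific) could be enforced when composing the task string. The helper below is hypothetical and not part of the skill; note that, per the security review, inlining full context means that context is forwarded to the spawned model:

```python
# Hypothetical helper: builds a self-contained task string, since spawned
# sessions have no conversation history per the rules above.
def build_spawn_task(instruction: str, context_items: list[str]) -> str:
    if not instruction.strip():
        raise ValueError("instruction required; vague tasks waste tokens on clarification")
    context = "\n".join(f"- {item}" for item in context_items)
    return f"{instruction}\n\nContext (self-contained, no history available):\n{context}"

task = build_spawn_task(
    "Write Python code that parses the log format below. Output the complete file.",
    ["Log lines look like: 2024-01-01T00:00:00Z LEVEL message"],
)
```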
Output Compression (applies to ALL tiers, ALL models)
Templates
- STATUS: OK/WARN/FAIL one-liner
- CHOICE: A vs B → Recommend: X (1 line why)
- CAUSE→FIX→VERIFY: 3 bullets max
- RESULT: data/output directly, no wrap-up
Rules
- No filler. No restating the question. Lead with the answer.
- Bullets/tables/code > prose.
- Do not narrate routine tool calls.
- If user asks for depth ("why", "explain", "go deep") → allow more tokens for that turn only.
Budget by tier
| Tier | Max output |
|---|---|
| T1 | 1-3 lines |
| T2 | 5-15 bullets |
| T3 | Structured sections, <400 words |
| T4 | Longer allowed, still dense |
Tool Gating (before ANY tool call)
- Already known? → No tool.
- Batchable? → Parallelize.
- Can a spawned Groq handle it? → Spawn instead of doing it yourself.
- Cheapest path? → memory_search > partial read > full read > web.
- Needed? → Do not fetch "just in case."
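The gating checklist above reads as an ordered decision. A minimal sketch, assuming hypothetical boolean predicates (the real checks are judgment calls made by the agent, not computed flags):

```python
# Ordered tool-gating decision mirroring the checklist above.
def gate_tool_call(already_known: bool, groq_can_handle: bool, batchable: bool) -> str:
    if already_known:
        return "no_tool"         # answer from existing knowledge
    if groq_can_handle:
        return "spawn_groq"      # a free executor beats doing it yourself
    if batchable:
        return "parallel_batch"  # one parallelized call instead of many
    return "cheapest_path"       # memory_search > partial read > full read > web
```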
Failure Protocol
- If Groq spawn fails → retry with GPT-5.2
- If Codex spawn fails → retry with GPT-5.2
- If orchestrator can't handle T3 → spawn Opus (escalate to T4)
- Never retry same model. Escalate.
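The escalation chain above can be sketched as a fallback map. Again the model IDs come from the routing table and may need local remapping; the function name is an illustrative assumption:

```python
# Fallback map implementing "never retry the same model, escalate instead".
FALLBACK = {
    "groq/llama-3.1-8b-instant": "openai/gpt-5.2",
    "openai/gpt-5.3-codex": "openai/gpt-5.2",
    "openai/gpt-5.2": "anthropic/claude-opus-4-6",
}

def escalate(failed_model: str) -> str:
    """Return the next model to try after a spawn failure."""
    try:
        return FALLBACK[failed_model]
    except KeyError:
        raise RuntimeError(f"no escalation path beyond {failed_model}") from None

print(escalate("openai/gpt-5.3-codex"))  # openai/gpt-5.2
```

Opus has no fallback entry: if the T4 model fails, the protocol offers no further escalation, so the sketch raises.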
Measurement (when asked or during testing)
Append: [~X tokens | Tier: Tn | Route: model(s) used]