Token Optimization

v1.1.0

Reduce OpenClaw per-turn prompt costs by 70%+ through file splitting, prompt caching, context pruning, and model routing. Tested on production setup with 69...

by @jack-yang-ai

Security Scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

ℹ

Purpose & Capability

The skill's purpose (workspace splitting, prompt caching, pruning, model routing) matches the steps in SKILL.md. However, the registry metadata declares no required binaries/configs while the SKILL.md explicitly requires OpenClaw 2026.3.x, the ability to edit ~/.openclaw/workspace files and openclaw.json, and uses CLI commands (e.g., session_status, openclaw gateway restart, wc, mv, cat). That mismatch is operational (the guide assumes an OpenClaw runtime) but not malicious.

Instruction Scope

Instructions tell the operator to read, move, and delete workspace files (~/.openclaw/workspace/*), edit openclaw.json, and restart the gateway. These are expected for this purpose but carry operational risk: deleting/moving BOOTSTRAP.md or other files can change runtime behavior; adding caching/pruning rules that allow 'exec' and 'read' can cause command outputs and file contents to be cached and re-sent to models (potential data exposure). The guide also recommends heartbeat intervals to keep cache warm (operational cost/behavior implications).

✓

Install Mechanism

No install spec or bundled code — instruction-only. This minimizes supply-chain risk because nothing is downloaded or executed by the skill itself.

✓

Credentials

The skill requests no environment variables or external credentials. It references Anthropic/OpenRouter providers but assumes the host already has provider creds configured; it does not ask for new secrets, which is proportional for its stated task.

ℹ

Persistence & Privilege

The skill is not always-enabled and is user-invocable only. It directs changes to local agent config and workspace files (openclaw.json, workspace/*.md), which is normal for an optimization guide but means it will persist changes across runs if you apply them — make backups and test in staging.

Assessment

This is a coherent, instruction-only optimization guide, but take precautions before applying it:

1) Back up your ~/.openclaw/workspace and openclaw.json (don't rely on the mv .bak as your only copy).
2) Test changes in a staging or non-production agent first: splitting or removing BOOTSTRAP.md or shrinking AGENTS.md can unintentionally change behavior.
3) Be cautious about caching command outputs or file contents: allowing 'exec' and 'read' to be cached may store sensitive data in prompts that could later be sent to models or logs.
4) Understand the heartbeat recommendations: they keep caches warm but can increase calls and costs.
5) Ensure you have the OpenClaw CLI/tools available (session_status, openclaw gateway restart); the registry metadata doesn't list these, but SKILL.md assumes them.

If you want further review, provide your current openclaw.json and a list of workspace files (or test changes on a copy) so I can point out exact edits and potential impacts.
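The backup precaution above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill: the function name `backup_openclaw` is hypothetical, and it assumes that `openclaw.json` sits directly inside the OpenClaw home directory next to `workspace/`, which may not match your installation.

```python
import shutil
import time
from pathlib import Path


def backup_openclaw(home: Path, dest_root: Path) -> Path:
    """Copy workspace/ and openclaw.json into a new timestamped folder."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_root / f"openclaw-backup-{stamp}"
    dest.mkdir(parents=True, exist_ok=False)  # never overwrite an older backup

    workspace = home / "workspace"
    if workspace.is_dir():
        shutil.copytree(workspace, dest / "workspace")

    config = home / "openclaw.json"  # assumed location, adjust if yours differs
    if config.is_file():
        shutil.copy2(config, dest / "openclaw.json")
    return dest
```

A typical call would be `backup_openclaw(Path.home() / ".openclaw", Path.home() / "openclaw-backups")`, run once before any of the edits described in the guide.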

547 downloads

0 stars

2 versions

Updated 1mo ago

MIT-0

Token Optimization for OpenClaw

Systematic guide to reduce per-turn token consumption by 70%+ without losing any functionality.

When to Use This Skill

session_status shows Context > 30% on simple messages
Cache hit rate is 0% or consistently low
AGENTS.md > 5KB or MEMORY.md > 3KB
You want to cut API costs on Anthropic models

Prerequisites

OpenClaw 2026.3.x or later
Access to edit openclaw.json
At least one Anthropic model configured

Step 1: Audit Current State

Run session_status and record:

```text
Cache: X% hit · Y cached, Z new
Context: Xk/200k (X%)
```
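To record these two values without copying them by hand, a small parser along the following lines may help. It is a sketch that assumes the output matches the two line shapes shown above exactly; the function name `parse_session_status` and the returned key names are hypothetical, and real `session_status` output may contain other fields or a different layout.

```python
import re

# Patterns follow the two sample lines: "Cache: X% hit ..." and "Context: Xk/200k (X%)".
CACHE_RE = re.compile(r"Cache:\s*(\d+(?:\.\d+)?)%\s*hit")
CONTEXT_RE = re.compile(
    r"Context:\s*(\d+(?:\.\d+)?)k/(\d+(?:\.\d+)?)k\s*\((\d+(?:\.\d+)?)%\)"
)


def parse_session_status(text: str) -> dict:
    """Extract the cache hit rate and context usage; missing lines are skipped."""
    result = {}
    cache = CACHE_RE.search(text)
    if cache:
        result["cache_hit_pct"] = float(cache.group(1))
    context = CONTEXT_RE.search(text)
    if context:
        result["context_used_k"] = float(context.group(1))
        result["context_limit_k"] = float(context.group(2))
        result["context_pct"] = float(context.group(3))
    return result
```

Saving the parsed values before and after the changes gives a simple baseline for the comparison in Step 6.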

Then check file sizes:

```bash
wc -c ~/.openclaw/workspace/*.md
```

Red flags:

Any single file > 10KB → needs splitting
Total workspace files > 30KB → bloated
Cache 0% → caching not enabled
Context > 40% on simple message → pruning too loose
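The two file-size red flags can also be checked programmatically. The sketch below is illustrative only: `audit_workspace` is a hypothetical helper, and it assumes 1KB means 1024 bytes, which the guide does not specify.

```python
from pathlib import Path

FILE_LIMIT = 10 * 1024   # "Any single file > 10KB"
TOTAL_LIMIT = 30 * 1024  # "Total workspace files > 30KB"


def audit_workspace(workspace: Path) -> list[str]:
    """Return one warning per oversized *.md file, plus one for the total."""
    warnings = []
    total = 0
    for path in sorted(workspace.glob("*.md")):
        size = path.stat().st_size
        total += size
        if size > FILE_LIMIT:
            warnings.append(f"{path.name}: {size} bytes, needs splitting")
    if total > TOTAL_LIMIT:
        warnings.append(f"total: {total} bytes, workspace is bloated")
    return warnings
```

An empty list means neither size threshold is exceeded; the cache and context red flags still need to be read from `session_status`.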

Step 2: Split Large Files (Layer 1)

AGENTS.md (biggest offender)

Move infrequently-needed content to separate files:

| Content | Move To | Load When |
| --- | --- | --- |
| Subagent protocols | `AGENTS_SUBAGENT.md` | Only when spawning |
| Heartbeat rules | `AGENTS_HEARTBEAT.md` | Only during heartbeat |
| Detailed examples | `memory/` directory | On demand via `read` |

Target: AGENTS.md ≤ 5KB

Keep only: session rules, safety, formatting, quick-reference subagent table.

Add references at the top:

> Subagent protocol → `AGENTS_SUBAGENT.md` (read on demand)
> Heartbeat protocol → `AGENTS_HEARTBEAT.md` (read during heartbeat)
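If AGENTS.md is organized under level-2 headings, the split can be partly automated. The following is a rough sketch under that assumption: `split_sections` and its keyword routing are hypothetical, and the result should be reviewed by hand before any file is overwritten.

```python
import re


def split_sections(markdown: str, routes: dict[str, str]) -> dict[str, str]:
    """Route "## " sections to files by keyword; everything else stays in AGENTS.md.

    `routes` maps a lowercase keyword to a target file name.
    """
    # Split just before every line that starts with "## ", keeping the heading.
    parts = re.split(r"(?m)^(?=## )", markdown)
    files: dict[str, list[str]] = {"AGENTS.md": []}
    for part in parts:
        if not part:
            continue
        heading = part.splitlines()[0].lower() if part.startswith("## ") else ""
        target = "AGENTS.md"
        for keyword, filename in routes.items():
            if keyword in heading:
                target = filename
                break
        files.setdefault(target, []).append(part)
    return {name: "".join(chunks) for name, chunks in files.items()}
```

Calling it with `{"subagent": "AGENTS_SUBAGENT.md", "heartbeat": "AGENTS_HEARTBEAT.md"}` returns the proposed contents of each file as strings, so nothing is written to disk until you decide to do so.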

MEMORY.md

Move detailed SOPs and procedures to files in the memory/ subdirectory. Keep only frequently referenced items in MEMORY.md.

Target: MEMORY.md ≤ 3KB

BOOTSTRAP.md

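Remove it from the workspace after initial setup. It is loaded on every turn but adds nothing once setup is complete. The command below renames the file rather than deleting it, so it can be restored if needed:

```bash
mv ~/.openclaw/workspace/BOOTSTRAP.md ~/.openclaw/workspace/BOOTSTRAP.md.bak
```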

Verify

```bash
# Sum only files that load every turn
cat ~/.openclaw/workspace/{AGENTS,SOUL,TOOLS,IDENTITY,USER,HEARTBEAT,MEMORY}.md | wc -c
# Target: < 15KB total
```

Step 3: Enable Prompt Caching (Layer 2)

Add cacheRetention to each Anthropic model in openclaw.json:

```json
{
  "agents": {
    "defaults": {
      "models": {
        "anthropic/claude-opus-4-6": {
          "params": { "cacheRetention": "long" }
        },
        "anthropic/claude-sonnet-4-6": {
          "params": { "cacheRetention": "long" }
        },
        "openrouter/anthropic/claude-3.5-sonnet": {
          "params": { "cacheRetention": "short" }
        }
      }
    }
  }
}
```
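When many models are configured, the same edit can be applied with a short script instead of by hand. This sketch assumes the `agents.defaults.models` layout shown above and picks the value from the model-name prefix; `apply_cache_retention` is a hypothetical helper, not an OpenClaw API.

```python
import copy


def apply_cache_retention(config: dict) -> dict:
    """Return a copy of the config with cacheRetention set on Anthropic models."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    models = (
        updated.setdefault("agents", {})
        .setdefault("defaults", {})
        .setdefault("models", {})
    )
    for name, entry in models.items():
        if name.startswith("anthropic/"):
            retention = "long"
        elif name.startswith("openrouter/anthropic/"):
            retention = "short"
        else:
            continue  # other providers are left unchanged
        entry.setdefault("params", {})["cacheRetention"] = retention
    return updated
```

In practice you would load `openclaw.json` with `json.load`, pass the result through the function, and write it back with `json.dump` after taking a backup.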

Values

| Value | Cache Window | Best For |
| --- | --- | --- |
| `none` | No caching | Bursty/notification agents |
| `short` | ~5 minutes | OpenRouter models |
| `long` | ~1 hour | Main agent (recommended) |

Provider Support

| Provider | Support |
| --- | --- |
| Anthropic direct API | ✅ Full |
| OpenRouter `anthropic/*` | ✅ Auto cache_control injection |
| Bedrock Anthropic Claude | ✅ Pass-through |
| Other providers | ❌ No effect |
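To see why the cache hit rate matters for cost, a back-of-the-envelope model can help. The multipliers below are assumptions based on Anthropic's published prompt-caching pricing at the time of writing (cache reads at 0.1x the base input price, cache writes at 1.25x for the 5-minute window and 2x for the 1-hour window); check current pricing before relying on them. The model also simplifies by treating every non-cached input token as a cache write.

```python
READ_MULT = 0.10      # assumed: cached tokens read at 10% of base input price
WRITE_MULT_5M = 1.25  # assumed: cache write, ~5 minute window ("short")
WRITE_MULT_1H = 2.00  # assumed: cache write, ~1 hour window ("long")


def relative_input_cost(hit_rate: float, write_mult: float) -> float:
    """Input cost relative to no caching (1.0) for a given cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * READ_MULT + (1.0 - hit_rate) * write_mult
```

Under these assumptions, an 80% hit rate with the 1-hour window comes out at roughly 0.48 of the uncached input cost, while a 0% hit rate comes out at 2.0, which is why a cache that never hits can cost more than no caching at all.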

Keep-Warm Tip

Pair cacheRetention: "long" with a heartbeat every ~55 minutes, just under the ~1 hour cache window, so the cache stays warm between messages:

```json
"heartbeat": {
  "every": "55m",
  "model": "your/cheap-model"
}
```
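The keep-warm rule is simple arithmetic: the heartbeat interval has to be shorter than the cache window. The sketch below checks that and counts the resulting calls per day. The helper names are hypothetical, the windows are the approximate values from the Values table, and the duration parser only handles the `s`, `m`, and `h` suffixes, which may be narrower than what OpenClaw accepts.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}

# Approximate cache windows taken from the Values table.
CACHE_WINDOW_SECONDS = {"short": 5 * 60, "long": 60 * 60}


def parse_duration(value: str) -> int:
    """Convert a string such as "55m" or "4h" to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if not match:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def keeps_cache_warm(every: str, retention: str) -> bool:
    """True when the heartbeat fires before the cache window expires."""
    return parse_duration(every) < CACHE_WINDOW_SECONDS[retention]


def heartbeats_per_day(every: str) -> float:
    """Number of heartbeat calls in 24 hours at the given interval."""
    return 86400 / parse_duration(every)
```

With these definitions, `55m` keeps a `long` cache warm at the price of about 26 heartbeat calls per day, whereas a `4h` interval makes only 6 calls per day but lets the cache expire between them.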

Step 4: Tune Context Pruning (Layer 3)

Add a contextPruning block under agents.defaults in openclaw.json:

```json
{
  "agents": {
    "defaults": {
      "contextPruning": {
        "mode": "cache-ttl",
        "ttl": "3m",
        "keepLastAssistants": 2,
        "softTrimRatio": 0.25,
        "hardClearRatio": 0.45,
        "tools": {
          "allow": ["exec", "read", "browser"],
          "deny": ["web_search", "web_fetch"]
        }
      }
    }
  }
}
```

Parameter Guide

| Parameter | Aggressive | Moderate | Conservative |
| --- | --- | --- | --- |
| `ttl` | 2m | 3m | 5m |
| `keepLastAssistants` | 1 | 2 | 3 |
| `softTrimRatio` | 0.20 | 0.25 | 0.30 |
| `hardClearRatio` | 0.40 | 0.45 | 0.50 |
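The three columns of the table can be kept as named presets so that switching between them is a one-word change. This is a sketch, not OpenClaw's own validation: `build_pruning` and `validate_pruning` are hypothetical helpers, and the rule that `softTrimRatio` should stay below `hardClearRatio` is an assumption inferred from the table, where it holds in every column.

```python
PRESETS = {
    "aggressive": {"ttl": "2m", "keepLastAssistants": 1,
                   "softTrimRatio": 0.20, "hardClearRatio": 0.40},
    "moderate": {"ttl": "3m", "keepLastAssistants": 2,
                 "softTrimRatio": 0.25, "hardClearRatio": 0.45},
    "conservative": {"ttl": "5m", "keepLastAssistants": 3,
                     "softTrimRatio": 0.30, "hardClearRatio": 0.50},
}


def build_pruning(preset: str) -> dict:
    """Return a contextPruning block in cache-ttl mode for the named preset."""
    return {"mode": "cache-ttl", **PRESETS[preset]}


def validate_pruning(cfg: dict) -> list[str]:
    """Return a list of problems found in the two ratio settings."""
    problems = []
    soft = cfg.get("softTrimRatio")
    hard = cfg.get("hardClearRatio")
    for name, value in (("softTrimRatio", soft), ("hardClearRatio", hard)):
        if not isinstance(value, (int, float)) or not 0 < value < 1:
            problems.append(f"{name} should be a number between 0 and 1")
    if not problems and soft >= hard:
        # Assumed ordering: soft trimming should start before hard clearing.
        problems.append("softTrimRatio should be below hardClearRatio")
    return problems
```

Starting from `build_pruning("moderate")` and tightening one parameter at a time makes it easier to tell which change affected the context percentage.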

Tool Deny List

Move large, one-off tool outputs to deny:

web_fetch — page content is large and rarely reused
web_search — search results change every time

Keep frequently reused tools in allow:

exec — command outputs often referenced in follow-up
read — file contents may be discussed across turns
browser — snapshot data may be referenced

Step 5: Optimize Model Routing

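Use cheap or free models for low-value tasks. Note that a 4h heartbeat interval, as in the example below, is longer than the ~1 hour cache window, so it gives up the keep-warm effect from Step 3 in exchange for fewer calls; use 55m instead if a warm cache matters more:

```json
"heartbeat": {
  "every": "4h",
  "model": "your/free-flash-model"
}
```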

| Task | Model Tier | Why |
| --- | --- | --- |
| Heartbeat/cron | Free/flash | Simple checks, zero cost |
| Simple Q&A | Free/flash | Doesn't need intelligence |
| Medium tasks | Mid-tier | Balance cost and quality |
| Complex/multi-step | Premium | Worth the investment |
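The routing table amounts to a two-step lookup: task type to tier, then tier to model. A minimal sketch follows; the task labels, the `pick_model` helper, and the mid-tier and premium model names are hypothetical placeholders (only `your/free-flash-model` appears in the guide), and how a task gets classified in the first place is outside its scope.

```python
TIER_BY_TASK = {
    "heartbeat": "free",
    "cron": "free",
    "simple_qa": "free",
    "medium": "mid",
    "complex": "premium",
}

MODEL_BY_TIER = {
    "free": "your/free-flash-model",
    "mid": "your/mid-tier-model",    # hypothetical placeholder
    "premium": "your/premium-model",  # hypothetical placeholder
}


def pick_model(task: str) -> str:
    """Map a task label to a model name; unknown tasks fall back to mid-tier."""
    tier = TIER_BY_TASK.get(task, "mid")
    return MODEL_BY_TIER[tier]
```

Falling back to the mid tier for unknown task labels is a design choice that avoids both overpaying and sending unclassified work to the weakest model.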

Step 6: Validate & Monitor

After applying all changes, restart the gateway:

```bash
openclaw gateway restart
```

Then send a simple message, run session_status, and compare the output against the targets below.

Target KPIs

| Metric | Target | Check Via |
| --- | --- | --- |
| Cache Hit Rate | > 80% | `Cache: X% hit` |
| Simple Q&A Input | < 20k tokens | `Tokens: X in` |
| Context (idle) | < 30% | `Context: Xk/200k` |
| Compactions/day | < 2 | `Compactions: X` |
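For the weekly review, the four targets can be encoded once and checked against recorded numbers. The sketch below is illustrative: the metric key names and the `failed_kpis` helper are hypothetical, and the values are expected to be collected by hand or by your own tooling from `session_status`.

```python
# (kind, limit): "min" means the value must exceed the limit,
# "max" means it must stay below it, matching the > and < in the table.
TARGETS = {
    "cache_hit_pct": ("min", 80),
    "simple_input_tokens": ("max", 20_000),
    "context_idle_pct": ("max", 30),
    "compactions_per_day": ("max", 2),
}


def failed_kpis(metrics: dict) -> list[str]:
    """Return the names of the metrics that miss their target."""
    failures = []
    for name, (kind, limit) in TARGETS.items():
        if name not in metrics:
            continue  # metric not recorded, nothing to check
        value = metrics[name]
        ok = value > limit if kind == "min" else value < limit
        if not ok:
            failures.append(name)
    return failures
```

An empty result means every recorded metric meets its target; any returned name points to the matching row in the Troubleshooting table.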

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| Cache still 0% | Model doesn't support caching | Check that the model is an Anthropic model on a supported provider (see Provider Support) |
| High cacheWrite every turn | Volatile content in system prompt | Move volatile files to on-demand |
| Context > 50% quickly | Pruning too loose | Lower `ttl` and `softTrimRatio` |
| Compactions > 3/day | Long conversations without pruning | Enable `cache-ttl` mode |

Summary Checklist

Audit: wc -c on workspace files + session_status
Split: AGENTS.md ≤ 5KB, MEMORY.md ≤ 3KB
Remove: BOOTSTRAP.md from the workspace (if it exists)
Cache: cacheRetention: "long" on Anthropic models
Prune: contextPruning in cache-ttl mode, starting from the Moderate column and tightening as needed
Route: Cheap model for heartbeat/simple tasks
Validate: session_status shows cache hits + low context %
Monitor: Weekly review of KPIs
