Token Saver Skill

Smart token cost optimization for OpenClaw. Automatically reduces AI token consumption by 50-80% through intelligent context compression, semantic caching, and adaptive optimization while maintaining response quality.

Audits

Pass

ClawScanReview

Agentic behavior and permission review.

Static analysisPass

Pattern checks against bundled files.

VirusTotalPass

Multi-engine malware detections and file reputation.

Install

openclaw skills install token-saver-skill

TokenSaver

A token cost optimization skill that helps you save 50-80% on AI token usage without sacrificing response quality.

When to Use

Use TokenSaver when:

You have long conversations that consume many tokens
You want to reduce AI API costs
You're working with technical discussions that accumulate context
You notice token usage growing rapidly in long sessions

Core Capabilities

1. Smart Context Compression

Automatically compresses conversation history based on message importance.

How it works:

Recent messages (last 3-5) kept fully intact
Older messages summarized based on importance score
Code blocks and critical decisions never compressed

Savings: 50-70% reduction in context tokens

2. Semantic Cache

Caches responses to similar queries to avoid reprocessing.

How it works:

L1: Exact query match → 100% savings
L2: Semantic similarity > 85% → 80% savings
L3: Pattern match → 50% savings

3. Adaptive Optimization

Automatically adjusts compression based on token pressure.

Stages:

< 3K tokens: No compression
3-6K tokens: Light compression
6-10K tokens: Medium compression
10K tokens: Heavy compression + suggest new chat

Natural Language Commands

When user asks about TokenSaver in natural language, interpret and execute:

Settings & Configuration

User says: "Configure TokenSaver" / "TokenSaver settings" / "Setup TokenSaver" Action: Show current configuration and available options

Current TokenSaver Settings:
- Mode: Adaptive (auto-adjust based on token pressure)
- Compression: Balanced
- Cache: Enabled
- Quality Threshold: 85%

You can change mode:
- /tokensave - Aggressive mode (max savings)
- /tokenbalance - Balanced mode (default)
- /tokenquality - Quality priority (min compression)

User says: "Use aggressive mode" / "Maximize savings" / "Set to save mode" Action: Execute /tokensave command Response: "✅ TokenSaver switched to aggressive save mode. This provides maximum token savings (up to 80%) with slight quality trade-off."

User says: "Use balanced mode" / "Default settings" / "Set to balanced" Action: Execute /tokenbalance command Response: "✅ TokenSaver switched to balanced mode. Good savings (50-70%) with quality preserved."

User says: "Prioritize quality" / "Keep full context" / "Set to quality mode" Action: Execute /tokenquality command Response: "✅ TokenSaver switched to quality priority mode. Minimal compression, maximum context retention for precision-critical tasks."

User says: "Disable TokenSaver" / "Turn off token saving" / "Stop TokenSaver" Action: Execute /tokenoff command Response: "⏸️ TokenSaver temporarily disabled. Re-enable anytime with /tokens or by asking me to turn it back on."

Usage Reports & Analytics

User says: "Show me token usage report" / "TokenSaver statistics" / "Token report" Action: Execute /tokenreport command Response format:

📈 TokenSaver Usage Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
This Session:
• Total Original Tokens: 12,450
• Total Optimized Tokens: 4,358
• Tokens Saved: 8,092 (65%)
• Estimated Cost Saved: $0.016
• Cache Hits: 5
• Compressions Applied: 12

Top Savings Sources:
1. Context Compression: -6,200 tokens
2. Semantic Cache: -1,500 tokens
3. Response Optimization: -392 tokens

User says: "TokenSaver status" / "How much am I saving?" / "Token status" Action: Execute /tokens command Response format:

💰 TokenSaver Status
━━━━━━━━━━━━━━━━━━━━
Current Session:
• Savings: 65%
• Original Tokens: 12,450
• Optimized Tokens: 4,358
• Cache Hits: 5

Current Mode: Balanced (Adaptive)
• Stage 3 compression active
• Quality Score: 95%

User says: "Clear token cache" / "Reset cache" / "Clear TokenSaver cache" Action: Execute /tokencache clear command Response: "✅ TokenSaver cache cleared. Fresh responses will be generated for subsequent queries."

Explanation & Help

User says: "What is TokenSaver" / "How does token saving work?" / "Explain TokenSaver" Response:

TokenSaver is an intelligent token cost optimization skill that helps you save 50-80% on AI token usage.

Key Features:
1. Smart Context Compression - Summarizes old conversation while keeping recent messages intact
2. Semantic Cache - Reuses responses for similar questions
3. Adaptive Optimization - Automatically adjusts based on conversation length
4. Quality Guard - Never compresses code, errors, or important messages

Benefits:
• Save 50-80% on token costs
• Faster responses with caching
• No quality loss for critical content
• Automatic - works in background

User says: "TokenSaver not working" / "Why no savings?" / "TokenSaver not saving" Response:

TokenSaver may not show savings if:
1. Conversation is too short (< 3000 tokens) - compression starts at 3K
2. Currently in quality priority mode - switch with /tokensave
3. First-time queries - cache builds up over time
4. Code-heavy conversation - code blocks are never compressed

Try starting a longer technical discussion to see significant savings!

Best Practices

User says: "TokenSaver best practices" / "How to use TokenSaver effectively?" / "TokenSaver tips" Response:

TokenSaver Best Practices:

1. For Debugging/Code Review:
   Use /tokenquality mode to keep full context

2. For Research/Exploration:
   Use /tokensave mode for maximum savings (up to 80%)

3. For General Chat:
   Balanced mode (default) works best

4. When to Start New Chat:
   If switching major topics, start fresh for better compression

5. Monitor with:
   /tokens - Quick status check
   /tokenreport - Detailed analytics

Slash Commands

For direct command access:

/tokens

Show current status and statistics

Session Savings: 65%
Original Tokens: 12,450
Optimized Tokens: 4,358
Cache Hits: 3

/tokensave

Enable aggressive save mode

Maximum compression
Best for very long technical discussions
Slight quality trade-off possible

/tokenbalance

Balanced mode (default)

Good savings with quality preserved
Recommended for most use cases

/tokenquality

Quality priority mode

Minimal compression
Maximum context retention
Use when precision is critical

/tokenreport

Generate detailed usage report

Total Tokens Saved: 8,092
Estimated Cost Saved: $0.016
Compressions Applied: 12
Cache Hits: 5

/tokencache clear

Clear all cached responses

/tokenoff

Temporarily disable optimization

Usage Examples

Example 1: Long coding session

User: [20 rounds of Python discussion]
TokenSaver: Optimized 15K → 4.5K tokens (70% saved)

Example 2: Repeated questions

User: "How do I write to a file in Python?"
User: "Python file write method?"
TokenSaver: L2 cache hit - instant response, 0 tokens used

Example 3: Topic switching

User: Switching from discussing Python to JavaScript...
TokenSaver: "Detected topic change. Start new chat to keep context clean?"
[Yes] [No]

Safety Features

TokenSaver never compresses:

Code blocks (always kept intact)
Error messages and stack traces
User-marked important messages
Messages with high cross-references

Quality Guard:

Auto-rollback if quality drops > 15%
One-click restore to uncompressed version
Snapshots for every compression

Configuration

Default configuration:

{
  "mode": "adaptive",
  "compression": "balanced",
  "cache": true,
  "qualityThreshold": 0.85
}

Expected Results

Conversation Type	Tokens Saved	Quality Impact
Technical discussion (50 rounds)	70%	Minimal
Code review	80%	None
Casual chat	75%	None
Quick Q&A	30-50%	None

Limitations

Requires conversation to exceed 3K tokens before compression starts
First-time queries cannot be cached
Very short conversations (< 10 messages) see minimal benefit
Code-heavy conversations benefit most from smart referencing

Related Skills

shieldclaw: For security scanning
browser_visible: For web browsing
file_reader: For reading local files