Token Budget Guard

v2.0.0

Automatically manages and compresses context to optimize token usage by summarizing, selectively loading, and budgeting for tool schemas, history, and tasks.

by Erwin (@aptratcn)

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for aptratcn/token-budget-guard.

Install the skill "Token Budget Guard" (aptratcn/token-budget-guard) from ClawHub.
Skill page: https://clawhub.ai/aptratcn/token-budget-guard
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install token-budget-guard

ClawHub CLI

npx clawhub@latest install token-budget-guard

Security Scan

VirusTotal

Benign

OpenClaw

Suspicious

medium confidence

✓

Purpose & Capability

Name/description (manage/compress context, budget schemas/history) matches the SKILL.md content and the provided templates. The strategies, progressive disclosure, and examples are coherent with the stated purpose and do not request unrelated services or credentials.

ℹ

Instruction Scope

All runtime instructions are textual guidance (summarize, progressive disclosure, selective file reads) and example shell commands (grep, jq, head). The skill suggests copying SKILL.md into agent skill folders and gives examples of file operations; it does not explicitly instruct reading secrets or external endpoints. Because it advises running shell commands, a deployed agent could use those to access local files — this is expected for a file-selective strategy but increases the surface that should be reviewed.

✓

Install Mechanism

Instruction-only skill with no install spec, no downloads, and no code files — lowest install risk. The README suggests manual copy into agent skill directories (user-level paths), which is normal for instruction-only skills and does not include remote installers or archive extraction.

✓

Credentials

The skill requests no environment variables, no credentials, and no config paths. This aligns with its purely instructional nature; nothing asks for unrelated secrets or external tokens.

ℹ

Persistence & Privilege

always:false and disable-model-invocation:false (normal). The README encourages copying the SKILL.md into agent skill locations, which would persist behavior in the agent environment — this is reasonable for a skill but means you should avoid copying it into shared/global system locations until you trust it.

Scan Findings in Context

[system-prompt-override] unexpected: An automated pattern detector flagged 'system-prompt-override' in the SKILL.md. The skill legitimately discusses 'system prompt' as part of the budget allocation, so this may be a false positive. However, because a prompt-injection-like pattern was detected, it's worth manually checking for any explicit instructions that tell an agent to overwrite its system prompt, ignore safety instructions, or accept arbitrary external prompts.

What to consider before installing

This skill appears to be what it claims, a set of best-practice instructions to reduce token usage, but exercise caution before installing it into a live agent environment. Recommendations:

- Review the SKILL.md yourself for any lines that explicitly instruct an agent to overwrite its system prompt, ignore safety rules, or call out to unknown endpoints; the scanner found a possible 'system-prompt-override' pattern which may be a false positive but should be checked.
- Do not copy the file into global or shared agent skill directories until you confirm it only contains benign guidance; prefer installing into a sandboxed/test agent first.
- If you let an agent follow the examples (jq, grep, head), ensure the agent is not given blanket filesystem access to sensitive directories (e.g., /etc, ~/.ssh, cloud credential files). Limit the agent's file-read scope to project folders.
- Monitor the agent's behavior during initial runs: log commands the agent issues and any files it reads, and validate summaries before they are used for privileged actions.
- If you need higher assurance, ask the skill author for provenance or a signed release; absent that, keep confidence moderate and treat the skill as useful but requiring manual oversight.

SKILL.md:43

Prompt-injection style instruction pattern detected.

About static analysis

These patterns were detected by automated regex scanning. They may be normal for skills that integrate with external APIs. Check the VirusTotal and OpenClaw results above for context-aware analysis.

Like a lobster shell, security has layers — review code before you run it.

Tags: context-management, cost, latest, token-optimization

109 downloads

0 stars

2 versions

Updated 4d ago

MIT-0

Token Budget Guard

Stop burning context. Manage your agent's token budget intelligently.

The Problem

AI agents waste 40-60% of tokens on:

Repeatedly loading full schemas when summaries suffice
Including irrelevant context from previous turns
Waiting to compress until the context window is already full
Loading entire files when snippets would do

AAI Gateway benchmarks showed that schema tokens in multi-MCP workflows can be cut by 99%. This skill makes that kind of token budgeting automatic.

When to Use

When the user mentions "token budget", "reduce tokens", "context too long", or "running out of context"
Before multi-tool workflows
When hitting context limits
Optimizing agent workflows for cost efficiency

Core Principles

1. Progressive Disclosure

Level 0: Name only (1-5 tokens) — "browser tool available"
Level 1: Summary (10-30 tokens) — "browser: open/navigate/snapshot web pages"
Level 2: Schema (50-200 tokens) — full parameter descriptions
Level 3: Examples (200-500 tokens) — sample calls with output

Default: Level 1. Escalate to Level 2 or 3 only when the tool is actually being used.
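
The four levels can be sketched as one record per tool that renders itself at whatever level the caller asks for. This is a minimal illustration, not part of the skill: `ToolDoc` and `render_tools` are hypothetical names, and escalating active tools to Level 2 is one possible policy.

```python
from dataclasses import dataclass


@dataclass
class ToolDoc:
    """Descriptions of one tool, from cheapest to most detailed (hypothetical type)."""
    name: str      # Level 0 is derived from the name alone
    summary: str   # Level 1: one-line summary
    schema: str    # Level 2: full parameter descriptions
    examples: str  # Level 3: sample calls with output

    def render(self, level: int = 1) -> str:
        """Return the description for the requested level, clamped to 0-3."""
        level = max(0, min(level, 3))
        parts = [f"{self.name} tool available", self.summary, self.schema, self.examples]
        return parts[level]


def render_tools(tools: list[ToolDoc], active: set[str]) -> str:
    """Render every tool at Level 1, escalating to Level 2 only for tools in use."""
    lines = []
    for tool in tools:
        level = 2 if tool.name in active else 1
        lines.append(tool.render(level))
    return "\n".join(lines)
```

Calling `render_tools(tools, active=set())` during planning keeps every tool at its one-line summary; passing the name of the tool about to be called pulls in only that tool's schema.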

2. Summarize Before Including

Previous conversation: summarize, don't replay
File contents: extract relevant sections, don't cat entire files
Tool outputs: compress to decisions + evidence, drop raw data
Error logs: extract the error line plus 5 lines of context, not the full stack trace
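
As an illustration of the error-log rule, the sketch below keeps the first matching line and a few neighbours. It assumes that "5 lines of context" means five lines on each side and that errors are marked with the string `ERROR`; both are assumptions, and `extract_error_context` is a hypothetical helper.

```python
def extract_error_context(log_text: str, marker: str = "ERROR", context: int = 5) -> str:
    """Return the first line containing `marker` plus `context` lines on each side."""
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if marker in line:
            start = max(0, index - context)
            end = min(len(lines), index + context + 1)
            return "\n".join(lines[start:end])
    # No matching line: include nothing rather than the whole log.
    return ""
```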

3. Budget Allocation

Total context budget: 100%
├── System prompt: 15-20% (fixed)
├── Active task: 40-50% (working space)
├── Tool schemas: 10-15% (progressive)
├── Memory/History: 10-15% (summarized)
└── Reserve: 5-10% (safety margin)
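
To turn the percentages into token counts, one option is a fixed split that stays inside each range and sums to 100. The specific values below (15/50/15/10/10) are an assumption chosen for illustration, and `allocate` is a hypothetical helper.

```python
# One split that falls inside every range above and sums to 100 (assumed values).
ALLOCATION_PERCENT = {
    "system_prompt": 15,
    "active_task": 50,
    "tool_schemas": 15,
    "memory_history": 10,
    "reserve": 10,
}


def allocate(context_window: int) -> dict[str, int]:
    """Split a context window (in tokens) into per-category token budgets."""
    if sum(ALLOCATION_PERCENT.values()) != 100:
        raise ValueError("allocation percentages must sum to 100")
    return {
        name: context_window * percent // 100
        for name, percent in ALLOCATION_PERCENT.items()
    }
```

For a 20,000-token window this gives 3,000 tokens to the system prompt, 10,000 to the active task, 3,000 to tool schemas, and 2,000 each to history and reserve.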

4. Compression Triggers

When context > 60% full → start compressing history
When context > 80% full → aggressive summarization
When context > 90% full → emergency mode (drop all but current task)
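
The three thresholds map naturally onto a small function. The sketch below is illustrative only: the mode names are hypothetical labels, and the comparison uses integers to avoid floating-point edge cases at exactly 60%, 80%, or 90%.

```python
def compression_mode(used_tokens: int, context_window: int) -> str:
    """Map context usage to a compression mode using the 60/80/90% thresholds."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    used_percent_scaled = used_tokens * 100  # compare against percent * window
    if used_percent_scaled > 90 * context_window:
        return "emergency"          # drop all but the current task
    if used_percent_scaled > 80 * context_window:
        return "aggressive"         # aggressive summarization
    if used_percent_scaled > 60 * context_window:
        return "compress_history"   # start compressing history
    return "normal"
```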

Token Saving Strategies

Strategy 1: Schema Stubs

// Instead of full schema (200+ tokens):
// { "name": "web_search", "parameters": { "query": { "type": "string", ... }, ... } }

// Use stub (15 tokens):
// web_search(query) → search results
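
A stub like this can be derived mechanically. The sketch below assumes the schema shape shown above, where `parameters` is an object keyed by parameter name; the return description is not part of that schema, so it is passed in by the caller. `schema_to_stub` is a hypothetical helper.

```python
def schema_to_stub(schema: dict, returns: str = "result") -> str:
    """Collapse a tool schema to the form `name(param, ...) → returns`."""
    # Iterating a dict yields its keys, i.e. the parameter names.
    params = ", ".join(schema.get("parameters", {}))
    return f"{schema['name']}({params}) → {returns}"
```

With the `web_search` schema above and `returns="search results"`, this produces `web_search(query) → search results`.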

Strategy 2: Conversation Compression

// Before compression (500 tokens of back-and-forth):
User: Can you find the latest Node.js version?
Agent: I'll search for that. [calls web_search]
Agent: The latest Node.js version is v22.22.2...
User: What about LTS?
Agent: [calls web_search] The current LTS is v22.x...

// After compression (30 tokens):
// Resolved: Node.js latest=v22.22.2, LTS=v22.x, user confirmed.

Strategy 3: Selective File Reading

# Instead of: cat package.json  (often 100+ lines)
# Use: jq '.dependencies | keys' package.json  (just what you need)
# Or: head -5 package.json  (name + version)
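
Where `jq` is not available, the same selective read can be approximated in Python. This is a sketch under that assumption; `dependency_names` is a hypothetical helper. It still parses the whole file, but only the dependency names are returned for inclusion in context.

```python
import json
from pathlib import Path


def dependency_names(package_json: Path) -> list[str]:
    """Return the sorted dependency names, like jq '.dependencies | keys'."""
    data = json.loads(package_json.read_text(encoding="utf-8"))
    return sorted(data.get("dependencies", {}))
```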

Strategy 4: Tool Result Filtering

// Instead of returning full API response (2000 tokens)
// Return structured summary (50 tokens):
// ✅ 3 issues found: 2 bugs (P1, P2), 1 feature request
// Key assignees: @alice, @bob
// No urgent items
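
A filter of this kind might look like the sketch below. The record fields (`type`, `assignee`, `urgent`) are assumptions made for illustration and do not correspond to any particular issue-tracker API; `summarize_issues` is a hypothetical helper.

```python
from collections import Counter


def summarize_issues(issues: list[dict]) -> str:
    """Reduce a list of issue records to a three-line summary."""
    kinds = Counter(issue["type"] for issue in issues)
    assignees = sorted({issue["assignee"] for issue in issues if issue.get("assignee")})
    urgent = [issue for issue in issues if issue.get("urgent")]
    breakdown = ", ".join(f"{count} {kind}" for kind, count in kinds.most_common())
    lines = [
        f"{len(issues)} issues found: {breakdown}",
        "Key assignees: " + ", ".join(f"@{name}" for name in assignees),
        f"{len(urgent)} urgent items" if urgent else "No urgent items",
    ]
    return "\n".join(lines)
```

Only the returned summary goes into context; the raw response can be kept outside the conversation in case a later step needs a detail.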

Budget Monitoring

Track token usage per task:

### Token Budget Log — Task: "Build API endpoint"
| Action | Tokens | Running Total | Budget % |
|--------|--------|--------------|----------|
| System prompt | 2,000 | 2,000 | 10% |
| Tool schemas (stub) | 500 | 2,500 | 12.5% |
| Read 3 files (selective) | 1,200 | 3,700 | 18.5% |
| Write code | 800 | 4,500 | 22.5% |
| ... | ... | ... | ... |
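
A log in this format can be kept with a small ledger. The sketch below is one possible shape, with `BudgetLog` as a hypothetical class; the 20,000-token budget in the usage note is inferred from the table, where 2,000 tokens equals 10%.

```python
class BudgetLog:
    """Running token ledger for one task."""

    def __init__(self, budget: int) -> None:
        if budget <= 0:
            raise ValueError("budget must be positive")
        self.budget = budget
        self.total = 0
        self.rows: list[tuple[str, int, int, float]] = []

    def record(self, action: str, tokens: int) -> float:
        """Add an action and return the percentage of the budget used so far."""
        self.total += tokens
        percent = round(100 * self.total / self.budget, 1)
        self.rows.append((action, tokens, self.total, percent))
        return percent

    def to_markdown(self) -> str:
        """Render the ledger in the table format shown above."""
        lines = [
            "| Action | Tokens | Running Total | Budget % |",
            "|--------|--------|--------------|----------|",
        ]
        for action, tokens, total, percent in self.rows:
            lines.append(f"| {action} | {tokens:,} | {total:,} | {percent:g}% |")
        return "\n".join(lines)
```

Creating `BudgetLog(budget=20_000)` and recording the four actions from the table reproduces its running totals and percentages.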

Quick Wins (Apply Immediately)

Replace full file reads with targeted extraction: prefer grep, jq, or awk over cat
Use tool stubs during planning — load full schemas only at execution time
Summarize after every 5 tool calls — don't let raw output accumulate
Set a hard limit: if a single file exceeds 500 lines, read it in windows with offset/limit
Drop completed subtask context — keep decision, drop process
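
The offset/limit rule can be sketched as a windowed reader that never loads more than a fixed number of lines. `read_window` is a hypothetical helper; using the 500-line threshold as the default window size is an assumption.

```python
from itertools import islice
from pathlib import Path


def read_window(path: Path, offset: int = 0, limit: int = 500) -> list[str]:
    """Return at most `limit` lines starting at zero-based line `offset`."""
    with path.open(encoding="utf-8") as handle:
        # islice skips the first `offset` lines without keeping them in memory.
        return [line.rstrip("\n") for line in islice(handle, offset, offset + limit)]
```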

Integration with Agent Workflows

Task received → Estimate token need → Allocate budget → Execute with monitoring
                                                       ↓
                                              Context > 60%? → Compress history
                                                       ↓
                                              Context > 80%? → Aggressive summarization
                                                       ↓
                                              Context > 90%? → Emergency mode

Real Impact

Based on AAI Gateway benchmarks:

Multi-MCP workflows: 99% reduction in schema tokens
Conversation history: 60-80% compressible
File operations: 40-70% savings with selective reading
Overall context efficiency: 3-5x improvement typical

License

MIT
