Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

claw-compactor

v6.0.0

Claw Compactor v6.0 — 50%+ savings through rule-based compression, dictionary encoding, session observation compression, and progressive context loading.


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt below, then paste it into OpenClaw to install aeromomo/cut-your-tokens-97percent-savings-on-session-transcripts-via-observation-extraction.

Prompt Preview: Install & Setup
Install the skill "claw-compactor" (aeromomo/cut-your-tokens-97percent-savings-on-session-transcripts-via-observation-extraction) from ClawHub.
Skill page: https://clawhub.ai/aeromomo/cut-your-tokens-97percent-savings-on-session-transcripts-via-observation-extraction
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install cut-your-tokens-97percent-savings-on-session-transcripts-via-observation-extraction

ClawHub CLI


npx clawhub@latest install cut-your-tokens-97percent-savings-on-session-transcripts-via-observation-extraction
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name/description (token/workspace compression, observation extraction) align with the included code: compression, dictionary, RLE, observation extraction, tiered summaries, and tokenizer optimizers are all present. File accesses (workspace markdown files and a memory/ folder) are expected for this purpose.
Instruction Scope
Runtime instructions and code will read session transcripts from ~/.openclaw/sessions and write compressed observations into workspace memory/.observed-sessions.json and memory/observations/. The compressed_context module explicitly produces 'decompression instructions' intended to be prepended to a model/system prompt — recommending edits to system prompts is powerful and can be abused for prompt-injection. The SKILL.md and source contain LLM prompt templates (COMPRESS_PROMPT) and text that could be used to call LLMs if a user enables that path; you should confirm that you do not inadvertently send sensitive session content to an external LLM. Overall the file operations are coherent with the stated purpose but the system-prompt modification guidance and presence of LLM prompts raise safety concerns.
Install Mechanism
No remote download/install spec is present — the skill is provided as code files (pyproject, scripts). Nothing in the manifest points to fetching arbitrary remote binaries or archives. That reduces supply-chain risk compared to an external download.
Credentials
The skill requests no environment variables, no external credentials, and no special config paths in the registry metadata. Its runtime behavior reads local workspace files and the user's session directory (~/.openclaw/sessions), which is proportionate to a session-transcript compressor but is sensitive data access — the absence of credentials is appropriate.
Persistence & Privilege
The skill is not always-enabled and does not request elevated platform privileges. It writes artifacts inside the workspace (memory/.observed-sessions.json, memory/observations/, memory/.codebook.json) which is expected. However the SKILL.md and compressed_context code recommend prepending decompression instructions to system prompts — that would change model behavior if applied; do not modify system prompts automatically without review. Autonomous invocation (default) is normal but combine caution with the instruction-scope concerns above.
Scan Findings in Context
[system-prompt-override] expected: The compressed_context module explicitly returns 'decompression instructions' intended to be prepended to a system prompt so a model can expand compressed notation. This is functionally related to the stated purpose (making compressed context usable by LLMs) but is also the exact pattern that can be used for prompt-injection — treat as potentially dangerous and review carefully before applying to a live agent's system prompt.
[unicode-control-chars] unexpected: Pre-scan flagged unicode-control characters in SKILL.md. The project uses emojis and non-ASCII characters legitimately (CJK handling, emojis in README), but control/unicode-steering characters can be used for hidden instruction injection. Inspect any README/SKILL.md and compress/decompress code for zero-width or control characters before using in an automated/system-prompt context.
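A minimal hidden-character check along these lines can be done before trusting SKILL.md in any automated or system-prompt context (a standalone sketch, not part of the skill; the set of steering characters flagged is an assumption covering the common zero-width and bidi-override cases):

```python
import unicodedata

# Characters commonly abused for hidden-instruction injection: zero-width
# space/joiners, word joiner, BOM, and bidi embedding/override controls.
SUSPECT = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff",
           "\u202a", "\u202b", "\u202c", "\u202d", "\u202e"}

def find_hidden_chars(text):
    """Return (index, repr, unicode name) for each suspicious character,
    also catching any other control (Cc) or format (Cf) characters
    besides ordinary newlines and tabs."""
    hits = []
    for i, ch in enumerate(text):
        if ch in SUSPECT or (unicodedata.category(ch) in ("Cc", "Cf")
                             and ch not in "\n\r\t"):
            hits.append((i, repr(ch), unicodedata.name(ch, "UNKNOWN")))
    return hits
```

Emojis and CJK text pass through clean; only control and format characters are reported, so legitimate non-ASCII content in this skill's README does not trigger it.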
What to consider before installing
  • Coherence: The code implements what the skill claims (workspace compression, session transcript observation extraction). That part is internally consistent.
  • Sensitive file access: The skill will read session transcripts from ~/.openclaw/sessions and will write memory/.observed-sessions.json and memory/observations/ inside the target workspace. Those session files can contain private data — back up and inspect them first.
  • Prompt-injection risk: The compressed_context module and SKILL.md suggest prepending decompression instructions to model/system prompts. That is functionally necessary to make compressed notation readable by an LLM, but it is also a high-risk operation because it alters model steering. Do NOT automatically modify your agent's system prompt or global model configuration without manual review. If you must use decompression instructions, keep them minimal and review their exact text.
  • LLM use paths: The code contains LLM prompt templates and both rule-based and LLM-driven extraction paths. By default many commands use rule-based functions, but some utilities can generate prompts for LLMs. Before enabling any LLM-based compression, verify there are no unintended network endpoints and confirm which prompts/data would be sent out.
  • Run safely first: Use the --dry-run/benchmark modes and run on a copy of your workspace. Inspect generated artifacts (memory/.observed-sessions.json, memory/.codebook.json, memory/observations/) and the code paths that would call an LLM (search for code that invokes network libraries or external APIs in the omitted files).
  • Check for hidden characters: Scan README/SKILL.md and compressed text for zero-width or control characters that could hide instructions (tools like `cat -v`, hexdump, or editors that show invisibles can help).
  • Review compressed_context behavior: Aggressive 'ultra' compression removes many function words and applies abbreviations; ensure the resulting compressed text plus decompression instructions preserve all the facts you care about, and that you trust the decompression guidance.

If you are not comfortable auditing the code paths that might call an external LLM or that will modify model prompts, run the tool in an isolated environment or container and use only the purely rule-based (no-LLM) modes.

Like a lobster shell, security has layers — review code before you run it.

latest vk978haz4f6gtg0peg6ftde7fjh80xw1x
1.4k downloads
0 stars
2 versions
Updated 13h ago
v6.0.0
MIT-0

🦞 Claw Compactor

Claw Compactor Banner

"Cut your tokens. Keep your facts."

Cut your AI agent's token spend in half. One command compresses your entire workspace — memory files, session transcripts, sub-agent context — using 5 layered compression techniques. Deterministic. Mostly lossless. No LLM required.

Features

  • 5 compression layers working in sequence for maximum savings
  • Zero LLM cost — all compression is rule-based and deterministic
  • Lossless roundtrip for dictionary, RLE, and rule-based compression
  • ~97% savings on session transcripts via observation extraction
  • Tiered summaries (L0/L1/L2) for progressive context loading
  • CJK-aware — full Chinese/Japanese/Korean support
  • One command (full) runs everything in optimal order

5 Compression Layers

| # | Layer | Method | Savings | Lossless? |
|---|-------|--------|---------|-----------|
| 1 | Rule engine | Dedup lines, strip markdown filler, merge sections | 4-8% | ✅ |
| 2 | Dictionary encoding | Auto-learned codebook, $XX substitution | 4-5% | ✅ |
| 3 | Observation compression | Session JSONL → structured summaries | ~97% | ❌* |
| 4 | RLE patterns | Path shorthand ($WS), IP prefix, enum compaction | 1-2% | ✅ |
| 5 | Compressed Context Protocol | ultra/medium/light abbreviation | 20-60% | ❌* |

*Lossy techniques preserve all facts and decisions; only verbose formatting is removed.
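As a hedged illustration of the layer-4 path-shorthand idea, the $WS round trip might look like the sketch below (an assumption that $WS simply stands in for the workspace root; the skill's actual RLE encoder may do more, e.g. IP prefixes and enum compaction):

```python
# Hypothetical sketch of "$WS" path shorthand: every occurrence of the
# workspace root collapses to a three-character token, and expansion is
# the exact inverse, so the round trip is lossless.
def shorten_paths(text, workspace_root):
    return text.replace(workspace_root, "$WS")

def expand_paths(text, workspace_root):
    return text.replace("$WS", workspace_root)
```

Long absolute paths repeated across memory files are where this earns its 1-2%: each repetition shrinks from dozens of characters to three.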

Quick Start

git clone https://github.com/aeromomo/claw-compactor.git
cd claw-compactor

# See how much you'd save (non-destructive)
python3 scripts/mem_compress.py /path/to/workspace benchmark

# Compress everything
python3 scripts/mem_compress.py /path/to/workspace full

Requirements: Python 3.9+. Optional: pip install tiktoken for exact token counts (falls back to heuristic).
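The tiktoken-or-heuristic behavior described above can be sketched as follows (illustrative only; the encoding name cl100k_base is an assumption, and the skill's lib/tokens.py may choose differently):

```python
try:
    import tiktoken  # optional dependency for exact counts
except ImportError:
    tiktoken = None

def count_tokens(text):
    """Exact token count when tiktoken is available, otherwise the
    rough chars-per-token heuristic the README describes."""
    if tiktoken is not None:
        try:
            return len(tiktoken.get_encoding("cl100k_base").encode(text))
        except Exception:
            pass  # e.g. encoding data unavailable offline
    # Heuristic fallback: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)
```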

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      mem_compress.py                        │
│                   (unified entry point)                     │
└──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬────┘
       │      │      │      │      │      │      │      │
       ▼      ▼      ▼      ▼      ▼      ▼      ▼      ▼
  estimate compress  dict  dedup observe tiers  audit optimize
       └──────┴──────┴──┬───┴──────┴──────┴──────┴──────┘
                        ▼
                  ┌────────────────┐
                  │     lib/       │
                  │ tokens.py      │ ← tiktoken or heuristic
                  │ markdown.py    │ ← section parsing
                  │ dedup.py       │ ← shingle hashing
                  │ dictionary.py  │ ← codebook compression
                  │ rle.py         │ ← path/IP/enum encoding
                  │ tokenizer_     │
                  │   optimizer.py │ ← format optimization
                  │ config.py      │ ← JSON config
                  │ exceptions.py  │ ← error types
                  └────────────────┘

Commands

All commands: python3 scripts/mem_compress.py <workspace> <command> [options]

| Command | Description | Typical Savings |
|---------|-------------|-----------------|
| full | Complete pipeline (all steps in order) | 50%+ combined |
| benchmark | Dry-run performance report | |
| compress | Rule-based compression | 4-8% |
| dict | Dictionary encoding with auto-codebook | 4-5% |
| observe | Session transcript → observations | ~97% |
| tiers | Generate L0/L1/L2 summaries | 88-95% on sub-agent loads |
| dedup | Cross-file duplicate detection | varies |
| estimate | Token count report | |
| audit | Workspace health check | |
| optimize | Tokenizer-level format fixes | 1-3% |

Global Options

  • --json — Machine-readable JSON output
  • --dry-run — Preview changes without writing
  • --since YYYY-MM-DD — Filter sessions by date
  • --auto-merge — Auto-merge duplicates (dedup)

Real-World Savings

| Workspace State | Typical Savings | Notes |
|-----------------|-----------------|-------|
| Session transcripts (observe) | ~97% | Megabytes of JSONL → concise observation MD |
| Verbose/new workspace | 50-70% | First run on unoptimized workspace |
| Regular maintenance | 10-20% | Weekly runs on active workspace |
| Already-optimized | 3-12% | Diminishing returns — workspace is clean |

cacheRetention — Complementary Optimization

Before compression runs, enable prompt caching for a 90% discount on cached tokens:

{
  "models": {
    "model-name": {
      "cacheRetention": "long"
    }
  }
}

Compression reduces token count; caching reduces cost per token. Together: 50% compression + 90% cache discount = 95% effective cost reduction.
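The 95% figure follows from multiplying the two factors:

```python
compression_keep = 0.5   # 50% compression keeps half the tokens
cache_cost = 0.1         # 90% cache discount: cached tokens cost 10%
effective_cost = compression_keep * cache_cost  # 0.05 of original spend
print(f"effective cost reduction: {1 - effective_cost:.0%}")  # → 95%
```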

Heartbeat Automation

Run weekly or on heartbeat:

## Memory Maintenance (weekly)
- python3 skills/claw-compactor/scripts/mem_compress.py <workspace> benchmark
- If savings > 5%: run full pipeline
- If pending transcripts: run observe

Cron example:

0 3 * * 0 cd /path/to/skills/claw-compactor && python3 scripts/mem_compress.py /path/to/workspace full

Configuration

Optional claw-compactor-config.json in workspace root:

{
  "chars_per_token": 4,
  "level0_max_tokens": 200,
  "level1_max_tokens": 500,
  "dedup_similarity_threshold": 0.6,
  "dedup_shingle_size": 3
}

All fields optional — sensible defaults are used when absent.
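A defaults-then-override merge like the one described might look like this (a sketch under the assumption that unknown keys simply pass through; the skill's lib/config.py may differ):

```python
import json
from pathlib import Path

# Defaults mirroring the documented config fields.
DEFAULTS = {
    "chars_per_token": 4,
    "level0_max_tokens": 200,
    "level1_max_tokens": 500,
    "dedup_similarity_threshold": 0.6,
    "dedup_shingle_size": 3,
}

def load_config(workspace):
    """Start from defaults, then overlay any claw-compactor-config.json
    found in the workspace root."""
    cfg = dict(DEFAULTS)
    path = Path(workspace) / "claw-compactor-config.json"
    if path.exists():
        cfg.update(json.loads(path.read_text()))
    return cfg
```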

Artifacts

| File | Purpose |
|------|---------|
| memory/.codebook.json | Dictionary codebook (must travel with memory files) |
| memory/.observed-sessions.json | Tracks processed transcripts |
| memory/observations/ | Compressed session summaries |
| memory/MEMORY-L0.md | Level 0 summary (~200 tokens) |

FAQ

Q: Will compression lose my data? A: Rule engine, dictionary, RLE, and tokenizer optimization are fully lossless. Observation compression and CCP are lossy but preserve all facts and decisions.

Q: How does dictionary decompression work? A: decompress_text(text, codebook) expands all $XX codes back. The codebook JSON must be present.
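The expansion the FAQ describes could be sketched like this (illustrative only; the shipped decompress_text may differ in details such as code format and error handling):

```python
import re

def decompress_text(text, codebook):
    """Expand $XX codes back to their originals using the codebook
    mapping loaded from memory/.codebook.json."""
    if not codebook:
        return text
    # Match longer codes first so "$AB1" is never split into "$AB" + "1".
    pattern = re.compile("|".join(
        re.escape(code) for code in sorted(codebook, key=len, reverse=True)))
    return pattern.sub(lambda m: codebook[m.group(0)], text)
```

This is why the codebook must travel with the memory files: without the mapping, the $XX codes are unrecoverable.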

Q: Can I run individual steps? A: Yes. Every command is independent: compress, dict, observe, tiers, dedup, optimize.

Q: What if tiktoken isn't installed? A: Falls back to a CJK-aware heuristic (chars÷4). Results are ~90% accurate.

Q: Does it handle Chinese/Japanese/Unicode? A: Yes. Full CJK support including character-aware token estimation and Chinese punctuation normalization.

Troubleshooting

  • FileNotFoundError on workspace: Ensure path points to workspace root (contains memory/ or MEMORY.md)
  • Dictionary decompression fails: Check memory/.codebook.json exists and is valid JSON
  • Zero savings on benchmark: Workspace is already optimized — nothing to do
  • observe finds no transcripts: Check sessions directory for .jsonl files
  • Token count seems wrong: Install tiktoken: pip3 install tiktoken

Credits

License

MIT
