Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Skill Distiller (Reference)

v0.2.1

Complete skill compression documentation — all options, modes, and calibration details (~2,500 tokens).

by Lee Brown (@leegitw)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for leegitw/neon-skill-distiller-reference.

Prompt preview: Install & Setup
Install the skill "Skill Distiller (Reference)" (leegitw/neon-skill-distiller-reference) from ClawHub.
Skill page: https://clawhub.ai/leegitw/neon-skill-distiller-reference
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install neon-skill-distiller-reference

ClawHub CLI

Package manager switcher

npx clawhub@latest install neon-skill-distiller-reference
Security Scan
Capability signals
Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal
Benign
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The skill's stated purpose (compress/distill other skills) aligns with the SKILL.md content: parsing sections, scoring importance, and pruning. However, the doc assumes access to an LLM provider (ollama/local, Gemini, OpenAI) for classification and summarization; the registry metadata did not declare any required provider credentials or binaries. Asking for LLM access is reasonable for the task, but it should be declared explicitly.
Instruction Scope
Runtime instructions tell the agent to run 'ollama list' and to check GEMINI_API_KEY and OPENAI_API_KEY environment variables to auto-detect providers. The skill's declared requirements list no binaries, no env vars, and no config paths, so the SKILL.md references system state (commands and env vars) that are not declared. This is scope creep and should be made explicit.
Install Mechanism
No install spec and no code files — lowest-risk delivery model. Nothing will be written to disk by an installer.
Credentials
The instructions reference GEMINI_API_KEY and OPENAI_API_KEY (and imply running ollama) even though requires.env and primary credential are empty. Requesting LLM credentials is proportionate to the task, but those env vars should be listed in the skill metadata so users know what secrets the skill may access.
Persistence & Privilege
always:false and disable-model-invocation:true; the skill does not request persistent presence or elevated platform privileges. No indications it will modify other skills or system settings.
What to consider before installing
This is an instruction-only reference for compressing skills and is broadly coherent with that purpose, but it references running 'ollama list' and checking GEMINI_API_KEY / OPENAI_API_KEY even though the skill metadata lists no required binaries or environment variables. Before installing or using this skill:

  1. Confirm how your agent will execute the provider-detection steps — will it run local commands like 'ollama list'?
  2. Don't expose API keys unless you understand and trust where they'll be used; ask the maintainer to add required env vars/binaries to the metadata so the scope is explicit.
  3. Test with --dry-run and with non-sensitive sample skills first.
  4. If your platform enforces env-var access controls, ensure those protections are in place.

If you need higher assurance, request the author clarify how LLM calls are performed given disable-model-invocation:true and add declared requirements to the registry entry.

Like a lobster shell, security has layers — review code before you run it.

compression · context-window · documentation · latest · openclaw · optimization · reference · skills · token-reduction
65 downloads
0 stars
1 version
Updated 1w ago
v0.2.1
MIT-0

Skill Distiller (Reference)

Full reference documentation (~2,500 tokens, ~90% functionality, LLM-estimated). This is the canonical source for all compressed variants.

Note: This reference document intentionally exceeds the 300-line MCE guideline. As the canonical source for all compressed variants, comprehensiveness is prioritized over brevity. For compact variants, see the main SKILL.md (~400 tokens, formula) or compressed (~975 tokens, prose).

Agent Identity

  • Role: Help users compress verbose skills to reduce context window usage
  • Understands: Skills are verbose for human clarity but costly for context; compression is a trade-off
  • Approach: Identify section types, score importance, remove/shorten low-value sections
  • Boundaries: Preserve functionality, report what was removed, never hide trade-offs
  • Tone: Technical, precise, transparent about trade-offs

Data handling: This skill operates within your agent's trust boundary. All analysis uses your agent's configured model. No external APIs beyond your agent's LLM provider.

When to Use

Activate this skill when the user asks:

  • "Compress this skill"
  • "Make this skill smaller"
  • "Distill this skill to X tokens"
  • "What can I remove from this skill?"
  • "Reduce skill context usage"

Options

| Flag | Default | Description |
|------|---------|-------------|
| `--mode` | threshold | Compression mode: threshold, tokens, oneliner |
| `--threshold` | 0.9 | Functionality preservation target (0.0-1.0) |
| `--tokens` | - | Target token count (for tokens mode) |
| `--provider` | auto | LLM provider: auto, ollama, gemini, openai |
| `--model` | - | Specific model (e.g., llama3.2, gemini-2.0-flash) |
| `--verbose` | false | Show section-by-section analysis |
| `--dry-run` | false | Analyze without outputting compressed skill |
| `--debug-stages` | false | Show intermediate stage outputs |

Provider auto-detection (in order):

  1. Check ollama availability via ollama list
  2. Check GEMINI_API_KEY environment variable
  3. Check OPENAI_API_KEY environment variable
  4. Error if none available

Process

1. Provider Detection

IF ollama available → use ollama (local, fast)
ELIF GEMINI_API_KEY set → use gemini
ELIF OPENAI_API_KEY set → use openai
ELSE → Error: "No LLM provider available. Run 'ollama serve' for local inference, or set GEMINI_API_KEY for cloud."
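The detection order above can be sketched in Python. This is a minimal illustration, not the skill's own implementation: the env-var names come from the doc, and the ollama check is approximated by looking for the binary on PATH rather than running `ollama list`.

```python
import os
import shutil


def detect_provider() -> str:
    """Auto-detect an LLM provider in the documented priority order.

    Sketch only: a real check would run 'ollama list' and inspect the
    result; here we just look for the binary on PATH.
    """
    if shutil.which("ollama"):              # local inference available
        return "ollama"
    if os.environ.get("GEMINI_API_KEY"):
        return "gemini"
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    raise RuntimeError(
        "No LLM provider available. Run 'ollama serve' for local "
        "inference, or set GEMINI_API_KEY for cloud."
    )
```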

2. Parse Skill

Parse input skill into sections:

  • Frontmatter (YAML header with name, description)
  • Headers (##, ### sections)
  • Code blocks (examples, output format)
  • Lists (triggers, constraints)
  • Prose (explanations)
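A minimal section splitter for the structure above might look like this. It handles only the frontmatter and `##`/`###` headers; separating code blocks, lists, and prose within each section is left out for brevity.

```python
import re


def parse_sections(skill_md: str) -> list[dict]:
    """Split a SKILL.md into frontmatter plus header-delimited sections."""
    sections = []
    body = skill_md
    # Frontmatter: a leading --- ... --- YAML block
    m = re.match(r"^---\n(.*?)\n---\n", skill_md, re.DOTALL)
    if m:
        sections.append({"type": "frontmatter", "content": m.group(1)})
        body = skill_md[m.end():]
    # Split on ## / ### lines, keeping each heading with its body
    for chunk in re.split(r"(?m)^(?=#{2,3} )", body):
        if chunk.strip():
            sections.append({"type": "section", "content": chunk.strip()})
    return sections
```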

3. Classify Sections

Classify each section using LLM:

| Type | Importance | Compressible? |
|------|------------|---------------|
| TRIGGER | 1.0 (required) | No |
| CORE_INSTRUCTION | 1.0 (required) | No |
| CONSTRAINT | 0.9 | Partially |
| OUTPUT_FORMAT | 0.8 | Partially |
| ADVISORY_PATTERN | 0.7 | Yes, with warning |
| EXAMPLE | 0.5 | Yes (reduce count) |
| EDGE_CASE | 0.4 | Yes (summarize) |
| EXPLANATION | 0.3 | Yes (remove) |
| VERBOSE_DETAIL | 0.2 | Yes (remove first) |

Protected patterns (boost to 0.85+):

  • YAML name/description (REQUIRED BY SPEC)
  • Task creation references
  • N-count tracking (N=1, N=2, N≥3)
  • Checkpoint/state recovery
  • BEFORE/AFTER markers
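The boost rule can be sketched as a pattern scan. The regexes below are hypothetical stand-ins for the markers listed above; real skills may phrase these differently.

```python
import re

# Hypothetical regexes for the protected patterns listed above;
# exact markers vary from skill to skill.
PROTECTED_PATTERNS = {
    "yaml-meta": re.compile(r"^(name|description):", re.MULTILINE),
    "n-count": re.compile(r"\bN\s*[=≥]\s*\d+"),
    "checkpoint": re.compile(r"\bcheckpoint\b", re.IGNORECASE),
    "before-after": re.compile(r"\bBEFORE/AFTER\b"),
}


def boost_importance(section_text: str, base_score: float) -> float:
    """Boost a section to at least 0.85 if any protected pattern matches."""
    if any(p.search(section_text) for p in PROTECTED_PATTERNS.values()):
        return max(base_score, 0.85)
    return base_score
```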

3.5. Token-Level Importance (LLMLingua/Selective Context)

For compressible sections (EXAMPLE, EDGE_CASE, EXPLANATION, VERBOSE_DETAIL), apply token-level scoring based on self-information theory:

Principle: self_info(token) = -log(P(token|context))

  • HIGH self-information = surprising/important → KEEP
  • LOW self-information = predictable/removable → PRUNE

Scoring Prompt:

Analyze this section for compression. Classify phrases as:
- ESSENTIAL: Specific values, unique terms, key constraints, surprising info
- REDUNDANT: Could be inferred, restates earlier content, filler words

Section type: {type}
Content: {content}

Return JSON: {"essential": [...], "redundant": [...], "compression_potential": 0.0-1.0}

Pruning Rules:

  1. Never prune ESSENTIAL tokens
  2. Remove REDUNDANT phrases while preserving sentence structure
  3. If >50% is REDUNDANT, consider removing entire section
  4. Preserve specific values (numbers, thresholds, error codes)

Example:

Input:  "The system will then proceed to process the input data and return the results"
Output: "process input data → return results"
Removed: "The system will then proceed to", "and"
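Given the REDUNDANT phrases returned by the scoring prompt, pruning rule 2 reduces to phrase removal plus whitespace cleanup. This naive sketch does plain substring replacement; a careful version would use word boundaries so short phrases cannot match inside other words.

```python
def prune_redundant(text: str, redundant: list[str]) -> str:
    """Remove REDUNDANT phrases while leaving the rest of the sentence intact.

    Naive sketch: substring replacement, then collapse leftover whitespace.
    """
    for phrase in redundant:
        text = text.replace(phrase, "")
    return " ".join(text.split())
```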

4. Apply Compression

All modes use MetaGlyph native symbols (see §Symbol Reference below):

  • Replace "results in" → →
  • Replace "implies" → ⇒
  • Replace "for all" → ∀
  • Replace "not" → ¬

Threshold Mode (default):

  1. Sort sections by importance (descending)
  2. Include sections until functionality target reached
  3. Apply symbol substitutions
  4. Generate compressed markdown
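Steps 1-2 of threshold mode can be sketched as a greedy selection. This assumes each section carries an `importance` score from the classification step and approximates functionality as importance-weighted coverage, which is a simplification of the semantic evaluation described later.

```python
def compress_threshold(sections: list[dict], threshold: float = 0.9) -> list[dict]:
    """Keep the highest-importance sections until the coverage target is met.

    Sketch: functionality is approximated as the kept share of total
    importance, not the LLM-based semantic score used in step 5.
    """
    total = sum(s["importance"] for s in sections)
    kept, covered = [], 0.0
    for s in sorted(sections, key=lambda s: s["importance"], reverse=True):
        if covered / total >= threshold:
            break
        kept.append(s)
        covered += s["importance"]
    return kept
```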

Token-Target Mode:

  1. Calculate minimum tokens (triggers + core)
  2. If target < minimum: attempt LLM summarization
  3. Add sections by importance until target
  4. Apply symbol substitutions

One-Liner Mode:

  1. Extract trigger conditions
  2. Extract core action
  3. Extract expected result
  4. Format as 3-line summary with symbols

4.1. Example Compression (RECOMP)

For EXAMPLE sections, use extractive + abstractive compression:

Phase 1: Extractive Selection (preferred)

  1. Score each example for pattern coverage, uniqueness, clarity
  2. Select top 1-2 examples (keep full detail)
  3. Compress remaining examples to one-liners (don't discard)
  4. If coverage ≥ 0.8 → use selected + one-liners

Phase 1.5: One-Liner Compression (for non-selected examples)

Instead of discarding, compress to: {trigger} → {result}

  • Use MetaGlyph symbols
  • Preserves coverage, reduces tokens
  • Group under "### One-liners:" heading

Output structure:

### Full (selected):
[Detailed example with steps...]

### One-liners (compressed):
- `--mode=tokens` → hard token limit
- `--verbose` → section-by-section analysis

Phase 2: Abstractive Fallback (if extractive + one-liners < 0.8)

  1. Generate summary example combining key patterns
  2. Format: {generalized_pattern} → {expected_result}
  3. Preserve specific values where critical

Selection Prompt:

Select the 1-2 BEST examples that cover the most patterns with minimum redundancy.

Examples:
{numbered_list}

Return JSON: {
  "selected": [indices],
  "coverage": 0.0-1.0,
  "rationale": "..."
}

Synthesis Prompt (if coverage < 0.8):

Compress these examples into 1-2 summary examples.
Preserve: specific values, key patterns, expected outcomes.
Format: "When X → Do Y → Expect Z"

{examples}

Output shows: Examples: 5 → 2 full + 3 one-liners (extractive) or Examples: 5 → 1 (abstractive)
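Assembling the RECOMP output structure from the selection result can be sketched as follows. The one-liner compression itself would be an LLM call; here each non-selected example is stood in for by its first line.

```python
def recomp_output(examples: list[str], selected: list[int]) -> str:
    """Build the 'Full (selected)' + 'One-liners (compressed)' structure.

    Sketch: real one-liners come from LLM compression; this placeholder
    just takes each non-selected example's first line.
    """
    full = [examples[i] for i in selected]
    rest = [examples[i] for i in range(len(examples)) if i not in selected]
    out = ["### Full (selected):"] + full + ["", "### One-liners (compressed):"]
    out += [f"- {e.splitlines()[0]}" for e in rest]
    return "\n".join(out)
```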

5. Measure Functionality

Evaluate by semantic understanding, NOT metrics.

| Wrong (Metrics) | Right (Semantic) |
|-----------------|------------------|
| "60% line reduction is too aggressive" | "Can an agent still execute this skill?" |
| "Token count exceeds target" | "Are all triggers and actions preserved?" |
| "Ratio doesn't match threshold" | "Would an agent behave the same way?" |

LLM evaluates preservation by asking:

  • Can all original triggers still activate?
  • Are all core actions still specified?
  • Are critical constraints preserved?
  • Would an agent behave the same way?

Score 0-100 reflects semantic capability preservation, not line/token ratios. A skill compressed to 40% of original size can still preserve 95% functionality if the removed content was verbose explanation, redundant examples, or non-essential formatting.

6. Save Calibration

After compression, save entry to .learnings/skill-distiller/calibration.jsonl:

{
  "id": "c[N]",
  "timestamp": "[ISO 8601]",
  "skill": "[skill name from frontmatter]",
  "mode": "[threshold|tokens|oneliner]",
  "threshold": 0.9,
  "input_tokens": 1800,
  "output_tokens": 1100,
  "reduction_pct": 39,
  "sections_total": 14,
  "sections_kept": 9,
  "sections_removed": 5,
  "classification_confidence_mean": 0.90,
  "functionality_score": 90,
  "protected_patterns_found": ["n-count"],
  "protected_patterns_preserved": ["n-count"],
  "advisory_patterns_found": [],
  "advisory_patterns_removed": [],
  "expected": {"functionality": 90},
  "actual": null
}

File rotation: If entries > 1000, truncate oldest 100 before appending.
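The append-with-rotation behavior can be sketched directly. The path and the 1000/100 numbers come from the doc; the line-based rewrite is an assumption about how rotation is performed.

```python
import json
from pathlib import Path

CAL_PATH = Path(".learnings/skill-distiller/calibration.jsonl")


def append_calibration(entry: dict, path: Path = CAL_PATH,
                       max_entries: int = 1000) -> None:
    """Append a calibration entry; if the file already holds more than
    max_entries lines, truncate the oldest 100 before appending."""
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = path.read_text().splitlines() if path.exists() else []
    if len(lines) > max_entries:
        lines = lines[100:]          # drop the 100 oldest entries
    lines.append(json.dumps(entry))
    path.write_text("\n".join(lines) + "\n")
```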

7. Output Result

Return compressed skill with metadata.


Output

Standard Output

Functionality preserved: 90% (uncalibrated - first 5 compressions build baseline)
Tokens: 2000 → 1800 (10% reduction)
Classification confidence: 0.87 (mean across sections)

Removed: 3 examples, 2 edge cases
Kept: all triggers, core instructions, constraints

[Compressed skill markdown follows...]

Verbose Output (--verbose)

Section Analysis:
  ## When to Use: TRIGGER (1.0, confidence: 0.95) → KEPT
  ## Process: CORE_INSTRUCTION (1.0, confidence: 0.92) → KEPT
  ## Examples: EXAMPLE (0.5, confidence: 0.88) → RECOMP
    └─ Examples: 5 → 2 full + 3 one-liners (coverage: 0.95)
  ## Edge Cases: EDGE_CASE (0.4, confidence: 0.85) → SUMMARIZED
  ## Technical Details: VERBOSE_DETAIL (0.2) → TOKEN-SCORED
    └─ Tokens: 150 → 45 (redundant phrases removed)

Token-level analysis:
  VERBOSE_DETAIL sections: 3 analyzed, 67% average compression
  Phrases removed: "The system will then", "proceed to", "in order to"

RECOMP example analysis:
  Original examples: 5
  Full (selected): [0, 4] (basic usage, error handling)
  One-liners: [1, 2, 3] (threshold, tokens, verbose options)
  Coverage: 0.95 (full + one-liners)
  Mode: extractive

Protected patterns found: none
Advisory patterns found: parallel-decision → removed (no score penalty)

[Compressed skill markdown follows...]

One-Liner Output

TRIGGER: [activation conditions]
ACTION: [core behavior]
RESULT: [expected output]

Example with MetaGlyph symbols:

TRIGGER: user asks "compress skill" ∨ "distill" ∨ "reduce context"
ACTION: parse → classify sections → score importance → ¬low-value → output
RESULT: compressed skill ∧ functionality% ∧ token reduction stats

Compression Modes

Mode 1: Threshold (Default)

Preserve X% of functionality, compress as much as possible.

/skill-distiller path/to/skill.md --threshold=0.9

Understanding thresholds: The threshold (0.9 = 90%) refers to semantic functionality, not token/line ratios.

| Threshold | Meaning | NOT |
|-----------|---------|-----|
| 0.95 | 95% of capabilities preserved | 95% of lines kept |
| 0.90 | 90% of semantic function | 90% of tokens |
| 0.80 | 80% of agent behavior | 80% of bytes |

A 0.9 threshold can result in 50%+ line reduction if the removed content was verbose examples, redundant explanations, or formatting. Judge quality by semantic analysis, not size ratios.

Why 0.9 default: Skill functionality is normally distributed across sections. Wide tails mean some "low importance" sections occasionally carry critical value for edge cases. At 0.9, you preserve more of the tail while still achieving meaningful compression.

Mode 2: Token Target

Compress to exact token budget.

/skill-distiller path/to/skill.md --tokens=500

Token estimation: Uses 4 chars/token heuristic. Accuracy: +/-20% vs actual provider tokenization. For precise limits, verify with provider's tokenizer.
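The heuristic is one line of arithmetic; a sketch, flooring at one token so empty-ish inputs still count:

```python
def estimate_tokens(text: str) -> int:
    """4 chars/token heuristic; expect roughly +/-20% error versus a
    provider's real tokenizer."""
    return max(1, round(len(text) / 4))
```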

Mode 3: One-Liner

Extreme compression for quick reference.

/skill-distiller path/to/skill.md --mode=oneliner

Produces 3-line summary: TRIGGER/ACTION/RESULT


Protected Patterns

These patterns must be preserved even if they look verbose:

| Pattern | Why Protected |
|---------|---------------|
| YAML name/description | REQUIRED by Agent Skills spec |
| Task creation | Compaction resilience |
| N-count tracking | Observation workflow |
| Checkpoint/state | State recovery |
| BEFORE/AFTER | Self-calibration |

If a protected pattern is removed, the functionality score is penalized (-10% per pattern) and flagged explicitly in output.
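The penalty rule can be sketched as a small scoring helper; the -10% per removed pattern figure is from the doc.

```python
def penalize_score(functionality: int, found: list[str],
                   preserved: list[str]) -> tuple[int, list[str]]:
    """Apply -10 points per removed protected pattern; return the adjusted
    score and the pattern names to flag in the output."""
    removed = [p for p in found if p not in preserved]
    return max(0, functionality - 10 * len(removed)), removed
```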


Advisory Patterns

These patterns improve efficiency but aren't required:

| Pattern | Impact if Removed |
|---------|-------------------|
| Parallel/serial decision | Suboptimal execution order |
| Performance hints | Slower but functional |
| Caching guidance | Works but inefficient |

Removed advisory patterns are flagged with a warning but do not penalize the functionality score.


Symbol Reference (MetaGlyph)

Native mathematical symbols LLMs understand from pre-training. Zero legend overhead.

| Symbol | Replaces | Example |
|--------|----------|---------|
| → | results in, leads to, produces | trigger → action |
| ⇒ | implies, therefore, thus | condition ⇒ behavior |
| ∈ | belongs to, is in, member of | value ∈ {a, b, c} |
| ∀ | for all, for every, for each | ∀ files: validate |
| ¬ | not, doesn't, isn't | ¬empty → process |
| ∃ | there exists, there is | ∃ config → load |
| ∧ | and, also, plus | valid ∧ safe → proceed |
| ∨ | or, either | error ∨ timeout → retry |

Application: Symbols replace verbose phrases in compressed output. No legend needed in output—LLMs comprehend natively.
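A naive substitution pass over this table might look like the sketch below. Phrases are applied longest-first, and the single-word replacements ("not", "and", "or") are deliberately omitted because they need word-boundary handling to avoid mangling words that contain them.

```python
# Phrase → symbol mapping drawn from the reference table above.
SYMBOLS = {
    "results in": "→", "leads to": "→", "produces": "→",
    "implies": "⇒", "therefore": "⇒",
    "belongs to": "∈", "member of": "∈",
    "for all": "∀", "for every": "∀", "for each": "∀",
    "there exists": "∃", "there is": "∃",
}


def substitute_symbols(text: str) -> str:
    """Replace verbose phrases with MetaGlyph symbols, longest phrase first."""
    for phrase in sorted(SYMBOLS, key=len, reverse=True):
        text = text.replace(phrase, SYMBOLS[phrase])
    return text
```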

Research: Based on MetaGlyph (arXiv:2601.07354) - 62-81% compression with native symbols.


Calibration

Storage: .learnings/skill-distiller/calibration.jsonl

Each compression run is logged with expected functionality score. User feedback updates the actual field, enabling calibration over time.

Calibration Stages

| N-count | Output | Meaning |
|---------|--------|---------|
| N < 5 | 90% (uncalibrated - first 5 compressions build baseline) | LLM-only estimate |
| N = 5-10 | 88% (building baseline, N=7) | Gathering data |
| N > 10 | 88% +/- 4% (calibrated, N=12) | Historical data informs CI |

Feedback Recording

After using a compressed skill, report actual outcome to improve future estimates:

/skill-distiller feedback --id=c1 --actual=85 --outcome="worked"

Outcome values: worked, partial, failed

This updates the calibration entry, enabling the system to learn from real-world usage.

Viewing Calibration Data

cat .learnings/skill-distiller/calibration.jsonl | jq -s 'length'  # Count entries
cat .learnings/skill-distiller/calibration.jsonl | jq -s 'map(select(.actual != null))' # Entries with feedback

Self-Compression

This skill can compress itself (meta-recursive).

Guardrails:

  • Higher threshold: Require 95% functionality (not 90%)
  • Recursion check: Detect self-referential compression
  • Preserve original: Output to SKILL.compressed.md, never overwrite
  • Manual verification: Require human approval

Why 0.95 threshold: The distiller must remain fully functional to distill other skills. Capability loss compounds (0.95 x 0.95 = 0.90 at next level).

Enforcement: The 0.95 threshold is a documented guardrail, not an automated check. When compressing skill-distiller variants, manually verify you're using --threshold=0.95 or higher. The formula variant (SKILL.md) should never be the input for self-compression.


Error Handling

| Error | Recovery Hint |
|-------|---------------|
| No content | Provide a valid SKILL.md file path or pipe content via stdin |
| No frontmatter | Add `---` block with name and description |
| No trigger section | Add '## When to Use' for best results |
| Token target impossible | Use `--mode=oneliner` for extreme compression |
| LLM unavailable | Run 'ollama serve' for local, or set GEMINI_API_KEY |
| Already minimal | No compression possible at threshold Y |

Future

Planned features not yet implemented:

| Flag | Description | Status |
|------|-------------|--------|
| `--with-ci` | Calculate confidence interval (3x LLM calls) | Planned |

Related

| Variant | Tokens | Functionality |
|---------|--------|---------------|
| skill-distiller (main) | ~400 | ~90% (formula) |
| compressed | ~975 | ~90% (prose) |
| oneliner | ~100 | ~70% |
| reference (this) | ~2,500 | ~90% |

Token counts are estimates using 4 chars/token heuristic. Functionality scores are LLM-estimated.

