Context Builder

Generate LLM-optimized codebase context from any directory using the context-builder CLI

MIT-0 · Free to use, modify, and redistribute. No attribution required.


by @igorls


Security Scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

✓ Purpose & Capability

The skill's name/description (generate LLM-optimized context from a directory) matches the instructions: it calls a context-builder CLI, describes filters, signatures, and token budgeting. The install guidance (cargo install with tree-sitter feature or GitHub releases) is consistent with a CLI that needs language parsers. There are no unrelated env vars, binaries, or config paths requested.

ℹ Instruction Scope

The SKILL.md explicitly instructs reading file contents from an input directory (central to the skill). It also documents safeguards (auto-ignore heavy dirs, respects .gitignore, binary detection) and warns that outputs may contain embedded secrets and should be reviewed. This is appropriate for the purpose, but agent workflows that use -y to skip prompts can cause automatic, unreviewed ingestion if the agent is fed an overly broad path — exercise caution and ensure explicit, scoped paths.

ℹ Install Mechanism

There is no registry install spec in the skill bundle (instruction-only), but SKILL.md directs installation via cargo (crates.io) or downloading prebuilt binaries from GitHub Releases with SHA256 checksums. Both sources are standard; building requires the Rust toolchain and network access. Recommend verifying checksums and preferring distro packages or pinned release artifacts if available.

✓ Credentials

The skill requests no environment variables or credentials. While the tool will read repository files (which may contain secrets), the declared requirements are minimal and proportionate to the stated functionality.

✓ Persistence & Privilege

The skill does not request 'always: true' and has no install-time persistence specified. It does not ask to modify other skills or system-wide settings. Normal autonomous invocation is allowed by default (disable-model-invocation is false), which is expected for an agentic CLI helper.

Assessment

This skill appears coherent for producing LLM-ready snapshots of projects. Before installing or running it:

1) Only point it at explicit project roots (avoid ~, /etc, ~/.ssh, ~/.aws, or other credential stores).
2) Prefer running a token-count dry run or using filters (--token-count, -f), and review outputs before sharing (the tool can surface embedded keys/secrets).
3) Verify prebuilt binary checksums from the GitHub Releases page, or build from source in a sandboxed environment.
4) Be cautious with automated agent runs that use -y (they skip confirmations); require explicit absolute paths in agent prompts to avoid accidental wide scans.


Current version: v0.8.3-1


License

MIT-0. Terms: https://spdx.org/licenses/MIT-0.html

SKILL.md

Context Builder — Agentic Skill

Generate a single, structured markdown file from any codebase directory. The output is optimized for LLM consumption with relevance-based file ordering, AST-aware code signatures, automatic token budgeting, and smart defaults.

Installation

# Requires Rust toolchain. Builds from source with cryptographic verification via crates.io.
cargo install context-builder --features tree-sitter-all

Pre-built binaries with SHA256 checksums are also available for manual download from GitHub Releases.

Verify: context-builder --version (expected: 0.8.3)

Security & Path Scoping

IMPORTANT: This tool reads file contents from the specified directory. Agents MUST follow these rules:

Only target explicit project directories — always pass the exact project root (e.g., /home/user/projects/myapp). Never point at home directories, system paths, or credential stores (~/.ssh, ~/.aws, /etc, ~, /)
Use scoped filters — use -f to limit to known source extensions (e.g., -f rs,toml,md), reducing exposure surface
Output to project-local paths — write output to the project's docs/ folder or /tmp/, never to shared or public locations
Review before sharing — the output may contain API keys, secrets, or credentials embedded in source files; always review or use .gitignore patterns to exclude sensitive files
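
A minimal sketch of how an agent wrapper might enforce the first rule before invoking the CLI. The function name `is_scoped_project_path`, the deny-lists, and the purely lexical check (no symlink resolution) are illustrative assumptions and are not part of context-builder.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical deny-lists drawn from the rules above; extend them for your environment.
FORBIDDEN_DIR_NAMES = {".ssh", ".aws"}
FORBIDDEN_PREFIXES = (PurePosixPath("/etc"),)


def is_scoped_project_path(raw: str, home: str) -> bool:
    """Return True only for absolute paths that avoid the locations listed above."""
    if not raw.startswith("/"):
        return False  # rejects "~", relative paths, and empty strings
    # Normalise lexically so "/a/b/.." is compared as "/a".
    path = PurePosixPath(posixpath.normpath("/" + raw.lstrip("/")))
    home_path = PurePosixPath(posixpath.normpath(home))
    if path == PurePosixPath("/") or path == home_path:
        return False
    if path in home_path.parents:
        return False  # an ancestor of the home directory is broader than home itself
    if any(path == p or p in path.parents for p in FORBIDDEN_PREFIXES):
        return False
    return not any(part in FORBIDDEN_DIR_NAMES for part in path.parts)
```

A wrapper could call this with the agent-supplied path and refuse to run the CLI when it returns False. Because the check is lexical, a symlink that points into a credential store would still pass, so it complements the review step and does not replace it.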

Built-in protections (always active, no configuration needed):

Excludes .git/, node_modules/, and 19 other heavy/sensitive directories at any depth
Respects .gitignore rules when a .git directory is present
Binary files are auto-detected and skipped via UTF-8 sniffing
Output file and cache directory are auto-excluded to prevent self-ingestion

When to Use

Deep code review — Feed an entire codebase to an LLM for architecture analysis or bug hunting
Onboarding — Generate a project snapshot for understanding unfamiliar codebases
Diff-based updates — After code changes, generate only the diffs to update an LLM's understanding
AST signatures — Extract function/class signatures for token-efficient structural understanding
Cross-project research — Quickly package a dependency's source for analysis

Core Workflow

1. Quick Context (whole project)

context-builder -d /path/to/project -y -o context.md

-y skips confirmation prompts (recommended for agent workflows when the path is explicitly scoped)
Output includes: header → file tree → files sorted by relevance (config → source → tests → docs)

2. Scoped Context (specific file types)

context-builder -d /path/to/project -f rs,toml -i docs,assets -y -o context.md

-f rs,toml includes only Rust and TOML files
-i docs,assets excludes directories by name

3. AST Signatures Mode (minimal tokens)

context-builder -d /path/to/project --signatures -f rs,ts,py -y -o signatures.md

Replaces full file content with extracted function/class signatures (~4K vs ~15K tokens per file)
Supports 8 languages: Rust, JavaScript (.js/.jsx), TypeScript (.ts/.tsx), Python, Go, Java, C, C++
Requires --features tree-sitter-all at install time

4. Signatures with Structural Summary

context-builder -d /path/to/project --signatures --structure -y -o context.md

--structure appends a count summary (e.g., "6 functions, 2 structs, 1 impl block")
Combine with --visibility public to show only public API surface

5. Budget-Constrained Context

context-builder -d /path/to/project --max-tokens 100000 -y -o context.md

Caps output to ~100K tokens (estimated)
Files are included in relevance order until budget is exhausted
Automatically warns if output exceeds 128K tokens
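
To illustrate what "included in relevance order until budget is exhausted" can mean, here is a small sketch of a greedy selection. It is not the tool's implementation: the four-characters-per-token estimator and the choice to stop at the first file that does not fit are assumptions made for the example, and the names `estimate_tokens` and `select_within_budget` are hypothetical.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Hypothetical estimator: roughly four characters per token."""
    return max(1, len(text) // 4)


def select_within_budget(files: List[Tuple[str, str]], max_tokens: int) -> List[str]:
    """Take (path, content) pairs, already in relevance order, until the budget runs out."""
    selected = []
    used = 0
    for path, content in files:
        cost = estimate_tokens(content)
        if used + cost > max_tokens:
            break  # assumption: stop at the first file that does not fit
        selected.append(path)
        used += cost
    return selected
```

Under this reading, a large file early in the ordering can prevent smaller, later files from being included, which is one reason to run --token-count first and narrow with -f or -i.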

6. Token Count Preview

context-builder -d /path/to/project --token-count

Prints estimated token count without generating output
Use this first to decide if filtering or --signatures is needed

7. Incremental Diffs

First, ensure context-builder.toml exists with:

timestamped_output = true
auto_diff = true

Then run twice:

# First run: baseline snapshot
context-builder -d /path/to/project -y

# After code changes: generates diff annotations
context-builder -d /path/to/project -y

For minimal output (diffs only, no full file bodies):

context-builder -d /path/to/project -y --diff-only

Smart Defaults

These behaviors require no configuration:

| Feature | Behavior |
| --- | --- |
| Auto-ignore | `node_modules`, `dist`, `build`, `__pycache__`, `.venv`, `vendor`, and 12 more heavy dirs are excluded at any depth |
| Self-exclusion | Output file, cache dir, and `context-builder.toml` are auto-excluded |
| .gitignore | Respected automatically when `.git` directory exists |
| Binary detection | Binary files are skipped via UTF-8 sniffing |
| File ordering | Config/docs first → source (entry points before helpers) → tests → build/CI → lockfiles |
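
For readers unfamiliar with UTF-8 sniffing, the following sketch shows one common way to do it: inspect a leading chunk of the file and treat it as binary if it does not decode. The NUL-byte shortcut and the function name `looks_binary` are assumptions for illustration; the tool's actual heuristic may differ.

```python
import codecs


def looks_binary(sample: bytes) -> bool:
    """Treat a leading chunk as binary if it has NUL bytes or is not valid UTF-8."""
    if b"\x00" in sample:
        return True
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the chunk boundary.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return True
    return False
```

The incremental decoder matters when only the first few kilobytes are read: a plain `bytes.decode` would reject valid text whose last character happens to be split by the read boundary.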

CLI Reference (Agent-Relevant Flags)

| Flag | Purpose | Agent Guidance |
| --- | --- | --- |
| `-d <PATH>` | Input directory | Always use absolute paths for reliability |
| `-o <FILE>` | Output path | Write to project `docs/` or `/tmp/` |
| `-f <EXT>` | Filter by extension | Comma-separated: `-f rs,toml,md` |
| `-i <NAME>` | Ignore dirs/files | Comma-separated: `-i tests,docs,assets` |
| `--max-tokens <N>` | Token budget cap | Use `100000` for most models, `200000` for Gemini |
| `--token-count` | Dry-run token estimate | Run first to check if filtering is needed |
| `-y` | Skip all prompts | Use only with explicit, scoped project paths |
| `--preview` | Show file tree only | Quick exploration without generating output |
| `--diff-only` | Output only diffs | Minimizes tokens for incremental updates |
| `--signatures` | AST signature extraction | Requires `tree-sitter-all` feature at install |
| `--structure` | Structural summary | Pair with `--signatures` for compact output |
| `--visibility <V>` | Filter by visibility | `all` (default), `public` (public API only) |
| `--truncate <MODE>` | Truncation strategy | `smart` (AST-aware) or `simple` |
| `--init` | Create config file | Auto-detects project file types |
| `--clear-cache` | Reset diff cache | Use if diff output seems stale |
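
An agent that calls the CLI programmatically may find it safer to assemble an argument list than to format a shell string. The sketch below uses only flags listed in the table; the function name `build_command` and its parameter names are hypothetical, and it covers just a subset of the flags.

```python
from typing import List, Optional, Sequence


def build_command(
    project_dir: str,
    output: Optional[str] = None,
    extensions: Sequence[str] = (),
    ignore: Sequence[str] = (),
    max_tokens: Optional[int] = None,
    signatures: bool = False,
    token_count: bool = False,
    assume_yes: bool = False,
) -> List[str]:
    """Assemble an argv list using only flags from the table above."""
    if not project_dir.startswith("/"):
        raise ValueError("pass an absolute project path")
    argv = ["context-builder", "-d", project_dir]
    if extensions:
        argv += ["-f", ",".join(extensions)]
    if ignore:
        argv += ["-i", ",".join(ignore)]
    if max_tokens is not None:
        argv += ["--max-tokens", str(max_tokens)]
    if signatures:
        argv.append("--signatures")
    if token_count:
        argv.append("--token-count")
    if assume_yes:
        argv.append("-y")
    if output is not None:
        argv += ["-o", output]
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)`, which avoids shell quoting issues with paths that contain spaces.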

Recipes

Recipe: Deep Think Code Review

Generate a scoped context file, then prompt an LLM for deep analysis:

# Step 1: Generate focused context
context-builder -d /path/to/project -f rs,toml --max-tokens 120000 -y -o docs/deep_think_context.md

# Step 2: Feed to LLM with a review prompt
# Attach docs/deep_think_context.md and ask for:
# - Architecture review
# - Bug hunting
# - Performance analysis

Recipe: API Surface Review (signatures only)

# Extract only public signatures — typically 80-90% fewer tokens than full source
context-builder -d /path/to/project --signatures --visibility public -f rs -y -o docs/api_surface.md

Recipe: Compare Two Versions

# Generate context for both versions
context-builder -d ./v1 -f py -y -o /tmp/v1_context.md
context-builder -d ./v2 -f py -y -o /tmp/v2_context.md

# Feed both to an LLM for comparative analysis

Recipe: Monorepo Slice

# Focus on a specific package within a monorepo
context-builder -d /path/to/monorepo/packages/core -f ts,tsx -i __tests__,__mocks__ -y -o core_context.md

Recipe: Quick Size Check Before Deciding Strategy

# Check if the project fits in context
context-builder -d /path/to/project --token-count

# If > 128K tokens, try signatures mode first:
context-builder -d /path/to/project --signatures --token-count

# Or scope it down:
context-builder -d /path/to/project -f rs,toml --max-tokens 100000 --token-count
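
The decision in this recipe can be written down as a small function. This is only a sketch: it takes the two estimates as plain integers (how they are read from the --token-count output is left out, since that output format is not described here), and the name `choose_strategy` and the default limit of 128,000 are assumptions taken from the comment in the recipe.

```python
def choose_strategy(full_tokens: int, signature_tokens: int, limit: int = 128_000) -> str:
    """Pick the least lossy mode whose estimated size fits within the limit."""
    if full_tokens <= limit:
        return "full"  # whole project fits; no filtering needed
    if signature_tokens <= limit:
        return "signatures"  # fall back to --signatures
    return "scoped"  # narrow with -f / -i and cap with --max-tokens
```

For example, a project estimated at 300,000 tokens in full and 60,000 in signatures mode would be handled with --signatures.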

Configuration File (Optional)

Create context-builder.toml in the project root for persistent settings:

output = "docs/context.md"
output_folder = "docs"
filter = ["rs", "toml"]
ignore = ["target", "benches"]
timestamped_output = true
auto_diff = true
max_tokens = 120000
signatures = true
structure = true
visibility = "public"

Initialize one automatically with context-builder --init.

Output Format

The generated markdown follows this structure:

# Directory Structure Report
[metadata: project name, filters, content hash]

## File Tree
[visual tree of included files]

## Files
### File: src/main.rs
[code block with file contents, syntax-highlighted by extension]

### File: src/lib.rs
...

Files appear in relevance order (not alphabetical), prioritizing config and entry points so LLMs build understanding faster.

When --signatures is active, file contents are replaced with extracted signatures:

### File: src/lib.rs
```rust
pub fn run_with_args(args: Args, config: Config, prompter: &dyn Prompter) -> Result<()>
pub fn generate_markdown_with_diff(...) -> Result<String>
```
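
Given the layout described in this section, a consumer can split a generated report back into per-file sections by looking for the "### File: " headers. The sketch below assumes those headers appear exactly as shown and that file bodies are wrapped in triple-backtick fences; the function name `split_files` is hypothetical, and a source file that itself contains fence lines would confuse this simple fence tracking.

```python
from typing import Dict, List, Optional

FILE_HEADER = "### File: "
FENCE = "`" * 3  # triple backtick, built this way to keep the example readable


def split_files(report: str) -> Dict[str, str]:
    """Map each path after a '### File: ' header to the text that follows it."""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    in_fence = False
    for line in report.splitlines():
        if line.startswith(FENCE):
            in_fence = not in_fence
        # Only treat the header as a boundary when it is outside a code fence.
        if not in_fence and line.startswith(FILE_HEADER):
            current = line[len(FILE_HEADER):].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {path: "\n".join(body).strip() for path, body in sections.items()}
```

This can be useful for spot-checking a single file's section for secrets before the report is shared.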

Error Handling

If context-builder is not installed, install with cargo install context-builder --features tree-sitter-all
If --signatures shows no output for a file, the language may not be supported or the feature was not enabled at install
If output exceeds token limits, add --max-tokens or narrow with -f / -i, or use --signatures
If the project has no .git directory, auto-ignores still protect against dependency flooding
Use --clear-cache if diff output seems stale or incorrect
