Meaningful Chunker

v1.2.2

Graph-based code intelligence API. Query any indexed codebase for architecture understanding, debugging, refactor safety analysis, and design principle mappi...

by @daymyandogg

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for daymyandogg/meaningful-chunker.

Install the skill "Meaningful Chunker" (daymyandogg/meaningful-chunker) from ClawHub.
Skill page: https://clawhub.ai/daymyandogg/meaningful-chunker
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: CHUNKER_API_URL, CHUNKER_API_KEY
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install meaningful-chunker

ClawHub CLI

npx clawhub@latest install meaningful-chunker

Security Scan

Capability signals

Crypto · Can make purchases · Requires sensitive credentials

These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.

VirusTotal

Benign

OpenClaw

Benign

high confidence

✓

Purpose & Capability

Name/description match the declared requirements: it is an API-backed code-intelligence tool and it only asks for a base URL and API key (CHUNKER_API_URL, CHUNKER_API_KEY). No unrelated binaries, env vars, or installs are requested.

ℹ

Instruction Scope

Instructions are focused on scanning a repo and querying the API. Important operational detail: POST /scan clones or uploads repositories (or accepts a local project_path for self-hosted instances), which effectively transmits repository contents to the remote service. This is expected for the stated purpose but is a privacy/data-exfiltration consideration you must accept before scanning private code.

✓

Install Mechanism

No install spec and no code files — instruction-only skill. Nothing will be written to disk by the skill itself during install, which reduces supply-chain risk.

ℹ

Credentials

Only two env vars are required (API base URL and API key); this is proportionate. CHUNKER_API_KEY is the primary credential and grants the ability to scan and query codebases on the remote host, so treat it as highly sensitive.

✓

Persistence & Privilege

Skill is not always-enabled and uses normal model-invocation defaults. It does not request or imply modification of other skills or system-wide configuration.

Assessment

This skill appears internally consistent with a hosted code-analysis service. Before installing or using it:
(1) Do not upload private/proprietary code to the public hosted endpoint unless you trust the operator or have a written agreement/privacy policy; scanning a repo or local path transmits code to CHUNKER_API_URL.
(2) Prefer using a self-hosted instance if you need to analyze sensitive code.
(3) Treat CHUNKER_API_KEY as a secret: use least-privilege or ephemeral keys if available and rotate them regularly.
(4) Verify the CHUNKER_API_URL and registration endpoints (the README references a railway.app host) before entering keys, and test with non-sensitive public repos first.
(5) Review the service's data retention and deletion policy so cloned/processed code is not stored longer than you expect.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🧠 Clawdis

Env: CHUNKER_API_URL, CHUNKER_API_KEY

Primary env: CHUNKER_API_KEY

Tags: ai, architecture, code-analysis, debugging, developer-tools, latest

127 downloads

1 star

3 versions

Updated 1w ago

MIT-0

Meaningful Chunker — Code Intelligence API

A graph-based code analysis system that scans a codebase once, builds a semantic graph of its components and their relationships, then answers natural-language queries against that graph. It returns structured, ranked results with explanations, not raw file dumps.

⚠️ Required First Step

You MUST scan a codebase before using any query endpoints.

If you skip this, all queries will fail with a no_scan_loaded error.

Always start with POST /scan. See the Scanning a Codebase section below.

🔐 Authentication Required

All endpoints except /health and /status require an API key.

Include this header in every request:

x-api-key: YOUR_API_KEY

Without this header, requests return 401 Unauthorized. Set your key as CHUNKER_API_KEY in your agent's environment.

Setup

Two environment variables are required before using this skill:

CHUNKER_API_URL   — Base URL of the hosted API.
                    Example: https://meaningful-chunker-production.up.railway.app
CHUNKER_API_KEY   — Your API key. Get a free key instantly at:
                    https://meaningful-chunker-production.up.railway.app/register/free
                    (100 queries/month, resets each calendar month, no credit card required)
                    Upgrade to Pro (2,000/month) at /upgrade.

Full OpenAPI spec available at: $CHUNKER_API_URL/docs
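
As a rough illustration of how the two variables fit together, the sketch below builds an authenticated request without sending it. The helper name `build_request` is hypothetical; only the variable names, the `x-api-key` header, and the JSON content type come from this document.

```python
import json
import os
import urllib.request


def build_request(path, payload=None, env=None):
    """Build, but do not send, a request to the API (hypothetical helper)."""
    env = os.environ if env is None else env
    base_url = env["CHUNKER_API_URL"].rstrip("/")
    # The API key travels in the x-api-key header, as described above.
    headers = {"x-api-key": env["CHUNKER_API_KEY"]}
    data = None
    if payload is not None:
        # urllib issues a POST whenever a request body is attached.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(base_url + path, data=data, headers=headers)
```

To actually send the request, pass the result to `urllib.request.urlopen`. Keeping construction separate from sending makes it easy to inspect what would go over the wire before any code leaves your machine.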

When to Use This Skill

Use one of the four query endpoints depending on intent:

| Intent | Endpoint | Example query |
| --- | --- | --- |
| Understand what something does | `/query/architecture` | "What does the authentication module do?" |
| Trace a bug or failure | `/query/debug` | "Why is the login function failing?" |
| Assess safety of a change | `/query/refactor` | "Can I safely change the database connector?" |
| Understand design principles | `/query/philosophy` | "What principle governs error handling here?" |

All four endpoints accept the same request body and return the same response shape.
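
Because the four endpoints share one request and response shape, an agent can pick the path from a short intent label. A minimal sketch follows; the intent keys and the `endpoint_for` name are illustrative choices, while the paths are the four listed above.

```python
# Intent labels are hypothetical shorthand; the paths come from the table above.
QUERY_ENDPOINTS = {
    "architecture": "/query/architecture",
    "debug": "/query/debug",
    "refactor": "/query/refactor",
    "philosophy": "/query/philosophy",
}


def endpoint_for(intent):
    """Return the query path for an intent, or raise ValueError if unknown."""
    try:
        return QUERY_ENDPOINTS[intent]
    except KeyError:
        known = ", ".join(sorted(QUERY_ENDPOINTS))
        raise ValueError(f"unknown intent {intent!r}; expected one of: {known}") from None
```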

🚀 30-Second Quick Start

1. Scan a repo: POST /scan with {"repo_url": "https://github.com/owner/repo"}
2. Wait: poll GET /status until it returns "ready": true
3. Query: POST /query/architecture with {"query": "What does X do?"}

Scanning a Codebase

Scan a GitHub repo (recommended)

curl -s -X POST $CHUNKER_API_URL/scan \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CHUNKER_API_KEY" \
  -d '{"repo_url": "https://github.com/owner/repo"}'

The system clones the repo, builds the graph, and deletes the local clone automatically. Supports any public GitHub, GitLab, or git URL — including /tree/main branch URLs. Repos larger than 300MB are rejected with a clear error.

Scan a local path (self-hosted instances only)

curl -s -X POST $CHUNKER_API_URL/scan \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CHUNKER_API_KEY" \
  -d '{"project_path": "/path/to/project"}'
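
The two scan variants above differ only in the body key. The sketch below builds either body and refuses ambiguous input. Whether the server accepts both keys in one request is not stated here, so the exactly-one rule is an assumption of this sketch, and `scan_payload` is a hypothetical name.

```python
def scan_payload(repo_url=None, project_path=None):
    """Build the JSON body for POST /scan from exactly one source."""
    # Assumption: one scan targets one source, so reject zero or two.
    if (repo_url is None) == (project_path is None):
        raise ValueError("pass exactly one of repo_url or project_path")
    if repo_url is not None:
        return {"repo_url": repo_url}
    # project_path is documented for self-hosted instances only.
    return {"project_path": project_path}
```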

Check scan progress

curl $CHUNKER_API_URL/status

When "ready": true the graph is built and all query endpoints are available. Scanning typically takes 5–30 seconds depending on repo size.
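
Waiting for readiness amounts to a polling loop with a deadline. In the sketch below, `fetch_status` is a caller-supplied function that returns the parsed JSON from GET /status. The 120-second default and 2-second interval are illustrative values, not documented limits.

```python
import time


def wait_until_ready(fetch_status, timeout=120.0, interval=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the status dict has "ready": true, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status.get("ready") is True:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"scan not ready after {timeout} seconds")
        # Pause between polls so the status endpoint is not hammered.
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.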

How to Query

Request format

curl -s -X POST $CHUNKER_API_URL/query/architecture \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CHUNKER_API_KEY" \
  -d '{"query": "What does the payment processor do?"}'

Replace /query/architecture with the appropriate endpoint for your intent.

Optional: session continuity

Pass a session_id to maintain context across related queries. The system remembers what you asked recently and biases results toward the same area of the codebase.

curl -s -X POST $CHUNKER_API_URL/query/debug \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CHUNKER_API_KEY" \
  -d '{"query": "Why is the checkout flow failing?", "session_id": "my-investigation-001"}'
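
A small helper can keep `session_id` optional so the same code serves one-off and multi-step investigations. This is a sketch under assumptions: the document does not specify a session ID format, so the prefix-plus-random-suffix scheme and both function names are hypothetical.

```python
import uuid


def query_payload(query, session_id=None):
    """Build a query body; session_id is included only when provided."""
    body = {"query": query}
    if session_id is not None:
        body["session_id"] = session_id
    return body


def new_session_id(prefix="investigation"):
    """Generate an ID to reuse across related queries (format is an assumption)."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"
```

Generate one ID at the start of an investigation and pass it to every related query so that results stay biased toward the same area of the codebase.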

⚡ Quick Read Guide

For fastest results, read the response in this order:

1. answer_summary: one sentence telling you what the key component is and why it matters. Start here every time.
2. primary_path: the execution chain (A → B → C). Shows how things connect structurally.
3. CENTER tier chunks: the exact matches. These ARE the answer.
4. EPIPHANY tier chunks: critical cross-system connections. Don't skip these.
5. next_step: actionable follow-up already computed for you.

Everything else (BREAKTHROUGH, RELEVANT) is supporting context. Read it when you need depth, skip it when you don't.
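
The reading order above can be applied mechanically to a parsed response. The sketch below is a guess at the layout: it assumes chunks arrive in a list under a `chunks` key and that each chunk carries a `tier` field. Neither name is confirmed by this document, so both are parameters; check the OpenAPI spec at $CHUNKER_API_URL/docs for the real shape.

```python
def quick_read(response, chunks_key="chunks", tier_key="tier"):
    """Return the priority fields of a response in recommended reading order.

    chunks_key and tier_key are assumptions about the response layout.
    """
    chunks = response.get(chunks_key, [])

    def by_tier(tier):
        return [chunk for chunk in chunks if chunk.get(tier_key) == tier]

    # Dicts preserve insertion order, so iteration follows the reading order.
    return {
        "answer_summary": response.get("answer_summary"),
        "primary_path": response.get("primary_path"),
        "center": by_tier("CENTER"),
        "epiphany": by_tier("EPIPHANY"),
        "next_step": response.get("next_step"),
    }
```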

Understanding the Response

Top-level synthesis fields

answer_summary      — One-sentence synthesis. Start here. Tells you what the
                      key component is, its role, and why it matters.

system_explanation  — What the relevant subsystem does as a whole. Multiple
                      components explained together.

primary_path        — The execution chain: "A → B → C". Shows how components
                      connect structurally. Most useful for debugging.

query_profile       — Which intent was detected (architecture/debug/refactor_risk/
                      philosophy). Confirms the system understood your query.

confidence          — "high" / "medium" / "low". Trust signal before reading context.

next_step           — Actionable follow-up. What to look at next.

Debug-specific fields (present on `/query/debug` responses)

root_cause_analysis.execution_timeline    — Call chain with [FAILING_CHUNK] bracketed
root_cause_analysis.root_cause_hypotheses — Ranked suspects with confidence + check instructions
root_cause_analysis.top_suspect           — Single highest-suspicion chunk with specific check

Refactor-specific fields (present on `/query/refactor` responses)

structural_authority.change_risk   — "untouchable" / "sensitive" / "local"
structural_authority.blast_radius  — How many chunks/files break if this changes
top_risks                          — Top 5 most dangerous chunks in the whole codebase
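
One way to act on `change_risk` is to gate automated edits behind a risk ceiling. The sketch below assumes the three values order as local < sensitive < untouchable, which the names suggest but this document does not state. It also assumes `structural_authority` is a nested object, as the dotted field names imply.

```python
# Assumed ordering from least to most risky; not confirmed by the API docs.
RISK_ORDER = ["local", "sensitive", "untouchable"]


def is_change_allowed(response, max_risk="local"):
    """Return True if the reported change_risk does not exceed max_risk.

    Raises ValueError (from list.index) on an unrecognised risk value.
    """
    risk = response["structural_authority"]["change_risk"]
    return RISK_ORDER.index(risk) <= RISK_ORDER.index(max_risk)
```

With the default ceiling, only changes reported as "local" pass; anything "sensitive" or "untouchable" would be handed to a human reviewer.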

Context tiers

CENTER       (score=100)  — Exact match. This IS what you asked about.
EPIPHANY     (score≥99)   — Critical cross-component connection. Don't skip these.
BREAKTHROUGH (score≥82)   — Direct structural dependency. Important context.
RELEVANT     (score≥70)   — Meaningful but peripheral. Useful for broader understanding.

Each chunk includes:

name — component identifier
type — class / function / method / module_code / etc.
file — source file path
summary — one-line description
why_matched — how the system found this chunk
neighbors — adjacent components in the graph
explanation (CENTER/EPIPHANY only) — roles, primary reason, importance summary

Example: Architecture Query

curl -s -X POST $CHUNKER_API_URL/query/architecture \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CHUNKER_API_KEY" \
  -d '{"query": "What does the UserAuthenticator do?"}'

Read: answer_summary → primary_path → CENTER explanation.roles → BREAKTHROUGH neighbors

Example: Debug Query

curl -s -X POST $CHUNKER_API_URL/query/debug \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CHUNKER_API_KEY" \
  -d '{"query": "Why is the login flow not working?"}'

Read: root_cause_analysis.top_suspect → execution_timeline → root_cause_hypotheses[0]

Example: Refactor Safety Query

curl -s -X POST $CHUNKER_API_URL/query/refactor \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CHUNKER_API_KEY" \
  -d '{"query": "Can I safely change the database connection handler?"}'

Read: structural_authority.change_risk → blast_radius → advice → top_risks

Relevance Tiers — Scoring Reference

| Tier | Score | What it means |
| --- | --- | --- |
| CENTER | 100 | Exact match: this IS the answer |
| EPIPHANY | ≥99 | Critical cross-system connection |
| BREAKTHROUGH | ≥82 | Direct structural dependency |
| RELEVANT | ≥70 | Meaningful peripheral context |

Results capped at: CENTER×3, EPIPHANY×5, BREAKTHROUGH×6, RELEVANT×5. Max 19 chunks per response.
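
The thresholds and caps can be read as a simple classifier, sketched below. This is one interpretation of the published numbers, not the service's ranking code; the real system may use signals other than score to assign tiers. The function names are hypothetical.

```python
# Per-tier caps as documented: 3 + 5 + 6 + 5 = 19 chunks at most.
TIER_CAPS = {"CENTER": 3, "EPIPHANY": 5, "BREAKTHROUGH": 6, "RELEVANT": 5}


def tier_for(score):
    """Map a score to a tier name, or None when it falls below 70."""
    if score >= 100:
        return "CENTER"
    if score >= 99:
        return "EPIPHANY"
    if score >= 82:
        return "BREAKTHROUGH"
    if score >= 70:
        return "RELEVANT"
    return None


def apply_caps(scored_chunks):
    """Group (name, score) pairs by tier, highest scores first, up to each cap."""
    kept = {tier: [] for tier in TIER_CAPS}
    ranked = sorted(scored_chunks, key=lambda item: item[1], reverse=True)
    for name, score in ranked:
        tier = tier_for(score)
        if tier is not None and len(kept[tier]) < TIER_CAPS[tier]:
            kept[tier].append(name)
    return kept
```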

Checking API Health

These endpoints are always open — no API key required:

curl $CHUNKER_API_URL/health

Returns {"status": "ok"} when the system is up and a project is indexed.

curl $CHUNKER_API_URL/status

Returns current graph stats: chunk count, edge count, cluster count, shortcut count, last scan time.

⚠️ Common Issues

"no_scan_loaded" → You forgot to run /scan
401 Unauthorized → Missing x-api-key header
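
An agent can turn these two failure modes into a next action. The shape of the error body is not documented here, so the sketch below only searches the raw response text for the `no_scan_loaded` marker. `diagnose` is a hypothetical helper name.

```python
def diagnose(status_code, body_text):
    """Suggest a fix for the two documented failures, or return None."""
    if status_code == 401:
        return "Send the x-api-key header (value from CHUNKER_API_KEY)."
    # Assumption: the error name appears somewhere in the response text.
    if "no_scan_loaded" in body_text:
        return "Run POST /scan, then wait until GET /status reports ready."
    return None
```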

Notes

Queries are natural language — no special syntax required. "Why is X broken?" works as well as "explain the architecture of X."
The system is language-agnostic: Python, C++, JSON, Markdown, and plain text files are all indexed.
Session memory persists within an API session (same session_id). Cross-session long-term focus is tracked via insight memory — frequently queried components surface higher automatically.
Repo scans maintain memory during the active service lifecycle. The same repo URL will accumulate focus signals across queries until the service restarts.
Large repos (10k+ files) may take up to 2 minutes to scan. Poll /status to check progress.
