Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

TurboQuant Memory

v2.0.0

Compress and accelerate vector search in memory/RAG systems using TurboQuant (ICLR 2026) — near-optimal vector quantization with 5-8x compression and 98%+ search accuracy.

by SunnyZhou (@sunnyztj)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for sunnyztj/turboquant-memory.

Prompt preview: Install & Setup
Install the skill "TurboQuant Memory" (sunnyztj/turboquant-memory) from ClawHub.
Skill page: https://clawhub.ai/sunnyztj/turboquant-memory
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install turboquant-memory

ClawHub CLI


npx clawhub@latest install turboquant-memory
Security Scan

VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)

Purpose & Capability
Name, description, SKILL.md and the scripts all align: this is a local numpy-based implementation of a vector quantizer that detects SQLite embedding tables, quantizes embeddings, and writes results to a new 'quantized_embeddings' table. The required capabilities (none) are proportionate to the stated purpose.
Instruction Scope
The runtime instructions and scripts scan arbitrary SQLite databases (auto-detecting tables), read embedding and text columns, and create and insert into a quantized_embeddings table using INSERT OR REPLACE. This is expected for a migration/quantization tool, but it means the skill will modify user databases. Also, SKILL.md and the references disallow SRHT as lossy, while some code and imports reference SRHTRotate/SRHT; this inconsistency between the written docs and the code could cause unexpected behavior, or indicates that the two are out of sync.
Install Mechanism
There is no install spec (instruction-only with bundled scripts). No network downloads, no external package installs beyond numpy — low install risk.
Credentials
The skill requests no environment variables or credentials. That is proportionate. However, the scripts require read/write access to whatever SQLite DB path you point them at; they will create/modify tables and can overwrite entries (INSERT OR REPLACE).
Persistence & Privilege
The skill is not always:true and does not ask for persistent system privileges. It will persist data into the target SQLite DB (creates quantized_embeddings and writes records). That behavior is expected but impactful — backups are recommended before running migrations.
What to consider before installing
This skill mostly does what it says: local numpy-based quantization and migration of embeddings into a new SQLite table. Before running it on important data:

  1. Back up any database you pass to migrate: the script creates a quantized_embeddings table and uses INSERT OR REPLACE, which can overwrite rows.
  2. Review the code locally. There are inconsistencies (SKILL.md and the references describe blockwise Hadamard and warn against SRHT, but some modules/imports mention SRHTRotate or SRHT; some functions use type names like List/Dict without importing typing) that may cause runtime errors or indicate the docs and code are out of sync.
  3. Run the bundled tests (python3 scripts/turboquant.py) and validate on a small copy of your data (python3 scripts/validate.py --db /path/to/copy.db --auto-detect) to confirm behavior and metrics.
  4. Prefer running migrate on a copied DB or a staging environment.

If you are not comfortable auditing Python code yourself, ask the author to clarify the SRHT vs. blockwise Hadamard mismatch and to confirm non-destructive migration and deterministic seeds.
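If you want a scriptable backup before trying migrate, Python's sqlite3 module provides an online backup API. A minimal sketch, using placeholder file paths:

import sqlite3

# Snapshot the live database before running --migrate.
# "memory.db" and "memory.backup.db" are placeholder paths.
src = sqlite3.connect("memory.db")
dst = sqlite3.connect("memory.backup.db")
with dst:
    src.backup(dst)  # consistent copy even if the source DB is in use
src.close()
dst.close()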

Like a lobster shell, security has layers — review code before you run it.

latest: vk97af7vp0y2nd7p9c4h18ys3rs83pwtb
147 downloads · 0 stars · 3 versions
Updated 1 mo ago
v2.0.0 · MIT-0

TurboQuant Memory

Compress embedding vectors 5-8x with 98%+ search accuracy using TurboQuant (Google, ICLR 2026).

Quick Start

1. Run tests

python3 scripts/turboquant.py

15 built-in tests: FWHT correctness, MSE distortion, IP correlation, recall, compression ratio, determinism.

2. Validate on your data

python3 scripts/validate.py --db /path/to/memory.sqlite --auto-detect --bits 5

Auto-detects sqlite-vec vec0 tables, analyzes distribution, reports quantization quality and recall.
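For reference, schema auto-detection of this kind can be done by inspecting sqlite_master. The sketch below illustrates the idea only; it is not the script's actual logic, and the matching heuristics are assumptions:

import sqlite3

con = sqlite3.connect("/path/to/memory.sqlite")
rows = con.execute("SELECT name, sql FROM sqlite_master WHERE type='table'").fetchall()
# Heuristic: sqlite-vec virtual tables are declared with "USING vec0(...)",
# and conventional embedding tables usually mention "embedding" in their DDL.
candidates = [name for name, ddl in rows
              if ddl and ("vec0" in ddl or "embedding" in ddl.lower())]
print(candidates)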

3. Quantize a memory database

python3 scripts/memory_quantize.py --db /path/to/memory.db --bits 5 --benchmark
python3 scripts/memory_quantize.py --db /path/to/memory.db --bits 5 --migrate
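According to the security scan, --migrate creates a quantized_embeddings table and writes rows with INSERT OR REPLACE, which silently overwrites existing rows with the same key. A hedged sketch of that write pattern (the column layout is an assumption based on the search example in step 4, not a documented schema):

import sqlite3

con = sqlite3.connect("/path/to/memory.db")
con.execute("""CREATE TABLE IF NOT EXISTS quantized_embeddings (
    id INTEGER PRIMARY KEY,
    indices BLOB,  -- packed codebook indices (assumed column)
    norm REAL,     -- per-vector norm (assumed column)
    scale REAL     -- per-vector scale (assumed column)
)""")
# INSERT OR REPLACE is the destructive step: a row with an existing id is replaced.
con.execute("INSERT OR REPLACE INTO quantized_embeddings VALUES (?, ?, ?, ?)",
            (1, b"\x00", 1.0, 1.0))
con.commit()

This is why the scan recommends backups: re-running a migration can overwrite previously quantized rows.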

4. Integrate into code

import numpy as np
from turboquant import TurboQuantMSE

# (embedding_vector, query, and database below are placeholders for your data)

# Initialize (deterministic: same seed = same quantization)
tq = TurboQuantMSE(dim=3072, bits=5)

# Quantize for storage
stored = tq.quantize(embedding_vector)  # float32 → compressed

# Reconstruct
reconstructed = tq.dequantize(stored)   # compressed → float32

# Search: the query stays float32, the database stays quantized.
# Rotate the query once, then score each doc by an approximate inner
# product against its dequantized codewords.
q_rot = tq.rotation.apply(query)
for doc in database:
    score = doc['norm'] * doc['scale'] * np.dot(q_rot, tq.codebook[doc['indices']])

Recommended Configuration

Preset        Mode  Bits  R@1   Compression  Use Case
Default       MSE   5     98%   6.4x         Most memory/RAG search
Conservative  MSE   6     98%+  5.3x         High-fidelity retrieval
Aggressive    MSE   4     92%   8.0x         Large-scale, storage-constrained

Parameters

Parameter  Default      Description
dim        auto-detect  Embedding dimension (768, 1536, 3072, etc.)
bits       5            Bits per coordinate. See table above.
seed       42           Rotation seed. Same seed = reproducible quantization.
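
The Quick Start constructor only passes dim and bits. Assuming seed is also a constructor keyword (inferred from the parameter table, not shown in the docs), a non-default configuration would look like:

from turboquant import TurboQuantMSE

# Conservative preset on 1536-dim embeddings with a custom rotation seed.
# The seed keyword is an assumption based on the parameter table above.
tq = TurboQuantMSE(dim=1536, bits=6, seed=7)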

Algorithm

Blockwise Hadamard Rotation → Lloyd-Max Scalar Quantization

  1. Split vector into power-of-2 blocks (e.g., 3072 = 3 × 1024)
  2. Per block: random sign flip + Fast Walsh-Hadamard Transform (fully invertible)
  3. Per-vector scale normalization
  4. Lloyd-Max optimal scalar quantizer per coordinate (precomputed codebook for N(0,1))
  5. Pack indices into compact bit representation
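
As a concrete illustration of steps 1-2, here is a minimal numpy sketch of a blockwise sign-flip + FWHT rotation and its inverse. It demonstrates the technique only; the class and function names are invented, not the skill's actual code:

import numpy as np

def fwht(x):
    # Orthonormal Fast Walsh-Hadamard Transform; len(x) must be a power of 2.
    y = np.asarray(x, dtype=np.float64).copy()
    n, h = y.shape[0], 1
    while h < n:
        y = y.reshape(-1, 2, h)
        a, b = y[:, 0, :].copy(), y[:, 1, :].copy()
        y[:, 0, :], y[:, 1, :] = a + b, a - b
        y = y.reshape(n)
        h *= 2
    return y / np.sqrt(n)  # orthonormal scaling makes the transform self-inverse

class BlockHadamardRotation:
    def __init__(self, dim, block=1024, seed=42):
        assert dim % block == 0 and block & (block - 1) == 0
        self.block = block
        # Seeded random diagonal sign matrix: same seed = same rotation.
        self.signs = np.random.default_rng(seed).choice([-1.0, 1.0], size=dim)

    def apply(self, x):
        # Steps 1-2: sign flip, then FWHT each power-of-2 block.
        blocks = (x * self.signs).reshape(-1, self.block)
        return np.concatenate([fwht(b) for b in blocks])

    def invert(self, y):
        # The normalized Hadamard matrix is symmetric and orthonormal,
        # so applying fwht again undoes it; then undo the sign flips.
        blocks = np.asarray(y).reshape(-1, self.block)
        return np.concatenate([fwht(b) for b in blocks]) * self.signs

rot = BlockHadamardRotation(dim=3072)            # 3072 = 3 blocks of 1024
v = np.random.default_rng(0).standard_normal(3072)
assert np.allclose(rot.invert(rot.apply(v)), v)  # zero information loss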

Key properties:

  • Data-oblivious: no training or calibration needed
  • Fully invertible: zero information loss from rotation
  • Near-optimal: within 2.7x of Shannon information-theoretic lower bound
  • Deterministic: same seed = same output

See references/algorithm.md for full details.
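
Step 4 relies on a Lloyd-Max codebook precomputed for N(0,1): after rotation and scaling, coordinates are approximately Gaussian, so one scalar codebook serves every coordinate. A small sketch of building such a codebook with Lloyd's algorithm on Gaussian samples (an illustration, not the bundled implementation):

import numpy as np

def lloyd_max_codebook(bits, iters=100, n=200_000, seed=0):
    # Fit 2**bits centroids to samples from N(0,1).
    rng = np.random.default_rng(seed)
    s = np.sort(rng.standard_normal(n))
    k = 2 ** bits
    c = np.quantile(s, (np.arange(k) + 0.5) / k)  # quantile initialization
    for _ in range(iters):
        edges = (c[:-1] + c[1:]) / 2              # nearest-centroid boundaries
        idx = np.searchsorted(edges, s)
        c = np.array([s[idx == j].mean() if np.any(idx == j) else c[j]
                      for j in range(k)])
    return c

def encode(x, codebook):
    # Index of the nearest centroid per coordinate (step 5 packs these bits).
    edges = (codebook[:-1] + codebook[1:]) / 2
    return np.searchsorted(edges, x)

cb = lloyd_max_codebook(bits=5)                   # 32 levels per coordinate
x = np.random.default_rng(1).standard_normal(3072)
print(np.mean((cb[encode(x, cb)] - x) ** 2))      # per-coordinate MSE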

Benchmark (Gemini embedding-001, 3072-dim, 112 vectors)

Bits  MSE     Cosine  R@1   R@5  R@10  Bytes/vec  Compression
3     1.1e-5  0.982   88%   90%  91%   1,160      10.6x
4     3.2e-6  0.995   92%   93%  93%   1,544      8.0x
5     8.2e-7  0.999   98%   96%  96%   1,928      6.4x
6     2.2e-7  1.000   96%   98%  98%   2,312      5.3x
7     8e-8    1.000   100%  98%  99%   2,696      4.6x
8     3e-8    1.000   98%   98%  99%   3,080      4.0x
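
The Bytes/vec column is consistent with packed indices plus a small per-vector header. Assuming the header is the norm and scale from the search example stored as two float32s (an inference; the actual layout is not documented):

dim, bits = 3072, 5
packed = dim * bits // 8            # 1920 bytes of packed indices
header = 2 * 4                      # norm + scale as float32 (assumed)
print(packed + header)              # 1928, matching the 5-bit row above
print(dim * 4 / (packed + header))  # ~6.37x vs float32, i.e. the 6.4x column

The same arithmetic reproduces every row of the table, from 1,160 bytes at 3 bits to 3,080 bytes at 8 bits.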

Compatibility

  • Python 3.9+, numpy only (no scipy, no GPU)
  • Any embedding dimension ≥ 128
  • Any embedding model (Gemini, OpenAI, Cohere, sentence-transformers, etc.)
  • SQLite / sqlite-vec vec0 tables (auto-detected)

