Install
openclaw skills install turboquant-memoryCompress and accelerate vector search in memory/RAG systems using TurboQuant (ICLR 2026) — near-optimal vector quantization with 5-8x compression and 98%+ search accuracy. Uses blockwise Hadamard rotation + Lloyd-Max scalar quantization. Use when: (1) optimizing embedding storage size, (2) speeding up semantic search, (3) user mentions "compress embeddings", "quantize vectors", "memory optimization", "faster search", "TurboQuant", "vector compression", or "embedding compression", (4) reducing memory footprint of RAG systems. Works with any embedding model (Gemini, OpenAI, Cohere, local) and any dimension ≥ 128. No GPU required. numpy only.
openclaw skills install turboquant-memoryCompress embedding vectors 5-8x with 98%+ search accuracy using TurboQuant (Google, ICLR 2026).
python3 scripts/turboquant.py
15 built-in tests: FWHT correctness, MSE distortion, IP correlation, recall, compression ratio, determinism.
python3 scripts/validate.py --db /path/to/memory.sqlite --auto-detect --bits 5
Auto-detects sqlite-vec vec0 tables, analyzes distribution, reports quantization quality and recall.
python3 scripts/memory_quantize.py --db /path/to/memory.db --bits 5 --benchmark
python3 scripts/memory_quantize.py --db /path/to/memory.db --bits 5 --migrate
from turboquant import TurboQuantMSE
# Initialize (deterministic — same seed = same quantization)
tq = TurboQuantMSE(dim=3072, bits=5)
# Quantize for storage
stored = tq.quantize(embedding_vector) # float32 → compressed
# Reconstruct
reconstructed = tq.dequantize(stored) # compressed → float32
# Search: query stays float32, database is quantized
q_rot = tq.rotation.apply(query)
for doc in database:
score = doc['norm'] * doc['scale'] * np.dot(q_rot, tq.codebook[doc['indices']])
| Preset | Mode | Bits | R@1 | Compression | Use Case |
|---|---|---|---|---|---|
| Default | MSE | 5 | 98% | 6.4x | Most memory/RAG search |
| Conservative | MSE | 6 | 98%+ | 5.3x | High-fidelity retrieval |
| Aggressive | MSE | 4 | 92% | 8.0x | Large-scale, storage-constrained |
| Parameter | Default | Description |
|---|---|---|
dim | auto-detect | Embedding dimension (768, 1536, 3072, etc.) |
bits | 5 | Bits per coordinate. See table above. |
seed | 42 | Rotation seed. Same seed = reproducible quantization. |
Blockwise Hadamard Rotation → Lloyd-Max Scalar Quantization
Key properties:
See references/algorithm.md for full details.
| Bits | MSE | Cosine | R@1 | R@5 | R@10 | Bytes/vec | Compression |
|---|---|---|---|---|---|---|---|
| 3 | 1.1e-5 | 0.982 | 88% | 90% | 91% | 1,160 | 10.6x |
| 4 | 3.2e-6 | 0.995 | 92% | 93% | 93% | 1,544 | 8.0x |
| 5 | 8.2e-7 | 0.999 | 98% | 96% | 96% | 1,928 | 6.4x |
| 6 | 2.2e-7 | 1.000 | 96% | 98% | 98% | 2,312 | 5.3x |
| 7 | 8e-8 | 1.000 | 100% | 98% | 99% | 2,696 | 4.6x |
| 8 | 3e-8 | 1.000 | 98% | 98% | 99% | 3,080 | 4.0x |
vec0 tables (auto-detected)