MidOS Memory Cascade

Automation

Auto-escalating multi-tier memory search that cascades from in-memory cache through SQLite, grep, and LanceDB vector search to find the best answer with minimal latency.

Install

openclaw skills install midos-memory-cascade

MidOS Memory Cascade

A self-tuning, auto-escalating search engine that tries each memory tier from fastest to slowest, stopping as soon as it finds a high-confidence answer.

What It Does

Instead of the agent deciding which storage layer to query, the cascade tries each tier automatically:

TierStorageLatencyStrategy
T0In-memory session cache<1msExact + fuzzy key match
T1JSON state files<5msFilename + key match
T2SQLite (pipeline_synergy.db)<5msStructured SQL LIKE
T3SQLite FTS5<1msFull-text keyword on 22K rows
T4Grep over 46K chunks~3sBrute-force ripgrep fallback
T5LanceDB keyword (BM25)slow670K vector rows, no embeddings
T5bLanceDB semantic3–30sEmbedding similarity, last resort

Question routing: Queries starting with how/what/why/etc. skip keyword tiers and route directly to semantic search.

Self-learning: The cascade records which tier resolves each query. After enough history, evolve() learns shortcuts (skip directly to the winning tier) and marks consistently-empty tiers for skip.

Usage

Python API

from tools.memory.memory_cascade import recall, store

# Search across all tiers
result = recall("adaptive alpha reranking")
# → {"answer": {...}, "tier": "T5:lancedb", "latency_ms": 340, "confidence": 0.87}

# Write to the right storage automatically
store("pattern", content="...", tags=["ml", "reranking"])

CLI

# Search
python memory_cascade.py recall "query here"

# View tier resolution stats
python memory_cascade.py stats

# Run self-evolution (learn shortcuts + tier skips)
python memory_cascade.py evolve

recall() Options

recall(
    query: str,
    min_confidence: float = 0.5,  # stop escalating at this threshold
    max_tier: int = 6             # 0=T0 only, 6=all tiers
)

Returns:

{
  "answer": { "source": "...", "text": "..." },
  "confidence": 0.87,
  "latency_ms": 340.2,
  "tiers_tried": 3,
  "resolved_at": "T5:lancedb",
  "shortcut": null,
  "question_routed": false,
  "escalation": [...]
}

Requirements

  • Python 3.10+ (stdlib only for core cascade logic)
  • Optional: hive_commons for LanceDB tiers (T5/T5b)
  • Optional: tools.memory.memory_router for store() routing

The cascade degrades gracefully — if LanceDB is unavailable, it stops at grep (T4). All stdlib tiers (T0–T4) work with zero dependencies.

Architecture Notes

  • Thread-safe: Session cache uses threading.Lock; stats writes use separate locks
  • Cross-process safe: JSONL writes use OS-level file locking (msvcrt on Windows, fcntl on Unix)
  • Confidence scoring: Term overlap × score × content richness → normalized 0–1
  • Stats persistence: knowledge/SYSTEM/cascade_stats.json accumulates hit rates per tier

Built with MidOS. 1 of 200+ skills. Full ecosystem at midos.dev/pro