Install

openclaw skills install ultramemory

Structured AI agent memory with temporal versioning, relational tracking, and semantic search: it extracts atomic facts from text, detects relations to existing knowledge (updates, contradicts, extends), embeds facts for semantic search, and auto-builds entity profiles. Use when storing facts, recalling context, searching past conversations, tracking how knowledge changed over time, or building entity profiles. Replaces flat MEMORY.md with atomic fact extraction, update/contradict/extend relations, and hybrid semantic+temporal search. Use for any "remember this", "what do I know about X", "when did this change", or cross-session knowledge retrieval.
PyPI: ultramemory | GitHub: jared-goering/ultramemory
# Install (creates venv automatically on first run)
pip install ultramemory
# Or from source
git clone https://github.com/jared-goering/ultramemory.git
cd ultramemory && pip install -e .
Requirements:
Environment:
export ANTHROPIC_API_KEY="sk-ant-..." # or OPENAI_API_KEY
export ULTRAMEMORY_DB="./memory.db" # default: memory.db in current dir
from ultramemory import MemoryEngine
engine = MemoryEngine(db_path="memory.db")
# Ingest text (extracts atomic facts, detects relations, builds profiles)
results = engine.ingest("Jared moved from Bel Aire to Wichita. The baby is due in July.")
# Search
matches = engine.search("Where does Jared live?", top_k=5)
# Recall (compact context block for agent prompts)
context = engine.recall("current projects and priorities", top_k=5)
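A common pattern is to prepend the recall block to whatever prompt you send the agent. A minimal sketch, assuming `recall` returns a plain-text context block as the comment above suggests (the function below is illustrative, not part of ultramemory):

```python
# Sketch: hydrate an agent prompt with recalled context.
# `recall` is any callable returning a plain-text context block,
# e.g. MemoryEngine.recall from the quickstart above (return type assumed).

def build_prompt(recall, user_message: str, top_k: int = 5) -> str:
    context = recall(user_message, top_k=top_k)
    if not context:
        return user_message  # nothing relevant in memory; send the message as-is
    return f"Relevant memories:\n{context}\n\nUser: {user_message}"
```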
The scripts/memory.sh wrapper handles venv activation and API key loading.
bash scripts/memory.sh ingest \
"Jared moved to Wichita. The baby is now due in August, not July." \
--session "main-2026-03-22" --agent kit
Categories: person, preference, project, decision, event, insight
Relations auto-detected: updates (supersedes old fact), contradicts, extends, supports, derives
When a memory updates an existing one, the old memory is marked superseded and the new one gets an incremented version.
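The supersede mechanics can be sketched with a toy in-memory model. Field names (`version`, `superseded`) are illustrative, not ultramemory's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Memory:
    text: str
    version: int = 1
    superseded: bool = False

class Store:
    """Toy model of the supersede/version behavior described above."""
    def __init__(self):
        self.memories = []

    def add(self, text: str, updates: Optional[Memory] = None) -> Memory:
        version = 1
        if updates is not None:
            updates.superseded = True      # old fact stays queryable, marked stale
            version = updates.version + 1  # new fact continues the lineage
        m = Memory(text, version=version)
        self.memories.append(m)
        return m
```

Note that the superseded fact is never deleted; that is what makes the time-travel queries below possible.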
Compact context block for injecting into agent prompts:
bash scripts/memory.sh recall "Where does Jared live?" --top-k 5
Output:
[person] Jared moved to Wichita. (v2, current, 89% match)
-> updates: Jared lives in Bel Aire, KS.
[project] Jared teaches Human-Centric Design at WSU. (v1, current, 72% match)
bash scripts/memory.sh search "baby due date" --top-k 10
# Include superseded memories:
bash scripts/memory.sh search "baby due date" --all
# Time travel: what did we know on March 1?
bash scripts/memory.sh search "baby due date" --as-of 2026-03-01
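The --as-of filter can be thought of as selecting, within a fact lineage, the highest-versioned fact whose timestamp precedes the cutoff. A sketch of that logic (not ultramemory's actual implementation):

```python
from datetime import date

def as_of(facts, cutoff):
    """facts: (created, version, text) tuples for one lineage.
    Return the text that was current on `cutoff`, or None if the
    lineage did not exist yet."""
    visible = [f for f in facts if f[0] <= cutoff]
    if not visible:
        return None
    return max(visible, key=lambda f: f[1])[2]  # highest visible version wins
```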
bash scripts/memory.sh entities # List all known entities
bash scripts/memory.sh history Jared # Version timeline
bash scripts/memory.sh profile Jared # Auto-built profile
bash scripts/memory.sh stats # Counts, categories
At the start of any session, hydrate context:
bash scripts/startup-recall.sh <agent-id>
After meaningful conversations, pass the text directly:
bash scripts/memory.sh ingest "User decided to use React for the frontend. Budget is $50k." --session $SESSION_KEY --agent $AGENT_ID
For multi-agent setups, run the API server (requires separate install from PyPI):
pip install ultramemory
python3 -m uvicorn ultramemory.server:app --port 8642 --host 127.0.0.1
Endpoints: POST /api/ingest, POST /api/search, POST /api/recall, GET /api/stats
For continuous ingestion from session files, see the GitHub repo which includes auto_ingest.py and live_ingest.sh scripts.
You'll outgrow a flat file fast. But you also can't replace it entirely. We tried.
At 18,000+ memories, search results get noisy. The DB is great at answering "what happened Tuesday?" but terrible as a session primer. Meanwhile, MEMORY.md is perfect for "who am I, who's my human, what are we working on" but can't hold 18K facts in 2K tokens.
The architecture we landed on uses three layers:
Layer 1: MEMORY.md (always loaded, zero cost)
Curated essentials under 2K tokens. Loaded every session, no API calls, no latency. Contains identity, active projects, key preferences. Think of it as working memory.

Layer 2: Ultramemory plugin (opportunistic injection)
When a message arrives, the plugin searches the DB and injects relevant memories if they score above a similarity threshold (we use 0.55). The agent never explicitly asks for this; it just gets richer context when the DB has something relevant.

Layer 3: Ultramemory direct (precision recall)
The agent explicitly searches when it needs specifics: "What was the benchmark result?" or "When did we decide to drop NYC?" This is the full 18K+ memory DB with semantic search, temporal filtering, and entity profiles.
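The Layer 2 threshold gate is simple enough to sketch. Here `search` stands in for any similarity search returning (score, text) pairs; 0.55 is the threshold quoted above, and the function itself is illustrative rather than the plugin's actual code:

```python
# Sketch of opportunistic injection: search on every incoming message,
# inject only hits above the similarity threshold, stay silent otherwise.

def opportunistic_context(search, message, threshold=0.55, top_k=5):
    hits = search(message, top_k=top_k)
    relevant = [text for score, text in hits if score >= threshold]
    return "\n".join(relevant) if relevant else None  # None -> inject nothing
```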
MEMORY.md is the backup and the bootstrap. Ultramemory is the brain. You need both.
80% accuracy on LongMemEval_s (production-relevant questions). 32ms median search latency.