AOMS - Always-On Memory Service

Other

Always-On Memory Service — persistent 4-tier memory (episodic, semantic, procedural, working) with weighted retrieval, vector search, progressive disclosure (L0/L1/L2), and automatic weight decay. Use when you need persistent agent memory across sessions, context recall for tasks, knowledge graphs, learning from mistakes, or any long-term memory capability. Replaces flat-file memory logging with a real indexed memory service. Works with OpenClaw, Claude Code, Codex, or any agent that can call HTTP APIs.

Install

openclaw skills install aoms

AOMS — Always-On Memory Service

Persistent memory service for AI agents. Stores experiences, facts, and skills in JSONL files with weighted retrieval and optional vector search via ChromaDB + Ollama embeddings.

Install & Start

# Install from PyPI
pip install cortex-mem

# Start (foreground)
cortex-mem start --port 9100

# Start (background daemon)
cortex-mem start --daemon

# Check status
cortex-mem status

# Docker alternative
docker pull ghcr.io/dhawalc/cortex-mem:latest
docker run -p 9100:9100 -v aoms-data:/app/modules ghcr.io/dhawalc/cortex-mem

The service runs on http://localhost:9100. API docs at /docs.

Note: AOMS runs as a local HTTP service on your machine. It does not send data externally. Vector search requires a local Ollama instance (optional).

Core Concepts

Memory Tiers:

Tier	Stores	Example
`episodic`	Experiences, decisions, failures	"Deployed v2 — rollback needed due to missing migration"
`semantic`	Facts, relations, knowledge	"Project uses pnpm, not npm"
`procedural`	Skills, patterns, workflows	"To deploy: run migrations first, then build, then push"

Weighted Retrieval: Every entry has a weight (0.1–5.0). Important memories surface first. Weights increase when memories prove useful (/memory/weight) and decay over time (/memory/decay).

Progressive Disclosure (Cortex): Large documents are stored at 3 tiers — L0 (one-liner), L1 (summary), L2 (full text). Queries auto-escalate within a token budget.

API Quick Reference

Write Memory

curl -X POST http://localhost:9100/memory/episodic \
  -H "Content-Type: application/json" \
  -d '{
    "type": "experience",
    "payload": {
      "title": "Fixed auth bug",
      "outcome": "Token refresh was missing retry logic",
      "tags": ["auth", "bugfix"]
    },
    "weight": 1.3
  }'

Search Memory

# Keyword search
curl -X POST http://localhost:9100/memory/search \
  -H "Content-Type: application/json" \
  -d '{"query": "deployment", "limit": 5}'

# Filter by tier
curl -X POST http://localhost:9100/memory/search \
  -d '{"query": "auth", "tier": ["episodic", "procedural"], "limit": 10}'

Agent Recall (context injection)

Single endpoint to get relevant context for a task, formatted for prompt injection:

curl -X POST http://localhost:9100/recall \
  -H "Content-Type: application/json" \
  -d '{"task": "deploy the API", "token_budget": 500, "format": "markdown"}'

Returns pre-formatted context with tier headers. Inject directly into agent prompts.

Reinforce Memory

When a memory proves useful, boost its weight:

curl -X POST http://localhost:9100/memory/weight \
  -d '{"entry_id": "abc123", "tier": "episodic", "task_score": 0.9}'

Cortex Query (progressive disclosure)

curl -X POST http://localhost:9100/cortex/query \
  -d '{"query": "deployment process", "token_budget": 1000, "top_k": 3}'

Auto-escalates from L0 → L1 → L2 within the token budget.

Agent Integration Patterns

Pattern 1: Session Boot (recall context)

At session start, call /recall with the current task to inject relevant memory:

import httpx

resp = httpx.post("http://localhost:9100/recall", json={
    "task": "working on auth module",
    "token_budget": 500,
    "format": "markdown"
})
context = resp.json()["context"]
# Inject into system prompt or prepend to conversation

Pattern 2: Log Learnings (write on events)

After completing a task, fixing a bug, or learning something new:

httpx.post("http://localhost:9100/memory/episodic", json={
    "type": "experience",
    "payload": {
        "title": "pnpm not npm",
        "outcome": "Project uses pnpm workspaces. npm install fails.",
        "tags": ["build", "correction"]
    },
    "weight": 1.5
})

Pattern 3: Knowledge Graph (semantic facts)

Store structured facts as subject-predicate-object triples:

httpx.post("http://localhost:9100/memory/semantic", json={
    "type": "relation",
    "payload": {
        "subject": "auth-service",
        "predicate": "depends_on",
        "object": "redis",
        "confidence": 0.95
    }
})

Pattern 4: Reinforce on Success

After using a recalled memory successfully, boost its weight:

httpx.post("http://localhost:9100/memory/weight", json={
    "entry_id": recalled_id,
    "tier": "episodic",
    "task_score": 0.9  # >0.5 boosts, <0.5 decays
})

OpenClaw Integration

To use AOMS with OpenClaw, configure it manually:

1. Add to OpenClaw config

# In ~/.openclaw/config.yaml
memory:
  provider: cortex-mem
  url: http://localhost:9100

2. Session boot script

Add a boot script to your workspace (see references/openclaw-setup.md for a full example):

# boot_aoms.py — call at session start
import httpx, sys
try:
    r = httpx.post("http://localhost:9100/recall", json={
        "task": "session boot — what's recent and relevant",
        "token_budget": 300, "format": "markdown"
    }, timeout=5.0)
    if r.status_code == 200:
        print(r.json()["context"])
except Exception as e:
    print(f"AOMS unavailable: {e}", file=sys.stderr)

3. Optional: Workspace migration

If you have existing flat-file memory (MEMORY.md, daily logs), you can import it:

cortex-mem migrate ~/.openclaw/workspace

This is optional and explicit. Review what files will be parsed before running. The command reads Markdown files and creates structured memory entries — it does not modify or delete originals.

Helper Functions

from openclaw_integration import log_achievement, log_error, log_fact

await log_achievement("Shipped v2", "All tests passing, deployed to prod")
await log_error("Build failed", "Missing dependency: libpq-dev")
await log_fact("project", "uses", "PostgreSQL 16")

Maintenance

# Weight decay (old memories fade unless reinforced)
curl -X POST http://localhost:9100/memory/decay \
  -d '{"min_age_days": 30, "decay_rate": 0.995, "dry_run": true}'

# Consolidate similar memories
curl -X POST http://localhost:9100/memory/consolidate \
  -d '{"tier": "episodic", "min_age_days": 30, "dry_run": true}'

# Deduplication
curl -X POST http://localhost:9100/memory/deduplicate?tier=episodic&dry_run=true

# Stats
curl http://localhost:9100/stats

Full API Reference

See references/api-reference.md for all endpoints, request/response schemas, and advanced features (vector search, entity extraction, document ingestion).

Configuration

Default config is at service/config.yaml. Key settings:

service:
  port: 9100          # API port
  host: localhost      # Bind address (use 0.0.0.0 for Docker)

storage:
  root: .              # Where JSONL module files live

weights:
  decay_rate: 0.995    # Daily decay multiplier
  min_weight: 0.1      # Floor
  max_weight: 5.0      # Ceiling

Set CORTEX_MEM_ROOT env var to override the storage root.