Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

SEEM

v0.1.0

Advanced episodic memory system for multi-turn conversations. Store and retrieve structured conversation memories with fact graph, PPR retrieval, and three r...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for ryantoleco/seem-skill.

Prompt preview: Install & Setup
Install the skill "SEEM" (ryantoleco/seem-skill) from ClawHub.
Skill page: https://clawhub.ai/ryantoleco/seem-skill
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: LLM_API_KEY, MM_ENCODER_API_KEY
Required binaries: python3, pip
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install seem-skill

ClawHub CLI


npx clawhub@latest install seem-skill
Security Scan

VirusTotal: Suspicious
OpenClaw: Benign (medium confidence)
Purpose & Capability
Name/description (episodic memory, retrieval, embeddings) align with what the code asks for: python, pip, an LLM API key and an embedding API key. The required env vars (LLM_API_KEY, MM_ENCODER_API_KEY) and the Python modules listed in requirements.txt (openai, numpy, networkx, rank-bm25, etc.) are appropriate for the described functionality. Note: default base_url values in the config point at third-party endpoints (e.g., api.deepseek.com, api.siliconflow.cn); these are configurable but should be verified.
Instruction Scope
The SKILL.md and scripts direct the agent to send conversation text (and optionally images) to external LLM and embedding services and to run local CLI scripts that read/write the skill's local data. The CLI falls back to reading a local config.py if env vars are not set. There are no instructions to read unrelated system files or to transmit data to unexpected endpoints beyond the configured LLM/embed base_urls, but you should assume all stored conversation data and images will be transmitted to whatever endpoints are configured.
Install Mechanism
This skill is delivered with source files and a requirements.txt but has no automated install spec. That means installing/running it will typically require pip installing the listed packages. Dependencies are common for this domain and there are no obvious remote-download-or-extract steps in the manifest. Still, because source was published with no homepage and unknown owner, install from a controlled environment and inspect dependencies before pip installing.
Credentials
The skill requests two API keys that match its needs: an LLM key (primary) and an embeddings/MM encoder key. There are no unrelated credentials requested. Caveat: these API keys (and configured base_url values) allow the skill to send all conversation and image data to the remote LLM/embed providers you supply—ensure those providers are trusted and that keys are not reused for other sensitive services.
Persistence & Privilege
The skill persists memories and related indexes to disk (save/load logic is referenced, and the CLI utilities create a data directory). The manifest sets always: false (no forced global inclusion) and disable-model-invocation: false (normal: the agent or model can invoke the skill). This persistence is expected for a memory skill, but be aware that stored data lives on the agent host and is reloaded across runs when persistence is enabled.
Assessment
This skill appears internally consistent for a memory/retrieval system, but take these precautions before installing:
  • Verify and set the LLM/embedding base URLs to services you trust. The default config points at third-party domains (e.g., api.deepseek.com, api.siliconflow.cn); those endpoints will receive any conversation text and images.
  • Treat LLM_API_KEY and MM_ENCODER_API_KEY as sensitive secrets. Do not reuse them for unrelated accounts, and prefer scoped/test keys.
  • Understand that the skill persists memories to disk (under the skill directory). If you do not want local persistence, disable caching/persistence in the config (enable_cache) or inspect/modify the save/load methods before use.
  • Because the package author and homepage are unknown, review network egress, the code paths that call the LLM/embedding APIs, and any save/load code before deploying in production.
  • If you need higher assurance, request provenance (author, repo, release tag) or run the skill in an isolated sandbox to observe network traffic and file writes.
Confidence is medium because the implementation is coherent but the source is unpublished/unknown and the default endpoints are third-party; verifying endpoints and provenance would increase confidence.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

  • Bins: python3, pip
  • Env: LLM_API_KEY, MM_ENCODER_API_KEY (primary: LLM_API_KEY)
  • Latest: v0.1.0 (vk97bwqh0eqxshpj9v7q16nwhp183ym0r)
  • 74 downloads · 0 stars · 1 version · updated 4w ago
  • License: MIT-0

SEEM Skill

Structured Episodic & Entity Memory for multi-turn conversations.

Quick Start

from seem_skill import SEEMSkill, SEEMConfig, RecallMode

config = SEEMConfig()
skill = SEEMSkill(config)

# Store conversation
memory_id = skill.store({
    "text": "Lena asked about Scottish Terriers",
    "speaker": "Alice"
})

# Recall (default: LITE mode — facts + episodic memory, no raw chunks)
result = skill.recall({"text": "What did Lena ask?"}, top_k=3)
# result = {"memories": [...], "facts": [...]}

# Recall with raw chunks
result = skill.recall({"text": "What did Lena ask?"}, mode=RecallMode.PRO)

# Recall with backfill
result = skill.recall({"text": "What did Lena ask?"}, mode=RecallMode.MAX)

Recall Modes

| Mode | Facts | Episodic Memory | Raw Chunks | Backfill |
|------|-------|-----------------|------------|----------|
| Lite (default) | ✅ | ✅ (summary + events) | — | — |
| Pro | ✅ | ✅ | ✅ (top_k) | — |
| Max | ✅ | ✅ | ✅ (top_k) | ✅ (≤ 2×top_k) |

  • Lite: Lightest context. Facts + structured memory only. Best for LLM agents that want concise context.
  • Pro: Includes raw observation text for the top_k retrieved chunks.
  • Max: Full context with backfill from associated memories (up to 2×top_k chunks).

Retrieval Strategies

| Strategy | Method | Best For |
|----------|--------|----------|
| DPR | Dense vector similarity | Simple keyword-matching queries |
| Hybrid RRF | Dense + BM25 sparse fusion | Mixed keyword + semantic queries |
| PPR | Personalized PageRank over knowledge graph | Multi-hop, entity-rich queries |

Default strategy is configured in config.py (currently ppr).
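
The Hybrid RRF strategy can be sketched as plain reciprocal-rank fusion of a dense ranking and a BM25 ranking. This is an illustrative sketch, not SEEM's actual code; the function name is invented, and the default k=30 mirrors rrf_rank_constant.

```python
def rrf_fuse(dense_ranking, sparse_ranking, k=30, top_k=3):
    """Fuse two best-first rankings of chunk ids with Reciprocal Rank Fusion.

    Each item contributes 1 / (k + rank) per ranking it appears in; the
    constant k dampens the influence of lower-ranked items.
    """
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first, truncated to top_k
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Items that appear in both rankings ("b" and "c" in `rrf_fuse(["a", "b", "c"], ["b", "c", "d"])`) accumulate two contributions and rise to the top, which is why RRF favors mixed keyword + semantic agreement.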

Configuration

Environment Variables (Recommended)

export LLM_API_KEY="sk-xxx"
export LLM_BASE_URL="https://api.deepseek.com"
export LLM_MODEL="deepseek-chat"

export MM_ENCODER_API_KEY="sk-xxx"
export MM_ENCODER_BASE_URL="https://api.siliconflow.cn/v1"
export MM_ENCODER_MODEL="Qwen/Qwen3-Embedding-8B"

Unified Configuration File

All default settings are centralized in seem_skill/config.py:

LLM_CONFIG = {
    "base_url": "https://api.deepseek.com",
    "model": "deepseek-chat",
}

EMBEDDING_CONFIG = {
    "base_url": "https://api.siliconflow.cn/v1",
    "model": "Qwen/Qwen3-Embedding-8B",
}

Custom Configuration

Override defaults programmatically:

config = SEEMConfig(
    llm_api_key="your-key",
    llm_model="custom-model",
    retrieve_strategy=RetrieveStrategy.PPR,
    top_k_facts=10,
    ppr_damping=0.6,
)

Key Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| retrieve_strategy | hybrid_rrf | DPR / Hybrid RRF / PPR |
| top_k_chunks | 3 | Number of chunks to retrieve |
| top_k_facts | 5 | Number of fact triples to retrieve |
| top_k_candidates | 3 | Integration candidate count |
| rrf_rank_constant | 30 | RRF smoothing constant |
| ppr_damping | 0.5 | PPR teleport probability |
| backfill_chunks | 5 | Max additional chunks per backfill |
| enable_fact_graph | True | Build fact graph on store |
| entity_similarity_threshold | 0.9 | Entity linking threshold |
| enable_integration | True | Dynamic memory integration |
| integration_window | 3 | Batch size for deferred integration |
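
The entity_similarity_threshold governs embedding-based entity linking. A minimal sketch of the idea, assuming unit-normalized embeddings; link_entity and its arguments are hypothetical, not SEEM's actual API:

```python
import numpy as np

def link_entity(new_vec, known_entities, threshold=0.9):
    """Link a new mention to an existing entity if its embedding is close
    enough (cosine similarity >= threshold), else return None so the
    caller can register a new entity node.

    known_entities maps entity name -> unit-normalized embedding vector.
    """
    new_vec = new_vec / np.linalg.norm(new_vec)
    best_name, best_sim = None, -1.0
    for name, vec in known_entities.items():
        sim = float(new_vec @ vec)  # cosine similarity (both unit-length)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= threshold else None
```

A higher threshold (the default 0.9) keeps distinct entities separate at the cost of occasionally duplicating near-synonyms.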

Operations

Store

python scripts/cli_memory.py store --text "Your message" --speaker user
python scripts/cli_memory.py store --dialogue-id "D1:1" --speaker "Alice" --text "Message"

Recall

python scripts/cli_memory.py recall --query "Your query" --mode lite
python scripts/cli_memory.py recall --query "Your query" --mode pro --strategy ppr --top-k 5
python scripts/cli_memory.py recall --query "Your query" --mode max --top-k-facts 10

Facts (Knowledge Graph)

python scripts/cli_memory.py facts               # Show all fact triples
python scripts/cli_memory.py facts --entity 小米   # Filter by entity

Display (Detailed)

python scripts/cli_memory.py display
python scripts/cli_memory.py display --dialogue-id "D1:1"

View (Compact 5W1H)

python scripts/cli_memory.py view

Stats

python scripts/cli_memory.py stats

Clear

python scripts/cli_memory.py clear --yes

Features

  • Episodic Memory Extraction: LLM extracts structured summary + events (5W1H) from each turn
  • Fact Graph Construction: Extracts subject-predicate-object triples, builds NetworkX knowledge graph
  • Fact Deduplication: Two-stage dedup — normalized exact match (O(1)) + embedding similarity (threshold 0.93)
  • PPR Retrieval: Personalized PageRank over entity-fact-chunk graph for graph-aware retrieval
  • Three Recall Modes: Lite/Pro/Max controlling context granularity
  • Dynamic Integration: Auto-merges related memories (MODERATE or STRONG coherence)
  • Hybrid Retrieval: Dense (vector) + Sparse (BM25) with RRF fusion
  • Entity Linking: Embedding-based entity normalization (threshold 0.9)
  • Multimodal Support: Images participate in embedding and retrieval
  • LRU Cache: Reduces repeated embedding computation
  • NetworkX Graph: Full graph algorithms available (PPR, connected components, etc.)
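
The LRU cache item above can be sketched with functools.lru_cache; _remote_embed is a stand-in for the real embedding API call, and the counter only exists to make the caching effect observable:

```python
from functools import lru_cache

# Invocation counter so the caching effect is visible (illustrative only).
CALLS = {"n": 0}

def _remote_embed(text):
    """Stand-in for a remote embedding API request."""
    CALLS["n"] += 1
    return tuple(float(ord(c)) for c in text[:4])  # fake embedding

@lru_cache(maxsize=1024)
def embed(text: str):
    """Repeated calls with the same text hit the cache, not the API."""
    return _remote_embed(text)
```

Since store and recall both embed recurring entity names and queries, even a small cache avoids many duplicate API round-trips.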

Architecture

Store Pipeline:

  1. Chunk storage (raw observation)
  2. Episodic extraction (LLM) → summary + events
  3. Fact extraction from events → subject-predicate-object triples
  4. Fact deduplication (exact match + embedding similarity)
  5. Entity node creation and fact graph construction (NetworkX)
  6. Multimodal embedding
  7. Candidate retrieval (dense similarity)
  8. Integration judgment (LLM, MODERATE or STRONG → integrate)
  9. Memory merge/insert
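
Step 4's two-stage deduplication can be sketched as follows. The function and its arguments are illustrative, not SEEM's actual code; the 0.93 similarity threshold comes from the feature list above.

```python
import numpy as np

def dedup_fact(triple, fact_index, fact_vecs, embed, threshold=0.93):
    """Two-stage fact deduplication sketch.

    Stage 1: O(1) lookup of the normalized (subject, predicate, object).
    Stage 2: embedding similarity against already-stored facts.
    Returns True if the fact is a duplicate, else registers it.
    """
    key = tuple(part.strip().lower() for part in triple)
    if key in fact_index:                 # stage 1: normalized exact match
        return True
    vec = embed(" ".join(key))
    vec = vec / np.linalg.norm(vec)
    for stored in fact_vecs:              # stage 2: embedding similarity
        if float(vec @ stored) >= threshold:
            return True
    fact_index.add(key)
    fact_vecs.append(vec)
    return False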

Recall Pipeline:

  1. Query encoding
  2. Strategy routing (DPR / Hybrid RRF / PPR)
  3. Chunk retrieval (strategy-specific, returns top_k chunks with scores)
  4. Fact retrieval (vector similarity, returns top_k facts)
  5. Result assembly (mode-dependent):
    • LITE: structured memory (summary + events) + facts
    • PRO: + raw chunks (top_k)
    • MAX: + backfill chunks (up to 2×top_k)
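
The mode-dependent assembly in step 5 might look like the sketch below; the field names and argument shapes are assumptions, not SEEM's actual signatures.

```python
def assemble_result(mode, memories, facts, chunks, backfill, top_k=3):
    """Assemble the recall payload per mode.

    LITE returns structured memory + facts only; PRO adds the top_k raw
    chunks; MAX additionally backfills up to 2*top_k chunks in total.
    """
    result = {"memories": memories, "facts": facts}
    if mode in ("pro", "max"):
        result["chunks"] = chunks[:top_k]
    if mode == "max":
        # Cap the combined chunk list at 2 * top_k
        extra = backfill[: 2 * top_k - len(result["chunks"])]
        result["chunks"] = result["chunks"] + extra
    return result
```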

Graph Structure (NetworkX DiGraph):

  • Node types: entity, chunk
  • Edge types: entity_chunk (entity → chunk), fact (entity ↔ entity), synonymy (entity ↔ entity)
  • Fact deduplication: normalized exact match + embedding similarity (threshold 0.93)
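
A toy graph of this shape, queried with Personalized PageRank via networkx, might look like the sketch below. Node and edge attribute names are illustrative; ppr_damping is assumed to map to networkx's alpha, and scipy must be installed for nx.pagerank.

```python
import networkx as nx

# Minimal entity-fact-chunk graph with the node/edge types described above.
g = nx.DiGraph()
g.add_node("Lena", kind="entity")
g.add_node("Scottish Terrier", kind="entity")
g.add_node("chunk:1", kind="chunk")
g.add_edge("Lena", "chunk:1", kind="entity_chunk")
g.add_edge("Lena", "Scottish Terrier", kind="fact", predicate="asked_about")
g.add_edge("Scottish Terrier", "chunk:1", kind="entity_chunk")

# Teleport only to the query entity; nodes absent from the
# personalization dict get zero teleport probability.
scores = nx.pagerank(g, alpha=0.5, personalization={"Lena": 1.0})
ranked_chunks = sorted(
    (n for n, d in g.nodes(data=True) if d["kind"] == "chunk"),
    key=lambda n: scores[n], reverse=True,
)
```

Seeding the teleport vector on the query entities is what makes retrieval graph-aware: chunks reachable through fact edges from those entities score higher than unrelated chunks.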

File Structure

SEEM/
├── SKILL.md              # This file
├── README.md             # Quick reference
├── config.py             # Unified configuration (LLM + Embedding)
├── requirements.txt      # Python dependencies
├── __init__.py           # Package entry point
├── core/
│   ├── __init__.py
│   ├── seem_skill.py     # Core implementation (SEEMSkill class)
│   ├── schema.py         # Data structures (SEEMConfig, RecallMode, etc.)
│   ├── prompts.py        # LLM prompts
│   └── utils.py          # LLM client, embedding, BM25, cache
├── scripts/
│   └── cli_memory.py     # CLI: store, recall, facts, display, view, stats, clear
├── data/                 # Persistent storage (auto-created)
└── tests/

Dependencies

  • openai>=1.0.0 — LLM and embedding API client
  • numpy>=1.21.0 — Vector operations
  • networkx>=3.0 — Knowledge graph, PPR, connected components
  • scipy>=1.0 — Required by nx.pagerank()
  • rank-bm25>=0.2.2 — BM25 sparse retrieval
  • nltk>=3.8.0 — Tokenization

When to Use SEEM

  • Multi-turn conversations need structured context preservation
  • Complex event relationships exist across dialogue turns
  • Need entity-centric retrieval (fact graph + PPR)
  • Want control over context granularity (Lite/Pro/Max modes)
  • Dynamic memory integration is valuable

Troubleshooting

API Key Errors

Error: Missing API keys

Set environment variables or update config.py:

export LLM_API_KEY="sk-xxx"
export MM_ENCODER_API_KEY="sk-xxx"

PPR Requires scipy

ModuleNotFoundError: No module named 'scipy'
pip install scipy networkx
