Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

SEEM

v0.1.0

Advanced episodic memory system for multi-turn conversations. Store and retrieve structured conversation memories with fact graph, PPR retrieval, and three r...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for ryantoleco/seem-skill.

Prompt preview: Install & Setup
Install the skill "SEEM" (ryantoleco/seem-skill) from ClawHub.
Skill page: https://clawhub.ai/ryantoleco/seem-skill
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: LLM_API_KEY, MM_ENCODER_API_KEY
Required binaries: python3, pip
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install seem-skill

ClawHub CLI


npx clawhub@latest install seem-skill
Security Scan

VirusTotal: Suspicious
OpenClaw: Benign (medium confidence)
Purpose & Capability
Name/description (episodic memory, retrieval, embeddings) align with what the code asks for: python, pip, an LLM API key and an embedding API key. The required env vars (LLM_API_KEY, MM_ENCODER_API_KEY) and the Python modules listed in requirements.txt (openai, numpy, networkx, rank-bm25, etc.) are appropriate for the described functionality. Note: default base_url values in the config point at third-party endpoints (e.g., api.deepseek.com, api.siliconflow.cn); these are configurable but should be verified.
Instruction Scope
The SKILL.md and scripts direct the agent to send conversation text (and optionally images) to external LLM and embedding services and to run local CLI scripts that read/write the skill's local data. The CLI falls back to reading a local config.py if env vars are not set. There are no instructions to read unrelated system files or to transmit data to unexpected endpoints beyond the configured LLM/embed base_urls, but you should assume all stored conversation data and images will be transmitted to whatever endpoints are configured.
Install Mechanism
This skill is delivered with source files and a requirements.txt but has no automated install spec. That means installing/running it will typically require pip installing the listed packages. Dependencies are common for this domain and there are no obvious remote-download-or-extract steps in the manifest. Still, because source was published with no homepage and unknown owner, install from a controlled environment and inspect dependencies before pip installing.
Credentials
The skill requests two API keys that match its needs: an LLM key (primary) and an embeddings/MM encoder key. There are no unrelated credentials requested. Caveat: these API keys (and configured base_url values) allow the skill to send all conversation and image data to the remote LLM/embed providers you supply—ensure those providers are trusted and that keys are not reused for other sensitive services.
Persistence & Privilege
The skill persists memories and related indexes to disk (save/load logic is referenced, and the CLI utilities create a data directory). The manifest sets always: false (no forced global inclusion) and disable-model-invocation: false (normal: the agent or model can invoke the skill). This persistence is expected for a memory skill, but be aware that stored data lives on the agent host and is reloaded across runs when persistence is enabled.
Assessment
This skill appears internally consistent for a memory/retrieval system, but take these precautions before installing:
  • Verify and set the LLM/embedding base URLs to services you trust. The default config points at third-party domains (e.g., api.deepseek.com, api.siliconflow.cn); those endpoints will receive any conversation text and images.
  • Treat LLM_API_KEY and MM_ENCODER_API_KEY as sensitive secrets. Do not reuse them for unrelated accounts, and prefer scoped/test keys.
  • Understand that the skill persists memories to disk (under the skill directory). If you do not want local persistence, disable caching/persistence in the config (enable_cache) or inspect/modify the save/load methods before use.
  • Because the package author and homepage are unknown, review network egress, the code paths that call the LLM/embedding APIs, and any save/load code before deploying in production.
  • If you need higher assurance, request provenance (author, repo, release tag) or run the skill in an isolated sandbox to observe network traffic and file writes.
Confidence is medium because the implementation is coherent but the source is unpublished/unknown and the default endpoints are third-party; verifying endpoints and provenance would increase confidence.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

  • Bins: python3, pip
  • Env: LLM_API_KEY, MM_ENCODER_API_KEY (primary: LLM_API_KEY)
  • Latest: v0.1.0 (vk97bwqh0eqxshpj9v7q16nwhp183ym0r)
  • 74 downloads · 0 stars · 1 version · updated 4w ago
  • License: MIT-0

SEEM Skill

Structured Episodic & Entity Memory for multi-turn conversations.

Quick Start

from seem_skill import SEEMSkill, SEEMConfig, RecallMode

config = SEEMConfig()
skill = SEEMSkill(config)

# Store conversation
memory_id = skill.store({
    "text": "Lena asked about Scottish Terriers",
    "speaker": "Alice"
})

# Recall (default: LITE mode — facts + episodic memory, no raw chunks)
result = skill.recall({"text": "What did Lena ask?"}, top_k=3)
# result = {"memories": [...], "facts": [...]}

# Recall with raw chunks
result = skill.recall({"text": "What did Lena ask?"}, mode=RecallMode.PRO)

# Recall with backfill
result = skill.recall({"text": "What did Lena ask?"}, mode=RecallMode.MAX)

Recall Modes

| Mode | Facts | Episodic Memory | Raw Chunks | Backfill |
|------|-------|-----------------|------------|----------|
| Lite (default) | ✅ | ✅ (summary + events) | — | — |
| Pro | ✅ | ✅ | ✅ (top_k) | — |
| Max | ✅ | ✅ | ✅ (top_k) | ✅ (≤ 2×top_k) |

  • Lite: Lightest context. Facts + structured memory only. Best for LLM agents that want concise context.
  • Pro: Includes raw observation text for the top_k retrieved chunks.
  • Max: Full context with backfill from associated memories (up to 2×top_k chunks).

Retrieval Strategies

| Strategy | Method | Best For |
|----------|--------|----------|
| DPR | Dense vector similarity | Simple keyword-matching queries |
| Hybrid RRF | Dense + BM25 sparse fusion | Mixed keyword + semantic queries |
| PPR | Personalized PageRank over knowledge graph | Multi-hop, entity-rich queries |

Default strategy is configured in config.py (currently ppr).
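
The Hybrid RRF strategy can be sketched as plain reciprocal-rank fusion of a dense ranking and a BM25 ranking. This is an illustrative sketch, not SEEM's actual code; the function name is invented, and the default k=30 mirrors rrf_rank_constant.

```python
def rrf_fuse(dense_ranking, sparse_ranking, k=30, top_k=3):
    """Fuse two best-first rankings of chunk ids with Reciprocal Rank Fusion.

    Each item contributes 1 / (k + rank) per ranking it appears in; the
    constant k dampens the influence of lower-ranked items.
    """
    scores = {}
    for ranking in (dense_ranking, sparse_ranking):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first, truncated to top_k
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Items that appear in both rankings ("b" and "c" in `rrf_fuse(["a", "b", "c"], ["b", "c", "d"])`) accumulate two contributions and rise to the top, which is why RRF favors mixed keyword + semantic agreement.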

Configuration

Environment Variables (Recommended)

export LLM_API_KEY="sk-xxx"
export LLM_BASE_URL="https://api.deepseek.com"
export LLM_MODEL="deepseek-chat"

export MM_ENCODER_API_KEY="sk-xxx"
export MM_ENCODER_BASE_URL="https://api.siliconflow.cn/v1"
export MM_ENCODER_MODEL="Qwen/Qwen3-Embedding-8B"

Unified Configuration File

All default settings are centralized in seem_skill/config.py:

LLM_CONFIG = {
    "base_url": "https://api.deepseek.com",
    "model": "deepseek-chat",
}

EMBEDDING_CONFIG = {
    "base_url": "https://api.siliconflow.cn/v1",
    "model": "Qwen/Qwen3-Embedding-8B",
}

Custom Configuration

Override defaults programmatically:

config = SEEMConfig(
    llm_api_key="your-key",
    llm_model="custom-model",
    retrieve_strategy=RetrieveStrategy.PPR,
    top_k_facts=10,
    ppr_damping=0.6,
)

Key Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| retrieve_strategy | hybrid_rrf | DPR / Hybrid RRF / PPR |
| top_k_chunks | 3 | Number of chunks to retrieve |
| top_k_facts | 5 | Number of fact triples to retrieve |
| top_k_candidates | 3 | Integration candidate count |
| rrf_rank_constant | 30 | RRF smoothing constant |
| ppr_damping | 0.5 | PPR teleport probability |
| backfill_chunks | 5 | Max additional chunks per backfill |
| enable_fact_graph | True | Build fact graph on store |
| entity_similarity_threshold | 0.9 | Entity linking threshold |
| enable_integration | True | Dynamic memory integration |
| integration_window | 3 | Batch size for deferred integration |
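
The entity_similarity_threshold governs embedding-based entity linking. A minimal sketch of the idea, assuming unit-normalized embeddings; link_entity and its arguments are hypothetical, not SEEM's actual API:

```python
import numpy as np

def link_entity(new_vec, known_entities, threshold=0.9):
    """Link a new mention to an existing entity if its embedding is close
    enough (cosine similarity >= threshold), else return None so the
    caller can register a new entity node.

    known_entities maps entity name -> unit-normalized embedding vector.
    """
    new_vec = new_vec / np.linalg.norm(new_vec)
    best_name, best_sim = None, -1.0
    for name, vec in known_entities.items():
        sim = float(new_vec @ vec)  # cosine similarity (both unit-length)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= threshold else None
```

A higher threshold (the default 0.9) keeps distinct entities separate at the cost of occasionally duplicating near-synonyms.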

Operations

Store

python scripts/cli_memory.py store --text "Your message" --speaker user
python scripts/cli_memory.py store --dialogue-id "D1:1" --speaker "Alice" --text "Message"

Recall

python scripts/cli_memory.py recall --query "Your query" --mode lite
python scripts/cli_memory.py recall --query "Your query" --mode pro --strategy ppr --top-k 5
python scripts/cli_memory.py recall --query "Your query" --mode max --top-k-facts 10

Facts (Knowledge Graph)

python scripts/cli_memory.py facts               # Show all fact triples
python scripts/cli_memory.py facts --entity 小米   # Filter by entity

Display (Detailed)

python scripts/cli_memory.py display
python scripts/cli_memory.py display --dialogue-id "D1:1"

View (Compact 5W1H)

python scripts/cli_memory.py view

Stats

python scripts/cli_memory.py stats

Clear

python scripts/cli_memory.py clear --yes

Features

  • Episodic Memory Extraction: LLM extracts structured summary + events (5W1H) from each turn
  • Fact Graph Construction: Extracts subject-predicate-object triples, builds NetworkX knowledge graph
  • Fact Deduplication: Two-stage dedup — normalized exact match (O(1)) + embedding similarity (threshold 0.93)
  • PPR Retrieval: Personalized PageRank over entity-fact-chunk graph for graph-aware retrieval
  • Three Recall Modes: Lite/Pro/Max controlling context granularity
  • Dynamic Integration: Auto-merges related memories (MODERATE or STRONG coherence)
  • Hybrid Retrieval: Dense (vector) + Sparse (BM25) with RRF fusion
  • Entity Linking: Embedding-based entity normalization (threshold 0.9)
  • Multimodal Support: Images participate in embedding and retrieval
  • LRU Cache: Reduces repeated embedding computation
  • NetworkX Graph: Full graph algorithms available (PPR, connected components, etc.)
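
The LRU cache item above can be sketched with functools.lru_cache; _remote_embed is a stand-in for the real embedding API call, and the counter only exists to make the caching effect observable:

```python
from functools import lru_cache

# Invocation counter so the caching effect is visible (illustrative only).
CALLS = {"n": 0}

def _remote_embed(text):
    """Stand-in for a remote embedding API request."""
    CALLS["n"] += 1
    return tuple(float(ord(c)) for c in text[:4])  # fake embedding

@lru_cache(maxsize=1024)
def embed(text: str):
    """Repeated calls with the same text hit the cache, not the API."""
    return _remote_embed(text)
```

Since store and recall both embed recurring entity names and queries, even a small cache avoids many duplicate API round-trips.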

Architecture

Store Pipeline:

  1. Chunk storage (raw observation)
  2. Episodic extraction (LLM) → summary + events
  3. Fact extraction from events → subject-predicate-object triples
  4. Fact deduplication (exact match + embedding similarity)
  5. Entity node creation and fact graph construction (NetworkX)
  6. Multimodal embedding
  7. Candidate retrieval (dense similarity)
  8. Integration judgment (LLM, MODERATE or STRONG → integrate)
  9. Memory merge/insert
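
Step 4's two-stage deduplication can be sketched as follows. The function and its arguments are illustrative, not SEEM's actual code; the 0.93 similarity threshold comes from the feature list above.

```python
import numpy as np

def dedup_fact(triple, fact_index, fact_vecs, embed, threshold=0.93):
    """Two-stage fact deduplication sketch.

    Stage 1: O(1) lookup of the normalized (subject, predicate, object).
    Stage 2: embedding similarity against already-stored facts.
    Returns True if the fact is a duplicate, else registers it.
    """
    key = tuple(part.strip().lower() for part in triple)
    if key in fact_index:                 # stage 1: normalized exact match
        return True
    vec = embed(" ".join(key))
    vec = vec / np.linalg.norm(vec)
    for stored in fact_vecs:              # stage 2: embedding similarity
        if float(vec @ stored) >= threshold:
            return True
    fact_index.add(key)
    fact_vecs.append(vec)
    return False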

Recall Pipeline:

  1. Query encoding
  2. Strategy routing (DPR / Hybrid RRF / PPR)
  3. Chunk retrieval (strategy-specific, returns top_k chunks with scores)
  4. Fact retrieval (vector similarity, returns top_k facts)
  5. Result assembly (mode-dependent):
    • LITE: structured memory (summary + events) + facts
    • PRO: + raw chunks (top_k)
    • MAX: + backfill chunks (up to 2×top_k)
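
The mode-dependent assembly in step 5 might look like the sketch below; the field names and argument shapes are assumptions, not SEEM's actual signatures.

```python
def assemble_result(mode, memories, facts, chunks, backfill, top_k=3):
    """Assemble the recall payload per mode.

    LITE returns structured memory + facts only; PRO adds the top_k raw
    chunks; MAX additionally backfills up to 2*top_k chunks in total.
    """
    result = {"memories": memories, "facts": facts}
    if mode in ("pro", "max"):
        result["chunks"] = chunks[:top_k]
    if mode == "max":
        # Cap the combined chunk list at 2 * top_k
        extra = backfill[: 2 * top_k - len(result["chunks"])]
        result["chunks"] = result["chunks"] + extra
    return result
```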

Graph Structure (NetworkX DiGraph):

  • Node types: entity, chunk
  • Edge types: entity_chunk (entity → chunk), fact (entity ↔ entity), synonymy (entity ↔ entity)
  • Fact deduplication: normalized exact match + embedding similarity (threshold 0.93)
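
A toy graph of this shape, queried with Personalized PageRank via networkx, might look like the sketch below. Node and edge attribute names are illustrative; ppr_damping is assumed to map to networkx's alpha, and scipy must be installed for nx.pagerank.

```python
import networkx as nx

# Minimal entity-fact-chunk graph with the node/edge types described above.
g = nx.DiGraph()
g.add_node("Lena", kind="entity")
g.add_node("Scottish Terrier", kind="entity")
g.add_node("chunk:1", kind="chunk")
g.add_edge("Lena", "chunk:1", kind="entity_chunk")
g.add_edge("Lena", "Scottish Terrier", kind="fact", predicate="asked_about")
g.add_edge("Scottish Terrier", "chunk:1", kind="entity_chunk")

# Teleport only to the query entity; nodes absent from the
# personalization dict get zero teleport probability.
scores = nx.pagerank(g, alpha=0.5, personalization={"Lena": 1.0})
ranked_chunks = sorted(
    (n for n, d in g.nodes(data=True) if d["kind"] == "chunk"),
    key=lambda n: scores[n], reverse=True,
)
```

Seeding the teleport vector on the query entities is what makes retrieval graph-aware: chunks reachable through fact edges from those entities score higher than unrelated chunks.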

File Structure

SEEM/
├── SKILL.md              # This file
├── README.md             # Quick reference
├── config.py             # Unified configuration (LLM + Embedding)
├── requirements.txt      # Python dependencies
├── __init__.py           # Package entry point
├── core/
│   ├── __init__.py
│   ├── seem_skill.py     # Core implementation (SEEMSkill class)
│   ├── schema.py         # Data structures (SEEMConfig, RecallMode, etc.)
│   ├── prompts.py        # LLM prompts
│   └── utils.py          # LLM client, embedding, BM25, cache
├── scripts/
│   └── cli_memory.py     # CLI: store, recall, facts, display, view, stats, clear
├── data/                 # Persistent storage (auto-created)
└── tests/

Dependencies

  • openai>=1.0.0 — LLM and embedding API client
  • numpy>=1.21.0 — Vector operations
  • networkx>=3.0 — Knowledge graph, PPR, connected components
  • scipy>=1.0 — Required by nx.pagerank()
  • rank-bm25>=0.2.2 — BM25 sparse retrieval
  • nltk>=3.8.0 — Tokenization

When to Use SEEM

  • Multi-turn conversations need structured context preservation
  • Complex event relationships exist across dialogue turns
  • Need entity-centric retrieval (fact graph + PPR)
  • Want control over context granularity (Lite/Pro/Max modes)
  • Dynamic memory integration is valuable

Troubleshooting

API Key Errors

Error: Missing API keys

Set environment variables or update config.py:

export LLM_API_KEY="sk-xxx"
export MM_ENCODER_API_KEY="sk-xxx"

PPR Requires scipy

ModuleNotFoundError: No module named 'scipy'
pip install scipy networkx
