Vector Store Shootout

8 vector store implementations behind a common interface — numpy, lancedb, qdrant, pgvector, weaviate, weaviate_hybrid, milvus, lightrag. Use when evaluating...

MIT-0 · Free to use, modify, and redistribute. No attribution required.

by Nissan Dookeran (@nissan)

Security Scan

VirusTotal: Suspicious

OpenClaw: Suspicious (medium confidence)

Purpose & Capability

The skill name/description (eight vector store implementations behind a common interface) aligns with the included code files: numpy, lancedb, qdrant, pgvector, weaviate, milvus, lightrag, etc. Implementations perform expected actions for indexing/search and (where appropriate) persistent storage and cleanup. However, the SKILL metadata only requires python3 while the code imports many third-party libraries (requests, lancedb, qdrant_client, qdrant_client.models, psycopg2, pymilvus, networkx, pyarrow, numpy, etc.), so the declared requirements are incomplete relative to the code.

Instruction Scope

SKILL.md metadata states outbound networking is false / 'All backends run locally', but multiple store implementations make HTTP calls (requests.post to Ollama at http://localhost:11434 and to https://api.openai.com), and client libraries connect to networked services (Qdrant, Weaviate, Milvus, Postgres). The code will write temp directories, create/drop DB tables and collections, and may delete those resources on cleanup — expected for DB backends, but the network claim is misleading. The runtime instructions do not request any credentials explicitly, yet the code supports using an OpenAI API key and will send user text to embedding endpoints if configured, which means user data could leave the host depending on deployment.

Install Mechanism

There is no install spec despite many non-standard runtime dependencies. The skill is distributed as code files but does not declare or install required Python packages, increasing friction and risk (users may install libraries ad-hoc or run code without needed packages). Absence of a pinned dependency list or install steps is disproportionate to the task complexity and makes it unclear what will be installed or required on the host.
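Because no install spec is provided, it can help to check which of the imported libraries are present before running any backend. The sketch below is illustrative only: the module list is taken from the import names mentioned in this scan (with `weaviate` as the import name of the `weaviate-client` package), and `missing_modules` is a hypothetical helper, not something shipped with the skill.

```python
from importlib.util import find_spec

# Top-level import names mentioned in the scan report. This list may be
# incomplete; the skill itself does not publish a dependency list.
SCANNED_MODULES = [
    "requests",
    "numpy",
    "networkx",
    "pyarrow",
    "lancedb",
    "qdrant_client",
    "psycopg2",
    "pymilvus",
    "weaviate",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(SCANNED_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All scanned modules are importable.")
```

This only reports whether a module can be located; it says nothing about versions, so a pinned requirements file is still needed for a reproducible environment.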

Credentials

The registry metadata declares no required environment variables or primary credential, but implementations accept and use an optional OpenAI API key parameter and will call remote embedding services (OpenAI) if provided. The mismatch between 'no credentials required' and code that will use credentials if supplied is confusing and could cause users to inadvertently supply a secret to a skill that didn't declare it. Additionally, network usage is environment-dependent but not made explicit in required configuration.

Persistence & Privilege

The skill does not request elevated platform privileges, does not set always:true, and does not modify other skills. It creates temporary directories and persistent stores (LanceDB, Milvus Lite files, or database tables) as part of backend operation and provides cleanup methods that drop those resources — this behavior is expected for database backends and is scoped to the skill's own resources.

What to consider before installing

This skill appears to implement the vector stores it claims, but there are multiple practical inconsistencies you should address before installing or running it:

- Dependencies: The code requires many Python packages (requests, numpy, networkx, lancedb, qdrant_client, psycopg2/pymilvus/pyarrow/weaviate-client, etc.) but the skill only declares python3. Don't run it on a production host without creating a pinned virtual environment (venv/conda) and installing and auditing those packages first.
- Network vs local: The SKILL metadata implies no outbound networking, but the code calls embedding endpoints (default: local Ollama at http://localhost:11434) and optionally OpenAI (api.openai.com). If you want purely local operation, run an Ollama instance on localhost and avoid supplying OpenAI keys. If you supply an OpenAI key or use remote DB backends, data (the texts you index/query) will be sent to those services.
- Secrets: The skill doesn't declare required env vars, yet it will accept and use an OpenAI API key if given. Only provide credentials if you trust the code and run it in an isolated environment; do not pass secrets you wouldn't want used for remote embedding/exfiltration.
- Resource effects: Backends create temporary files, DB tables, and collections and delete them on cleanup; verify these operations are acceptable in your environment (especially if you point to an existing Postgres instance or other shared service).
- Safety steps: Run the skill in a disposable VM or container first, pin dependency versions, inspect network traffic to ensure embeddings are sent to endpoints you expect, and consider providing a local embed_fn (test injection) to avoid network calls during evaluation.

If you need a complete dependency/install spec and clearer network/credential documentation, request it from the publisher before using in production.
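The local embed_fn mentioned under the safety steps could be as simple as a deterministic hashing function. This is a rough sketch under an assumption: that the stores accept a callable mapping one string to a list of floats. The actual embed_fn signature is not shown in this listing, and `fake_embed` is a hypothetical name.

```python
import hashlib
import math


def fake_embed(text: str, dim: int = 64) -> list[float]:
    """Deterministic, network-free stand-in for an embedding model.

    Hashes each lowercased whitespace token into one of `dim` buckets
    (the "hashing trick") and L2-normalises the bucket counts. The
    vectors carry no semantic meaning; they only let store code paths
    be exercised offline.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # An empty or whitespace-only string yields the all-zero vector.
    return [x / norm for x in vec] if norm else vec
```

Because the output depends only on the input text, repeated runs produce identical vectors, which makes add/search/delete behaviour reproducible without contacting Ollama or OpenAI. Retrieval-quality numbers measured with such vectors would not be meaningful.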

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0

latest: vk978g03q2w52mrr15f1sbm8vah82fm6b

License

MIT-0 (terms: https://spdx.org/licenses/MIT-0.html)

Runtime requirements

🔍 Clawdis

Bins: python3

SKILL.md

Vector Store Shootout

Eight vector store backends with a common VectorStore interface. Swap backends by changing one line — the rest of your code stays the same.

Backends

Backend	Type	Dependencies	Best For
numpy	In-memory	numpy only	Prototyping, small datasets
lancedb	File-based	lancedb	Local persistence, Arrow-native
qdrant	Client-server	qdrant-client	Production, filtering
pgvector	Postgres extension	psycopg2	Existing Postgres deployments
weaviate	Client-server	weaviate-client	Hybrid search (BM25 + vector)
weaviate_hybrid	Client-server	weaviate-client	BM25-heavy hybrid (alpha=0.1)
milvus	Client-server	pymilvus	Large-scale, GPU-accelerated
lightrag	Graph-enhanced	lightrag	Graph + vector RAG

Common Interface

from base import VectorStore

class MyStore(VectorStore):
    async def add(self, texts, embeddings, metadatas): ...
    async def search(self, query_embedding, k=5): ...
    async def delete(self, ids): ...
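
To make the three-method contract concrete, here is a minimal pure-Python sketch of a store that satisfies it. Several things are assumptions because scripts/base.py is not shown here: the `VectorStore` class below is a stand-in for the real base class, `add` is assumed to return generated ids, and `search` is assumed to return a list of dicts ranked by cosine similarity. `ListStore` is a hypothetical name and is not one of the eight shipped backends.

```python
import asyncio
import math
import uuid
from abc import ABC, abstractmethod


class VectorStore(ABC):
    """Stand-in for the class in scripts/base.py (real definition not shown)."""

    @abstractmethod
    async def add(self, texts, embeddings, metadatas): ...

    @abstractmethod
    async def search(self, query_embedding, k=5): ...

    @abstractmethod
    async def delete(self, ids): ...


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ListStore(VectorStore):
    """Brute-force in-memory store; fine for tiny datasets only."""

    def __init__(self):
        self._rows = {}  # id -> (text, embedding, metadata)

    async def add(self, texts, embeddings, metadatas):
        ids = []
        for text, emb, meta in zip(texts, embeddings, metadatas):
            row_id = uuid.uuid4().hex
            self._rows[row_id] = (text, list(emb), meta)
            ids.append(row_id)
        return ids  # assumption: callers get the generated ids back

    async def search(self, query_embedding, k=5):
        scored = [
            (cosine(query_embedding, emb), row_id, text, meta)
            for row_id, (text, emb, meta) in self._rows.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [
            {"id": row_id, "text": text, "metadata": meta, "score": score}
            for score, row_id, text, meta in scored[:k]
        ]

    async def delete(self, ids):
        for row_id in ids:
            self._rows.pop(row_id, None)  # unknown ids are ignored


async def demo():
    store = ListStore()
    ids = await store.add(
        ["cats purr", "dogs bark"],
        [[1.0, 0.0], [0.0, 1.0]],
        [{"src": "a"}, {"src": "b"}],
    )
    hits = await store.search([0.9, 0.1], k=1)
    await store.delete(ids[:1])
    remaining = await store.search([0.9, 0.1], k=5)
    return hits, remaining
```

Running `asyncio.run(demo())` exercises all three methods in order. Calling code that depends only on `add`, `search`, and `delete` would not need to change if `ListStore` were replaced by another `VectorStore` subclass with the same return shapes.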

Key Finding

Weaviate hybrid search at alpha=0.1 (BM25-heavy) scored an average of 0.9940, versus 0.9700 at the default of alpha=0.5. For technical content with specific terminology, keyword matching matters more than semantic similarity.
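
The effect of alpha can be illustrated with a simplified weighted fusion, in which alpha weights the vector score and (1 - alpha) weights the BM25 score after each list is min-max normalised. This is a sketch in the spirit of relative score fusion, not Weaviate's actual implementation, and the scores below are invented purely for illustration; they are not the benchmark data behind the finding above.

```python
def min_max(scores):
    """Rescale a {doc: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(bm25, vector, alpha):
    """Blend normalised scores: alpha=0 is pure BM25, alpha=1 is pure vector."""
    bm25_n, vec_n = min_max(bm25), min_max(vector)
    docs = set(bm25_n) | set(vec_n)
    return {
        doc: alpha * vec_n.get(doc, 0.0) + (1 - alpha) * bm25_n.get(doc, 0.0)
        for doc in docs
    }


def ranking(bm25, vector, alpha):
    """Return doc ids ordered from best to worst fused score."""
    fused = hybrid_scores(bm25, vector, alpha)
    return sorted(fused, key=fused.get, reverse=True)


# Invented example: one document contains the exact technical term,
# another is a close paraphrase without it.
BM25 = {"doc_exact_term": 12.0, "doc_paraphrase": 3.0, "doc_unrelated": 1.0}
VECTOR = {"doc_exact_term": 0.70, "doc_paraphrase": 0.90, "doc_unrelated": 0.20}

if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, ranking(BM25, VECTOR, alpha))
```

With these made-up numbers, the exact-term document ranks first at alpha=0.1 and the paraphrase overtakes it at alpha=0.9, which is the kind of shift that a BM25-heavy setting trades on when queries contain specific terminology.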

Files

scripts/base.py — Abstract base class
scripts/numpy_store.py through scripts/lightrag_store.py — All 8 implementations
