seekdb
v0.2.0Operate seekdb via CLI commands and look up seekdb documentation. Use when: executing SQL, exploring table schemas, managing vector collections, registering...
Like a lobster shell, security has layers — review code before you run it.
License
SKILL.md
seekdb — AI-Agent Database CLI with Documentation
This skill lets AI Agents operate seekdb via seekdb-cli commands and look up seekdb documentation when needed.
Quick orientation: Start with
seekdb ai-guideto get a full JSON self-description of the CLI. Then run commands directly — no setup needed.
Part 1: seekdb-cli Operations
seekdb-cli is purpose-built for AI Agents. All output is structured JSON, all operations are stateless, and built-in safety guardrails prevent accidental data loss.
seekdb Deployment
seekdb supports two modes. Use this section to guide users to the right deployment path based on their OS and scenario.
Embedded Mode
seekdb runs as a library inside the application — no server process required. Install via pyseekdb:
pip install -U pyseekdb
Supported platforms: Linux (glibc ≥ 2.28), macOS 15+ · Architectures: x86_64, aarch64
Windows and older macOS versions do not support embedded mode. Use server mode instead.
Server Mode
A persistent seekdb process, connectable via MySQL client or seekdb-cli (remote DSN). Choose by OS:
| OS | Recommended method | Quick start |
|---|---|---|
| Linux (CentOS/RHEL/Anolis/openEuler) | Package manager (RPM) | curl -fsSL https://obportal.s3.ap-southeast-1.amazonaws.com/download-center/opensource/seekdb/seekdb_install.sh | sudo bash |
| Linux (Debian/Ubuntu) | Package manager (DEB) | sudo apt update && sudo apt install seekdb |
| macOS 15+ | Homebrew | brew tap oceanbase/seekdb && brew install seekdb |
| Any OS with Docker | Docker | docker run -d -p 2881:2881 oceanbase/seekdb |
| Windows / macOS (GUI) | OceanBase Desktop | Download from oceanbase.ai/download |
Connect after server mode deployment (default port 2881, user root, empty password):
mysql -h127.0.0.1 -uroot -P2881 -A -Dtest
# or via seekdb-cli:
seekdb --dsn "seekdb://root:@127.0.0.1:2881/test" status
Minimum requirements: 1 CPU core, 2 GB RAM, SSD storage.
For full deployment details, see the deployment docs.
Prerequisites
Check if seekdb-cli is installed:
seekdb --version
If not installed, choose the method that matches your environment:
Recommended — pipx (works globally without polluting system Python):
# Install pipx first if needed (Ubuntu/Debian)
sudo apt install pipx && pipx ensurepath
# Then install seekdb-cli
pipx install seekdb-cli
Alternative — pip (when inside a project venv or on systems without PEP 668):
pip install seekdb-cli
Note for Ubuntu 23.04+ / Debian 12+: Direct
pip installat the system level is blocked by PEP 668. Usepipxinstead — it creates an isolated environment while keeping theseekdbcommand globally available on your PATH.
Connection
seekdb-cli auto-discovers the connection (env var, .env, ~/.seekdb/config.env, or default ~/.seekdb/seekdb.db). No setup needed — just run commands directly.
If the user provides a specific DSN, pass it via --dsn (must appear before the subcommand):
# Remote mode
seekdb --dsn "seekdb://user:pass@host:port/db" schema tables
# Embedded mode (local database file)
seekdb --dsn "embedded:./seekdb.db" status
seekdb --dsn "embedded:~/.seekdb/seekdb.db?database=mydb" sql "SELECT 1"
DSN formats:
- Remote:
seekdb://user:pass@host:port/db - Embedded:
embedded:<path>[?database=<db>](default database:test)
Get a Full CLI Guide (run first)
seekdb ai-guide
Returns a structured JSON document with every command, parameter, workflow, safety rules, and output format. Run this once at the start of any seekdb task to orient yourself.
Recommended Workflows
SQL Database Exploration
1. seekdb schema tables → list all tables (name, column count, row count)
2. seekdb schema describe <table> → get column names, types, indexes, comments
3. seekdb table profile <table> → data statistics (null ratios, distinct, min/max, top values)
4. seekdb relations infer → infer JOIN relationships between tables
5. seekdb sql "SELECT ... LIMIT N" → execute SQL with explicit LIMIT
Vector Collection Workflow
1. seekdb collection list → list all collections
2. seekdb collection info <name> → collection details and document preview
3. seekdb query <collection> --text "..." → hybrid search (default: semantic + fulltext)
4. seekdb add <collection> --data '...' → add new documents
AI Model Setup Workflow
1. seekdb ai model list → check registered models
2. seekdb ai model create <name> --type <type> --model <model_name>
3. seekdb ai model endpoint create <ep> <model> --url <url> --access-key <key>
4. seekdb ai complete "<prompt>" --model <name> → test completion
Command Reference
seekdb sql
Execute SQL statements. Default is read-only mode.
# Read query
seekdb sql "SELECT id, name FROM users LIMIT 10"
# Read from file
seekdb sql --file query.sql
# Read from stdin
echo "SELECT 1" | seekdb sql --stdin
# Include table schema in output
seekdb sql "SELECT * FROM orders LIMIT 5" --with-schema
# Disable large-field truncation
seekdb sql "SELECT content FROM articles LIMIT 1" --no-truncate
# Write operation (requires --write flag)
seekdb sql --write "INSERT INTO users (name) VALUES ('Alice')"
seekdb sql --write "UPDATE users SET name = 'Bob' WHERE id = 1"
seekdb sql --write "DELETE FROM users WHERE id = 3"
Output format:
{"ok": true, "columns": ["id", "name"], "rows": [{"id": 1, "name": "Alice"}], "affected": 0, "time_ms": 12}
seekdb schema tables
seekdb schema tables
{"ok": true, "data": [{"name": "users", "columns": 5, "rows": 1200}, {"name": "orders", "columns": 8, "rows": 50000}]}
seekdb schema describe
seekdb schema describe orders
{"ok": true, "data": {"table": "orders", "comment": "Order table", "columns": [{"name": "id", "type": "int", "comment": "Order ID"}, {"name": "status", "type": "varchar(20)", "comment": "0=pending, 1=paid"}], "indexes": ["PRIMARY(id)", "idx_status(status)"]}}
seekdb schema dump
seekdb schema dump
Returns all CREATE TABLE DDL statements.
seekdb table profile
Generate statistical summary of a table without returning raw data. Helps understand data distribution before writing SQL.
seekdb table profile <table>
{"ok": true, "data": {
"table": "orders",
"row_count": 50000,
"columns": [
{"name": "id", "type": "int", "null_ratio": 0, "distinct": 50000, "min": 1, "max": 50000},
{"name": "user_id", "type": "int", "null_ratio": 0, "distinct": 1200, "min": 1, "max": 1500},
{"name": "amount", "type": "decimal(10,2)", "null_ratio": 0.02, "min": 0.5, "max": 9999.99},
{"name": "status", "type": "varchar(20)", "null_ratio": 0, "distinct": 4, "top_values": ["paid", "pending", "refunded", "cancelled"]},
{"name": "created_at", "type": "datetime", "null_ratio": 0, "min": "2024-01-01", "max": "2026-03-10"}
],
"candidate_join_keys": ["user_id"],
"candidate_time_columns": ["created_at"]
}}
seekdb relations infer
Infer JOIN relationships between tables by analyzing column name patterns (e.g., user_id → users.id) and type compatibility.
# Infer all table relationships
seekdb relations infer
# Infer for a specific table only
seekdb relations infer --table orders
{"ok": true, "data": [
{"from": "orders.user_id", "to": "users.id", "confidence": "high"},
{"from": "orders.product_id", "to": "products.id", "confidence": "high"},
{"from": "order_items.order_id", "to": "orders.id", "confidence": "high"}
]}
seekdb collection list
seekdb collection list
{"ok": true, "data": [{"name": "docs", "count": 1500}, {"name": "faq", "count": 200}]}
seekdb collection create
seekdb collection create my_docs --dimension 384 --distance cosine
seekdb collection create my_docs -d 768 --distance l2
Options: --dimension / -d (default: 384), --distance cosine | l2 | ip (default: cosine).
seekdb collection delete
seekdb collection delete my_docs
seekdb collection info
seekdb collection info my_docs
{"ok": true, "data": {"name": "my_docs", "count": 1500, "dimension": 384, "distance": "cosine", "preview": {"ids": ["doc1", "doc2"], "documents": ["Hello world", "Test doc"], "metadatas": [{"category": "test"}, {}]}}}
dimension and distance are included when available from the collection metadata.
seekdb query
Search a collection using hybrid (default), semantic (vector), or fulltext mode.
# Hybrid search (default: semantic + fulltext, RRF ranking)
seekdb query my_docs --text "how to deploy seekdb"
# Semantic (vector) only
seekdb query my_docs --text "how to deploy seekdb" --mode semantic
# Fulltext search
seekdb query my_docs --text "deployment guide" --mode fulltext
# With metadata filter
seekdb query my_docs --text "performance tuning" --where '{"category": "tech"}'
# Limit results (--limit or -n, default: 10)
seekdb query my_docs --text "seekdb" -n 5
{"ok": true, "data": {"results": [
{"id": "doc1", "score": 0.92, "document": "How to deploy seekdb...", "metadata": {"category": "tech"}},
{"id": "doc2", "score": 0.85, "document": "seekdb performance tuning...", "metadata": {"category": "tech"}}
], "count": 2}, "time_ms": 35}
seekdb get
Retrieve documents from a collection by IDs or metadata filter.
# Get by IDs
seekdb get my_docs --ids "doc1,doc2"
# Get by metadata filter (--limit or -n, default: 10)
seekdb get my_docs --where '{"category": "tech"}' -n 20
seekdb add
Add data to a collection. Exactly one source is required: --file, --stdin, or --data. The collection is auto-created if it does not exist.
# From file (JSON array, JSONL, or CSV)
seekdb add my_docs --file data.jsonl
seekdb add my_docs --file articles.csv --vectorize-column content
# Inline: single object or array
seekdb add my_docs --data '{"id":"1","document":"Hello world","metadata":{"source":"cli"}}'
seekdb add my_docs --data '[{"id":"a","document":"Doc A"},{"id":"b","document":"Doc B"}]'
# From stdin (JSON array or JSONL; use with pipes)
echo '{"id":"1","document":"from pipe"}' | seekdb add my_docs --stdin
some_script | seekdb add my_docs --stdin
Record format: Each record may have id (optional), document/text/content (text to vectorize), and any other fields become metadata. If embedding is present, it is used directly.
seekdb export
Export collection data to a file.
seekdb export my_docs --output backup.json
seekdb export my_docs --output backup.jsonl -n 5000
Options: --output (required), --limit / -n (default: 10000).
seekdb ai model list
List AI models registered in the database (from DBA_OB_AI_MODELS / DBMS_AI_SERVICE). Works in both remote and embedded mode.
seekdb ai model list
{"ok": true, "data": [{"name": "my_llm", "type": "completion", "model_name": "THUDM/GLM-4-9B-0414", "model_id": 1}]}
seekdb ai model create
Register an AI model via DBMS_AI_SERVICE.CREATE_AI_MODEL. Create an endpoint separately to use it for completion.
seekdb ai model create my_llm --type completion --model "THUDM/GLM-4-9B-0414"
seekdb ai model create my_embed --type dense_embedding --model "BAAI/bge-m3"
seekdb ai model create my_rerank --type rerank --model "<rerank_model>"
Types: completion, dense_embedding, rerank.
seekdb ai model delete
Drop an AI model. Drop any endpoints that use it first.
seekdb ai model delete my_llm
seekdb ai model endpoint create / delete
Create or drop an endpoint that binds an AI model to a URL and API key (so the database can call the model).
seekdb ai model endpoint create my_ep my_llm \
--url "https://api.siliconflow.cn/v1/chat/completions" \
--access-key "<YOUR_API_KEY>" \
--provider siliconflow
seekdb ai model endpoint delete my_ep
Supported --provider values:
| Provider | Vendor |
|---|---|
siliconflow | SiliconFlow (OpenAI-compatible) |
openAI | OpenAI |
deepseek | DeepSeek (OpenAI-compatible) |
aliyun-openAI | Alibaba Cloud (OpenAI-compatible) |
aliyun-dashscope | Alibaba Cloud DashScope |
hunyuan-openAI | Tencent Hunyuan (OpenAI-compatible) |
Common --url values (use the specific interface URL, not the base URL):
| Vendor | completion | embedding | rerank |
|---|---|---|---|
| SiliconFlow | https://api.siliconflow.cn/v1/chat/completions | https://api.siliconflow.cn/v1/embeddings | https://api.siliconflow.cn/v1/rerank |
| DeepSeek | https://api.deepseek.com/chat/completions | — | — |
| Alibaba (OpenAI) | https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions | https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings | — |
| Tencent Hunyuan | https://api.hunyuan.cloud.tencent.com/v1/chat/completions | https://api.hunyuan.cloud.tencent.com/v1/embeddings | — |
seekdb ai complete
Run text completion using the database AI_COMPLETE function. Requires a registered completion model and an endpoint. Supported in both remote and embedded mode.
seekdb ai complete "Summarize this table structure" --model my_llm
{"ok": true, "data": {"model": "my_llm", "response": "The table has..."}, "time_ms": 1200}
seekdb ai-guide
Output a structured JSON guide for AI Agents containing all commands, parameters, workflow, and safety rules. Execute once to learn the full CLI.
seekdb ai-guide
seekdb status
seekdb status
Returns CLI version, server version, database name, and connectivity.
Safety Features
Row Protection
Queries without LIMIT are automatically probed. If result exceeds 100 rows, execution is blocked:
{"ok": false, "error": {"code": "LIMIT_REQUIRED", "message": "Query returns more than 100 rows. Please add LIMIT to your SQL."}}
Action: Add an explicit LIMIT clause and retry.
Write Protection
Write operations (INSERT/UPDATE/DELETE) are blocked by default:
{"ok": false, "error": {"code": "WRITE_NOT_ALLOWED", "message": "Write operations require --write flag."}}
Action: Add --write flag to enable write operations.
Even with --write, these are always blocked:
DELETE/UPDATEwithoutWHEREclauseDROP/TRUNCATEstatements
Error Auto-Correction
On SQL errors, the CLI automatically attaches schema hints:
Column not found → returns the table's column list and indexes:
{"ok": false, "error": {"code": "SQL_ERROR", "message": "Unknown column 'username'"}, "schema": {"table": "users", "columns": ["id", "name", "email"], "indexes": ["PRIMARY(id)"]}}
Table not found → returns available table names:
{"ok": false, "error": {"code": "SQL_ERROR", "message": "Table 'user' does not exist"}, "schema": {"tables": ["users", "orders", "products"]}}
Action: Use the schema info to correct the SQL and retry.
Large Field Truncation
TEXT/BLOB fields are truncated to 200 characters by default, with original length noted:
{"content": "First 200 characters of content...(truncated, 8520 chars)"}
Use --no-truncate to get full content when needed.
Sensitive Field Masking
Columns matching sensitive patterns are automatically masked:
| Pattern | Example Output |
|---|---|
| phone/mobile/tel | 138****5678 |
z***@gmail.com | |
| password/secret/token | ****** |
| id_card/ssn | 110***********1234 |
Output Formats
Default is JSON. Switch with --format / -f:
seekdb -f table sql "SELECT id, name FROM users LIMIT 5"
seekdb --format csv sql "SELECT id, name FROM users LIMIT 5"
seekdb -f jsonl sql "SELECT id, name FROM users LIMIT 5"
All formats now work with non-row data (e.g., schema tables, collection list). CSV and JSONL will auto-detect list-of-dict data in the data field.
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Business error (SQL error, connection error, etc.) |
| 2 | Usage error (missing arguments, invalid options) |
Operation Logging
All commands are logged to ~/.seekdb/sql-history.jsonl (seekdb-cli SQL execution history) for audit:
{"ts": "2026-03-12T14:23:01", "command": "sql", "sql": "SELECT id FROM users LIMIT 10", "ok": true, "rows": 10, "time_ms": 12}
Part 2: Documentation Lookup
Use documentation lookup when you need to understand a seekdb concept, look up SQL syntax, find configuration details, or answer user questions about seekdb.
Tip: For live database state (tables, schemas, data), run CLI commands directly. For concepts, syntax, and how-to guidance, search the docs.
When to Use Documentation Lookup
- Need to understand a seekdb concept (vector search, hybrid search, HNSW index, etc.)
- Need SQL or PL syntax reference not answerable by
seekdb schemacommands - Need deployment, configuration, or integration guidance
- CLI returned an error and you need to understand why from the docs
Path Resolution
- Read this SKILL.md to get its absolute path and extract the parent directory as
<skill_dir> - Catalog:
<skill_dir>references/seekdb-docs-catalog.jsonlIf missing locally, load from:https://raw.githubusercontent.com/oceanbase/seekdb-ecology-plugins/main/agent-skills/skills/seekdb/references/seekdb-docs-catalog.jsonl
Documentation Workflow
Step 1: Search Catalog
Search the catalog file for lines containing the query keywords. Each line is one JSON object with path, description, and branch.
# Example: search for "vector search" in the catalog
grep -i "vector search" <skill_dir>references/seekdb-docs-catalog.jsonl
Step 2: Match Query
- Extract
path,description, andbranchfrom search results - Select entries whose descriptions best match the query semantically
- Consider multiple matches for comprehensive answers
Step 3: Fetch Document
Fetch the document from the public docs URL:
- URL:
https://raw.githubusercontent.com/oceanbase/seekdb-doc/[branch]/en-US/[path] [branch]and[path]come from the catalog entry (e.g.,V1.0.0,V1.1.0)
Example
Query: "How to integrate with Claude Code?"
1. Search catalog for "claude code"
2. Found: {"path": "300.integrations/300.developer-tools/700.claude-code.md",
"description": "This guide explains how to use the seekdb plugin with Claude Code...",
"branch": "V1.0.0"}
3. Fetch: https://raw.githubusercontent.com/oceanbase/seekdb-doc/V1.0.0/en-US/300.integrations/300.developer-tools/700.claude-code.md
See references/doc-examples.md for more workflow examples.
If the information you need cannot be found in the catalog or the CLI output, visit the official seekdb documentation at oceanbase.ai/docs for the most complete and up-to-date reference.
Files
2 totalComments
Loading comments…
