Install
openclaw skills install alibabacloud-tablestore-agent-storage
Alibaba Cloud Tablestore Agent Storage Skill. Use for building and managing Tablestore-based knowledge bases with the `tablestore-agent-storage` Python SDK. Triggers: "知识库", "tablestore", "ots", "表格存储", "agent storage", "knowledge base", "向量检索", "文档上传", "文档导入", "知识库同步", "tablestore-agent-storage", "AgentStorageClient"
You are responsible for helping users build and manage Tablestore knowledge bases using the tablestore-agent-storage Python SDK.
Complete the following tasks:
Ask questions in stages — at most 1–2 categories of information per round. Never request all configuration at once.
Prioritize completing:
Only after that, ask about:
All generated files go in: tablestore_agent_storage/
Create the directory automatically on first use.
Fixed file paths:
tablestore_agent_storage/ots_kb_config.json
tablestore_agent_storage/sync_knowledge_base.py
tablestore_agent_storage/.sync_cache.json
Do not place files in the project root directory.
Once configuration is collected, it must be written to tablestore_agent_storage/ots_kb_config.json.
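As a minimal sketch of the write step, the helper below merges newly collected values into the config file and creates the directory on first use. The function name `save_config`, the temporary-file write, and the `0o600` permission are illustrative assumptions, not part of the skill definition.

```python
import json
import os
from pathlib import Path

CONFIG_DIR = Path("tablestore_agent_storage")
CONFIG_PATH = CONFIG_DIR / "ots_kb_config.json"


def save_config(updates: dict, config_path: Path = CONFIG_PATH) -> dict:
    """Merge `updates` into the config file, creating the directory if needed."""
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config = {}
    if config_path.exists():
        with config_path.open("r", encoding="utf-8") as f:
            config = json.load(f)
    config.update(updates)
    # Write to a temporary file first so a crash cannot leave a half-written config.
    tmp_path = config_path.with_name(config_path.name + ".tmp")
    with tmp_path.open("w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, ensure_ascii=False)
    os.replace(tmp_path, config_path)
    # The file holds credentials, so restrict it to the current user where supported.
    os.chmod(config_path, 0o600)
    return config
```

Calling `save_config({"ots_endpoint": "...", "ots_instance_name": "..."})` after each round of questions keeps the file in step with what has been collected so far.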
Each interaction with the Tablestore server is bounded by the timeout of the Tablestore Agent Storage Client call (default 30s).
The agent may retry due to timeout, network jitter, etc. All write operations must be idempotent to safely support retries. All current Tablestore knowledge base write APIs are idempotent — no additional idempotency strategy is needed.
| Operation | Idempotent |
|---|---|
| create_knowledge_base | Yes |
| upload_documents / add_documents | Yes |
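Because the write APIs listed above are idempotent, a failed call can simply be repeated. The following is a sketch of a retry wrapper with exponential backoff. The helper name `call_with_retry` and its parameters are hypothetical, and the broad `except Exception` should be narrowed to whatever timeout or network errors the SDK actually raises.

```python
import time


def call_with_retry(operation, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run an idempotent operation, retrying with exponential backoff on failure."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:  # assumption: narrow this to the SDK's transient errors
            last_error = exc
            if attempt < max_attempts:
                # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** (attempt - 1))
    raise last_error
```

A call such as `call_with_retry(lambda: client.create_knowledge_base({"knowledgeBaseName": "my_kb"}))` would then tolerate transient failures without any extra deduplication logic.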
Any delete operation is not supported and must never be executed under any circumstances. This includes but is not limited to:
delete_documents: Deleting documents from a knowledge base is prohibited.
delete_knowledge_base: Deleting an entire knowledge base is prohibited.
delete_instance: Deleting a Tablestore instance is prohibited.
Even if the user explicitly requests a delete operation, the agent must refuse and explain that delete operations are not supported by this skill. Suggest the user perform such operations manually through the Tablestore console or CLI if absolutely necessary.
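One way to enforce this rule in generated code is to check an operation name before dispatching it. The guard below is a sketch only. `ensure_operation_allowed`, `DeleteNotSupportedError`, and the prefix list are hypothetical names chosen for illustration.

```python
FORBIDDEN_PREFIXES = ("delete", "remove", "destroy")


class DeleteNotSupportedError(Exception):
    """Raised when a delete-style operation is requested."""


def ensure_operation_allowed(operation_name: str) -> str:
    """Return the operation name unchanged, or raise if it is a delete-style call."""
    normalized = operation_name.strip().lower()
    if normalized.startswith(FORBIDDEN_PREFIXES):
        raise DeleteNotSupportedError(
            f"'{operation_name}' is not supported by this skill; "
            "use the Tablestore console or CLI manually if absolutely necessary."
        )
    return operation_name
```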
First confirm:
Installation command:
pip install tablestore-agent-storage==1.0.4
If the installation times out, try these troubleshooting steps:
Install with another source:
pip install tablestore-agent-storage==1.0.4 -i https://pypi.tuna.tsinghua.edu.cn/simple
If using pyenv and installation hangs or times out:
# Try running pyenv rehash manually first (~/.pyenv/shims/.pyenv-shim can be removed safely)
rm -f ~/.pyenv/shims/.pyenv-shim && pyenv rehash
# Then retry pip install
pip install tablestore-agent-storage==1.0.4
First, only ask:
In the next round, ask:
ots_endpoint
ots_instance_name
import json
from alibabacloud_credentials.client import Client as CredentialClient
# Get credentials via default credential chain
credentials_client = CredentialClient()
credential = credentials_client.get_credential()
access_key_id = credential.get_access_key_id()
access_key_secret = credential.get_access_key_secret()
sts_token = credential.get_security_token()
# now you can save the credentials into config
After collecting ots_endpoint and ots_instance_name, verify whether the instance exists. If it does not, automatically create it using the Tablestore CLI.
See references/tablestore-instance.md for detailed instance operations.
Workflow:
Parse the region ID from ots_endpoint:
http://ots-cn-hangzhou.aliyuncs.com → cn-hangzhou
tablestore_cli list_instance -r <region_id>
If the instance name appears in the returned list, skip creation.
tablestore_cli create_instance -n <instance_name> -r <region_id> -d "Auto-created by Agent"
tablestore_cli describe_instance -r <region_id> -n <instance_name>
Confirm "Status": 1 (active) before proceeding.
Notes:
export OTS_USER_AGENT=AlibabaCloud-Agent-Skills. Do not save the user agent to the config file.
ots_endpoint format must be http://ots-<region-id>.aliyuncs.com, not https://<instance-name>.<region-id>.ots.aliyuncs.com.
Only ask:
If the user wants to create a new one, optionally ask for a description.
tablestore_agent_storage/ots_kb_config.json.
{
  "access_key_id": "",
  "access_key_secret": "",
  "sts_token": "",
  "ots_endpoint": "", // Must match: ^http://ots-[a-zA-Z0-9\-]+.aliyuncs.com$
  "ots_instance_name": "", // Must match: ^[a-zA-Z0-9-]+$
  "oss_endpoint": "", // Must match: ^https?://[a-zA-Z0-9\-\.]+$
  "oss_bucket_name": "", // Must match: ^[a-zA-Z0-9-]+$
  "knowledge_bases": []
}
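The patterns in the config comments, together with the endpoint-to-region mapping used for the CLI calls, can be checked before anything is written. The sketch below assumes two hypothetical helpers, `region_from_ots_endpoint` and `validate_config`. It also escapes the dots in the OTS endpoint pattern, which is slightly stricter than the pattern as written in the comment.

```python
import re

# Same patterns as in the config comments, with literal dots escaped.
OTS_ENDPOINT_RE = re.compile(r"^http://ots-([a-zA-Z0-9\-]+)\.aliyuncs\.com$")
NAME_RE = re.compile(r"^[a-zA-Z0-9-]+$")
OSS_ENDPOINT_RE = re.compile(r"^https?://[a-zA-Z0-9\-\.]+$")


def region_from_ots_endpoint(endpoint: str) -> str:
    """Extract the region ID, e.g. 'cn-hangzhou', from an OTS endpoint."""
    match = OTS_ENDPOINT_RE.match(endpoint)
    if match is None:
        raise ValueError("ots_endpoint must look like http://ots-<region-id>.aliyuncs.com")
    return match.group(1)


def validate_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means the config passed."""
    errors = []
    if not OTS_ENDPOINT_RE.match(config.get("ots_endpoint", "")):
        errors.append("ots_endpoint has an invalid format")
    if not NAME_RE.match(config.get("ots_instance_name", "")):
        errors.append("ots_instance_name has an invalid format")
    # OSS settings are optional, so they are only checked when present.
    if config.get("oss_endpoint") and not OSS_ENDPOINT_RE.match(config["oss_endpoint"]):
        errors.append("oss_endpoint has an invalid format")
    if config.get("oss_bucket_name") and not NAME_RE.match(config["oss_bucket_name"]):
        errors.append("oss_bucket_name has an invalid format")
    return errors
```

The error messages deliberately name only the field, never its value, so the same helper can be used without risking credential or endpoint leakage in logs.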
Execute based on user needs:
create_knowledge_base
list_knowledge_base
describe_knowledge_base
After basic features are complete, proactively ask the user whether they need:
Only continue asking about OSS and sync configuration after the user confirms.
Only ask:
oss_endpoint
oss_bucket_name
Before using OSS-related features, the AliyunOTSAccessingOSSRole service-linked role must be created and authorized. This role allows Tablestore to access OSS on behalf of the user. This is a one-time setup. If the role has already been authorized, this authorization step can be skipped.
Guide the user to complete authorization via the following link. See references/ram-policies.md for details.
https://ram.console.aliyun.com/authorize?request=%7B%22payloads%22%3A%5B%7B%22missionId%22%3A%22Tablestore.RoleForOTSAccessingOSS%22%7D%5D%2C%22callback%22%3A%22https%3A%2F%2Fotsnext.console.aliyun.com%2F%22%2C%22referrer%22%3A%22Tablestore%22%7D
Notes:
access_key_id, access_key_secret, and sts_token can be reused
First ask:
local_path
oss_sync_path
Then ask:
sync_interval_minutes (default: 5)
inclusion_filters (default: ["*.pdf", "*.docx", "*.txt", "*.md", "*.html"])
If the user confirms local directory linking, create:
tablestore_agent_storage/sync_knowledge_base.py
The script must:
Call add_documents to import into the knowledge base
Use .sync_cache.json for incremental caching
If using OpenClaw, prefer OpenClaw Cron, for example:
openclaw cron add --name "kb-sync" --every 5m --message "Please run the knowledge base sync script: cd /your/project && python3 tablestore_agent_storage/sync_knowledge_base.py"
If OpenClaw is not available, fall back to system Crontab.
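The incremental part of the sync script can be sketched as a content-hash comparison against `.sync_cache.json`. The helper names (`file_digest`, `find_changed_files`, `save_cache`) and the choice of SHA-256 are assumptions for illustration. The upload to OSS and the `add_documents` call are intentionally left out.

```python
import fnmatch
import hashlib
import json
from pathlib import Path

DEFAULT_FILTERS = ["*.pdf", "*.docx", "*.txt", "*.md", "*.html"]


def file_digest(path: Path) -> str:
    """Hash a file in chunks so large documents do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_changed_files(local_path, cache_path, inclusion_filters=DEFAULT_FILTERS):
    """Return (changed relative paths, current digests) for files matching the filters."""
    root = Path(local_path)
    cache_file = Path(cache_path)
    cache = {}
    if cache_file.exists():
        cache = json.loads(cache_file.read_text(encoding="utf-8"))
    current, changed = {}, []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if not any(fnmatch.fnmatch(path.name, pattern) for pattern in inclusion_filters):
            continue
        rel = path.relative_to(root).as_posix()
        current[rel] = file_digest(path)
        if cache.get(rel) != current[rel]:
            changed.append(rel)  # new file or modified content
    return changed, current


def save_cache(cache_path, current):
    """Persist digests; call this only after the import has succeeded."""
    Path(cache_path).write_text(json.dumps(current, indent=2), encoding="utf-8")
```

Saving the cache only after a successful import means a failed run is retried in full on the next interval, which is safe because the write APIs are idempotent.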
OTS only (when local file upload is not needed):
import json
from tablestore_agent_storage import AgentStorageClient

# Load the saved configuration; the context manager closes the file promptly.
with open("tablestore_agent_storage/ots_kb_config.json", "r", encoding="utf-8") as f:
    config = json.load(f)

client = AgentStorageClient(
    access_key_id=config["access_key_id"],
    access_key_secret=config["access_key_secret"],
    sts_token=config.get("sts_token"),  # STS temporary credential, optional
    ots_endpoint=config["ots_endpoint"],
    ots_instance_name=config["ots_instance_name"]
)
OTS + OSS (OSS configuration is only needed when uploading local files):
client = AgentStorageClient(
    access_key_id=config["access_key_id"],
    access_key_secret=config["access_key_secret"],
    sts_token=config.get("sts_token"),
    oss_endpoint=config["oss_endpoint"],  # Must be in the same region as OTS
    oss_bucket_name=config["oss_bucket_name"],
    ots_endpoint=config["ots_endpoint"],
    ots_instance_name=config["ots_instance_name"]
)
subspace is a logical partition within a knowledge base, used to isolate documents from different sources or categories.
Set "subspace": true when creating a knowledge base to enable the subspace feature
subspace is a string specifying which subspace to operate on
subspace is a list of strings, allowing simultaneous search across multiple subspaces
If subspace is not specified, the _default subspace is used
Basic creation:
client.create_knowledge_base({
    "knowledgeBaseName": "my_kb",
    "description": "My knowledge base"
})
With subspace + custom metadata fields:
When creating a knowledge base, you can define metadata fields via the metadata parameter, supporting MetadataField, MetadataFieldType, EmbeddingConfiguration, and other models.
See references/metadata.md for detailed usage.
Quick example:
client.create_knowledge_base({
    "knowledgeBaseName": "my_kb",
    "subspace": True,
    "metadata": [
        {"name": "author", "type": "string"},
        {"name": "version", "type": "long"}
    ]
})
# List all knowledge bases (supports pagination)
client.list_knowledge_base({"maxResults": 20, "nextToken": ""})
# View details of a single knowledge base
client.describe_knowledge_base({"knowledgeBaseName": "my_kb"})
# Upload local files to the default subspace
client.upload_documents({
    "knowledgeBaseName": "my_kb",
    "documents": [
        {"filePath": "/path/to/file.pdf"},
        {"filePath": "/path/to/doc.docx", "metadata": {"author": "aliyun"}}
    ]
})

# Upload to a specific subspace
client.upload_documents({
    "knowledgeBaseName": "my_kb",
    "subspace": "finance",
    "documents": [
        {"filePath": "/path/to/report.pdf", "metadata": {"version": 2}}
    ]
})

# Import a single file
client.add_documents({
    "knowledgeBaseName": "my_kb",
    "documents": [
        {"ossKey": "oss://your-bucket/docs/file.pdf"}
    ]
})

# Import an OSS directory (supports file type filtering)
client.add_documents({
    "knowledgeBaseName": "my_kb",
    "subspace": "tech_docs",
    "documents": [
        {
            "ossKey": "oss://your-bucket/synced-folder/",
            "inclusionFilters": ["*.pdf", "*.docx", "*.md"],
            "exclusionFilters": ["*draft*"],
            "metadata": {"source": "oss_sync"}
        }
    ]
})

# Query by docId
client.get_document({
    "knowledgeBaseName": "my_kb",
    "docId": "your_doc_id"
})

# Query by ossKey
client.get_document({
    "knowledgeBaseName": "my_kb",
    "ossKey": "oss://your-bucket/docs/file.pdf",
    "subspace": "tech_docs"
})
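Since a document can remain pending while it is being processed, a caller may want to poll until it reaches completed or failed. The sketch below is deliberately SDK-agnostic. `wait_until_processed` is a hypothetical helper, and `fetch_status` is a caller-supplied function that returns the status string, because the exact shape of the `get_document` response is not shown in this document.

```python
import time


def wait_until_processed(fetch_status, timeout_seconds=300.0, poll_interval=5.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until the status is 'completed' or 'failed', or raise TimeoutError."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"document still '{status}' after {timeout_seconds}s")
        sleep(poll_interval)
```

The caller would wrap `client.get_document(...)` in `fetch_status` and extract the status field there, whatever its actual name turns out to be.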
Document statuses:
pending: Processing
completed: Completed
failed: Processing failed

# List all documents in a knowledge base (supports pagination)
client.list_documents({
    "knowledgeBaseName": "my_kb",
    "maxResults": 20,
    "nextToken": ""
})

# List documents in specific subspaces
client.list_documents({
    "knowledgeBaseName": "my_kb",
    "subspace": ["finance", "tech_docs"],
    "maxResults": 50
})
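The listing calls accept `maxResults` and `nextToken`, so walking every page is a matter of feeding the returned token back into the next request. The generator below is a sketch under stated assumptions: the response is treated as a dict that carries the next token under `nextToken`, and `items_key` is a hypothetical parameter naming the field that holds the page's items. Check both against the actual SDK response before relying on this.

```python
def iterate_pages(list_page, request, items_key, token_key="nextToken"):
    """Yield items from every page by following the pagination token."""
    params = dict(request)  # copy so the caller's request is not mutated
    while True:
        response = list_page(params)
        yield from response.get(items_key, [])
        token = response.get(token_key)
        if not token:
            break  # an empty or missing token is taken to mean the last page
        params[token_key] = token
```

Usage would look like `for doc in iterate_pages(client.list_documents, {"knowledgeBaseName": "my_kb", "maxResults": 20}, items_key=...)`, with `items_key` set to the real field name.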
Hybrid retrieval (recommended, DENSE_VECTOR + FULL_TEXT):
client.retrieve({
    "knowledgeBaseName": "my_kb",
    "retrievalQuery": {
        "text": "your question",
        "type": "TEXT"
    },
    "retrievalConfiguration": {
        "searchType": ["DENSE_VECTOR", "FULL_TEXT"],
        "denseVectorSearchConfiguration": {"numberOfResults": 10},
        "fullTextSearchConfiguration": {"numberOfResults": 10},
        "rerankingConfiguration": {
            "type": "RRF",
            "numberOfResults": 10,
            "rrfConfiguration": {
                "denseVectorSearchWeight": 1.0,
                "fullTextSearchWeight": 1.0,
                "k": 60
            }
        }
    }
})
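To give a feel for what the `rrfConfiguration` weights and `k` control, here is the textbook weighted Reciprocal Rank Fusion formula, where each result list contributes `weight / (k + rank)` to a document's score. This is an illustration of the general technique only. The service's server-side implementation is not described in this document and may differ, and `rrf_fuse` is a hypothetical local helper.

```python
def rrf_fuse(dense_ranked, full_text_ranked, dense_weight=1.0,
             full_text_weight=1.0, k=60, number_of_results=10):
    """Fuse two ranked lists of document IDs with weighted Reciprocal Rank Fusion."""
    scores = {}
    for weight, ranked in ((dense_weight, dense_ranked), (full_text_weight, full_text_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            # A document found by both searches accumulates a contribution from each.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest score first; ties are broken by document ID for a stable order.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:number_of_results]
```

A larger `k` flattens the difference between top and lower ranks, while raising one weight tilts the fused order toward that search type.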
Vector-only retrieval:
client.retrieve({
    "knowledgeBaseName": "my_kb",
    "retrievalQuery": {"text": "your question", "type": "TEXT"},
    "retrievalConfiguration": {
        "searchType": ["DENSE_VECTOR"],
        "denseVectorSearchConfiguration": {"numberOfResults": 10}
    }
})
Retrieval with metadata filtering:
You can pass a MetadataFilter object via the filter parameter during retrieval for metadata-based filtering. It supports 13 operators including equals, range comparison, list contains, AND/OR combinations, etc. See references/metadata.md for detailed usage.
Follow this order — do not skip steps, and do not ask too many questions at once.
Let me first check your basic environment. Please confirm:
1. Is Python 3.8 or higher available in your current environment?
2. May I install tablestore-agent-storage?
Credentials require the following three pieces of information:
1. access_key_id
2. access_key_secret
3. sts_token (optional)
Note: You may ask the user how to obtain credentials (e.g., where the credentials config file is located), but you must never display them directly, nor ask the user for plaintext AK/SK.
Two more OTS configuration items are needed:
1. ots_endpoint
2. ots_instance_name
Note: The ots_endpoint format must be http://ots-<region-id>.aliyuncs.com, not https://<instance-name>.<region-id>.ots.aliyuncs.com.
Please confirm:
- Do you want to create a new knowledge base, or use an existing one?
- What is the knowledge base name?
After basic configuration is complete, do you also need:
- Upload local files
- Link a local directory with automatic sync
If you need local file upload or automatic sync, please provide:
1. oss_endpoint
2. oss_bucket_name
Please provide directory sync information:
1. Local directory path: local_path
2. OSS sync path prefix: oss_sync_path
3. Sync interval (minutes, default: 5)
4. File type filter (default: *.pdf, *.docx, *.txt, *.md, *.html)
Never output credentials via echo, print, or logging. Handle all secrets exclusively through backend code.
Never execute any delete operation (delete_documents, delete_knowledge_base, delete_instance, or any other delete/removal/destroy action). All delete operations are strictly forbidden, even if the user explicitly requests them.