{"skill":{"slug":"gradient-knowledge-base","displayName":"Gradient Knowledge Base","summary":"Community skill (unofficial) for DigitalOcean Gradient Knowledge Bases. Build RAG pipelines: store documents in DO Spaces, configure data sources, manage ind...","description":"---\nname: gradient-knowledge-base\ndescription: >\n  Community skill (unofficial) for DigitalOcean Gradient Knowledge Bases.\n  Build RAG pipelines: store documents in DO Spaces, configure data sources,\n  manage indexing, and run semantic or hybrid search queries.\nfiles: [\"scripts/*\"]\nhomepage: https://github.com/Rogue-Iteration/TheBigClaw\nmetadata:\n  clawdbot:\n    emoji: \"📚\"\n    primaryEnv: DO_API_TOKEN\n    requires:\n      env:\n        - DO_API_TOKEN\n        - DO_SPACES_ACCESS_KEY\n        - DO_SPACES_SECRET_KEY\n        - GRADIENT_API_KEY\n      bins:\n        - python3\n      pip:\n        - requests>=2.31.0\n        - boto3>=1.34.0\n  author: Rogue Iteration\n  version: \"0.1.4\"\n  tags: [\"digitalocean\", \"gradient-ai\", \"knowledge-base\", \"rag\", \"semantic-search\", \"do-spaces\"]\n---\n\n# 🦞 Gradient AI — Knowledge Bases & RAG\n\n> ⚠️ **This is an unofficial community skill**, not maintained by DigitalOcean. Use at your own risk.\n\n> *\"A lobster never forgets. Neither should your agent.\" — the KB lobster*\n\nBuild a [Retrieval-Augmented Generation](https://docs.digitalocean.com/products/gradient-ai-platform/details/features/#retrieval-augmented-generation-rag) pipeline using DigitalOcean's Gradient Knowledge Bases. 
Store your documents in DO Spaces, index them into a managed Knowledge Base (backed by OpenSearch), and query them with semantic or hybrid search.\n\n## Architecture\n\n```\nYour Agent                   DigitalOcean\n┌─────────────┐     upload    ┌──────────────┐\n│  Documents  │ ──────────▶  │  DO Spaces   │\n└─────────────┘              │  (S3-compat) │\n                              └──────┬───────┘\n                                     │ auto-index\n                              ┌──────▼───────┐\n                              │ Knowledge    │\n                              │ Base (KBaaS) │\n                              │ ┌──────────┐ │\n                              │ │OpenSearch│ │\n                              │ └──────────┘ │\n                              └──────┬───────┘\n                                     │ retrieve\n┌─────────────┐     answer    ┌──────▼───────┐\n│  Your Agent │ ◀──────────  │  RAG Results │\n│  + LLM      │              │  + Citations │\n└─────────────┘              └──────────────┘\n```\n\n📖 *[Knowledge Base docs](https://docs.digitalocean.com/products/gradient-ai-platform/how-to/create-manage-knowledge-bases/)*\n\n## API Endpoints\n\nThis skill connects to four official DigitalOcean service endpoints:\n\n| Hostname | Purpose | Docs |\n|----------|---------|------|\n| `api.digitalocean.com` | KB management (create, list, delete, data sources) | [DO API Reference](https://docs.digitalocean.com/reference/api/) |\n| `kbaas.do-ai.run` | KB retrieval — semantic/hybrid search queries | [KB Retrieval docs](https://docs.digitalocean.com/products/gradient-ai-platform/how-to/create-manage-knowledge-bases/) |\n| `inference.do-ai.run` | LLM chat completions for RAG synthesis | [Inference docs](https://docs.digitalocean.com/products/gradient-ai-platform/how-to/use-serverless-inference/) |\n| `<region>.digitaloceanspaces.com` | S3-compatible object storage | [Spaces docs](https://docs.digitalocean.com/products/spaces/) |\n\nAll endpoints are 
owned and operated by DigitalOcean. The `*.do-ai.run` hostnames are the Gradient AI Platform's service domains.\n\n## Authentication\n\nThis skill uses **three different credentials**:\n\n| Credential | Used For | Env Var |\n|------------|----------|---------|\n| DO API Token | KB management, indexing, queries | `DO_API_TOKEN` |\n| Gradient API Key | LLM inference for RAG synthesis | `GRADIENT_API_KEY` |\n| Spaces Keys | S3-compatible uploads | `DO_SPACES_ACCESS_KEY` + `DO_SPACES_SECRET_KEY` |\n\n> **Credential scoping:** Use minimally scoped tokens. Create a dedicated [Model Access Key](https://docs.digitalocean.com/products/gradient-ai-platform/how-to/manage-access-keys/) for `GRADIENT_API_KEY`. For `DO_API_TOKEN`, use a [scoped API token](https://docs.digitalocean.com/reference/api/create-personal-access-token/) with only Knowledge Base and Spaces permissions. Avoid using your account-root token.\n\nOptional but recommended:\n```bash\nexport GRADIENT_KB_UUID=\"your-kb-uuid\"     # Default KB for queries\nexport DO_SPACES_BUCKET=\"your-bucket\"      # Default bucket for uploads\nexport DO_SPACES_ENDPOINT=\"https://nyc3.digitaloceanspaces.com\"\n```\n\n---\n\n## Tools\n\n### 📦 Store Documents in Spaces\n\nUpload files to DO Spaces for Knowledge Base indexing. 
This is the storage layer — documents land here before being indexed.\n\n```bash\n# Upload a file\npython3 gradient_spaces.py --upload /path/to/report.md --bucket my-kb-data\n\n# Upload with a key prefix (folder structure)\npython3 gradient_spaces.py --upload report.md --bucket my-kb-data --prefix \"research/2026-02-15/\"\n\n# List files in a bucket\npython3 gradient_spaces.py --list --bucket my-kb-data\n\n# List files with a prefix filter\npython3 gradient_spaces.py --list --bucket my-kb-data --prefix \"research/\"\n\n# Delete a file\npython3 gradient_spaces.py --delete \"research/old_report.md\" --bucket my-kb-data\n```\n\n📖 *[DO Spaces docs](https://docs.digitalocean.com/products/spaces/)*\n\n---\n\n### 🏗️ Create and Manage Knowledge Bases\n\nFull CRUD for Knowledge Bases. Create them programmatically instead of clicking through the console like a land-dweller.\n\n```bash\n# List all Knowledge Bases\npython3 gradient_kb_manage.py --list\n\n# Create a new KB\npython3 gradient_kb_manage.py --create --name \"My Research KB\" --region nyc3\n\n# Show details for a specific KB\npython3 gradient_kb_manage.py --show --kb-uuid \"your-kb-uuid\"\n\n# Delete a KB (⚠️ permanent!)\npython3 gradient_kb_manage.py --delete --kb-uuid \"your-kb-uuid\"\n```\n\n📖 *[Create KBs via API](https://docs.digitalocean.com/products/gradient-ai-platform/how-to/create-manage-knowledge-bases/)*\n\n---\n\n### 📁 Manage Data Sources\n\nConnect your Spaces bucket (or web URLs) to a Knowledge Base. 
This is what tells the KB \"index these documents.\"\n\n```bash\n# Add a DO Spaces data source\npython3 gradient_kb_manage.py --add-source \\\n  --kb-uuid \"your-kb-uuid\" \\\n  --bucket my-kb-data \\\n  --prefix \"research/\"\n\n# List data sources for a KB\npython3 gradient_kb_manage.py --list-sources --kb-uuid \"your-kb-uuid\"\n\n# Trigger re-indexing (auto-detects the data source)\npython3 gradient_kb_manage.py --reindex --kb-uuid \"your-kb-uuid\"\n\n# Trigger re-indexing for a specific source\npython3 gradient_kb_manage.py --reindex --kb-uuid \"your-kb-uuid\" --source-uuid \"ds-uuid\"\n```\n\n> **🦞 Pro tip: Auto-indexing.** If your KB has auto-indexing enabled, you can skip manual re-index triggers. The KB will detect changes in your Spaces bucket automatically. Configure it in the [DigitalOcean Console](https://cloud.digitalocean.com) → Knowledge Base → Settings.\n\n---\n\n### 🔍 Query the Knowledge Base\n\nSearch your indexed documents with semantic or hybrid queries. This is where the magic happens — your documents become answers.\n\n```bash\n# Basic query\npython3 gradient_kb_query.py --query \"What happened with the Q4 earnings?\"\n\n# Control number of results\npython3 gradient_kb_query.py --query \"Revenue trends\" --num-results 20\n\n# Tune hybrid search balance (see below)\npython3 gradient_kb_query.py --query \"$CAKE price movement\" --alpha 0.5\n\n# JSON output (for piping to other tools)\npython3 gradient_kb_query.py --query \"SEC filings summary\" --json\n```\n\n**Direct API call:**\n```bash\ncurl -s https://kbaas.do-ai.run/v1/{kb-uuid}/retrieve \\\n  -H \"Authorization: Bearer $DO_API_TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"query\": \"What happened with Q4 earnings?\",\n    \"num_results\": 10,\n    \"alpha\": 0.5\n  }'\n```\n\n📖 *[KB retrieval API](https://docs.digitalocean.com/products/gradient-ai-platform/how-to/create-manage-knowledge-bases/#query-a-knowledge-base)*\n\n---\n\n### 🎛️ The `alpha` Parameter — Hybrid 
Search Tuning\n\nThis is the secret sauce. The `alpha` parameter controls the balance between **lexical** (keyword) and **semantic** (meaning) search:\n\n| Alpha | Behavior | Best For |\n|-------|----------|----------|\n| `0.0` | Pure lexical (keyword matching) | Exact terms: ticker symbols, filing numbers, dates |\n| `0.5` | Balanced hybrid | General research queries |\n| `1.0` | Pure semantic (meaning-based) | Open-ended: \"what happened with...\", \"summarize...\" |\n\n> **🦞 Rule of claw:** Start at `0.5`. Go lower when searching for specific things (`$CAKE`, `10-K`, `2026-02-15`). Go higher when exploring ideas (\"What's the market sentiment?\").\n\n---\n\n### 🧠 RAG-Enhanced Queries\n\nThe full pipeline: query the KB → build a context prompt → call an LLM to synthesize. One command, complete answers with citations.\n\n```bash\npython3 gradient_kb_query.py \\\n  --query \"Summarize all research on $CAKE\" \\\n  --rag \\\n  --model \"openai-gpt-oss-120b\"\n```\n\nThis automatically:\n1. 🔍 Queries the Knowledge Base for relevant documents\n2. 📝 Builds a prompt with the retrieved context\n3. 🤖 Calls the LLM to synthesize an answer\n\n> **Note:** RAG queries call the [Gradient Inference API](https://docs.digitalocean.com/products/gradient-ai-platform/how-to/use-serverless-inference/) under the hood, so you'll need `GRADIENT_API_KEY` set. 
If you have the `gradient-inference` skill loaded too, you're all set.\n\n---\n\n## Advanced Configuration\n\n### Embedding Models & Chunking\n\nWhen creating a Knowledge Base, you can choose how documents are split into searchable chunks:\n\n| Strategy | How It Works | Best For |\n|----------|-------------|----------|\n| **Section-based** | Splits on document structure (headings, paragraphs) | Structured reports |\n| **Semantic** | Splits on meaning boundaries | Narrative content |\n| **Hierarchical** | Preserves document hierarchy in chunks | Technical docs |\n| **Fixed-length** | Equal-sized chunks | Uniform data |\n\nConfigure these in the [DigitalOcean Console](https://cloud.digitalocean.com) when creating the KB, or via the API's `embedding_model` and chunking parameters.\n\n📖 *[KB configuration options](https://docs.digitalocean.com/products/gradient-ai-platform/details/features/#retrieval-augmented-generation-rag)*\n\n---\n\n## CLI Reference\n\nAll scripts accept `--json` for machine-readable output.\n\n```\ngradient_spaces.py      --upload FILE | --list | --delete KEY\n                        [--bucket NAME] [--prefix PATH] [--key KEY] [--json]\n\ngradient_kb_manage.py   --list | --create | --show | --delete\n                        | --list-sources | --add-source | --reindex\n                        [--kb-uuid UUID] [--source-uuid UUID]\n                        [--name NAME] [--region REGION] [--bucket NAME]\n                        [--prefix PATH] [--json]\n\ngradient_kb_query.py    --query TEXT [--kb-uuid UUID] [--num-results N]\n                        [--alpha F] [--rag] [--model ID] [--json]\n```\n\n## Environment Variables\n\n| Variable | Required | Description |\n|----------|----------|-------------|\n| `DO_API_TOKEN` | ✅ | DO API token (scopes: GenAI + Spaces) |\n| `DO_SPACES_ACCESS_KEY` | ✅ | Spaces access key |\n| `DO_SPACES_SECRET_KEY` | ✅ | Spaces secret key |\n| `DO_SPACES_ENDPOINT` | Optional | Spaces endpoint (default: 
`https://nyc3.digitaloceanspaces.com`) |\n| `DO_SPACES_BUCKET` | Optional | Default bucket name |\n| `GRADIENT_KB_UUID` | Optional | Default KB UUID (saves typing `--kb-uuid` every time) |\n| `GRADIENT_API_KEY` | For RAG | Needed when using `--rag` for LLM synthesis |\n\n## External Endpoints\n\n| Endpoint | Purpose |\n|----------|---------|\n| `https://kbaas.do-ai.run/v1/{uuid}/retrieve` | KB retrieval API |\n| `https://api.digitalocean.com/v2/gen-ai/knowledge_bases/` | KB management API |\n| `https://inference.do-ai.run` | LLM chat completions (only with `--rag`) |\n| `https://{region}.digitaloceanspaces.com` | DO Spaces (S3-compatible) |\n\n## Security & Privacy\n\n- Your `DO_API_TOKEN` is sent as a Bearer token to `api.digitalocean.com` and `kbaas.do-ai.run`\n- Your `GRADIENT_API_KEY` is sent to `inference.do-ai.run` only when you use `--rag`\n- Spaces credentials are used for S3-compatible uploads to `{region}.digitaloceanspaces.com`\n- Documents you upload are **private** in your Spaces bucket by default\n- KB queries are scoped to your account — no cross-tenant access\n- No credentials or data are sent to any third-party endpoints\n\n## Trust Statement\n\n> By using this skill, documents and queries are sent to DigitalOcean's Knowledge Base\n> and Spaces APIs. Only install if you trust DigitalOcean with the documents you index.\n\n## Important Notes\n\n- Documents uploaded to Spaces are **private by default**\n- Re-indexing is **best-effort** — if the API call fails, auto-indexing kicks in on its own schedule\n- The retrieval API returns document **chunks**, not full documents\n- Deleting a KB is **permanent** — the indexed data is gone. 
The source files in Spaces are not affected.\n","tags":{"digitalocean":"0.1.4","do-spaces":"0.1.4","gradient-ai":"0.1.4","knowledge-base":"0.1.4","latest":"0.1.4","rag":"0.1.4","semantic-search":"0.1.4"},"stats":{"comments":0,"downloads":327,"installsAllTime":12,"installsCurrent":0,"stars":0,"versions":5},"createdAt":1771236287672,"updatedAt":1778491556797},"latestVersion":{"version":"0.1.4","createdAt":1771265252491,"changelog":"Document API endpoints (kbaas.do-ai.run, inference.do-ai.run) and add credential scoping guidance","license":null},"metadata":{"setup":[{"key":"DO_API_TOKEN","required":true},{"key":"DO_SPACES_ACCESS_KEY","required":true},{"key":"DO_SPACES_SECRET_KEY","required":true},{"key":"GRADIENT_API_KEY","required":true}],"os":null,"systems":null},"owner":{"handle":"simondelorean","userId":"s1729jqtfk8t0ept62xsvphczd884php","displayName":"Simon DeLorean","image":"https://avatars.githubusercontent.com/u/232737?v=4"},"moderation":null}