Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Ai Integration

v1.0.0

Use when building AI-powered features with the Claude API or Anthropic SDK — structured outputs, tool calling, streaming, multi-provider routing, multi-agent...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for jimmy974/ai-integration.

Prompt preview: Install & Setup
Install the skill "Ai Integration" (jimmy974/ai-integration) from ClawHub.
Skill page: https://clawhub.ai/jimmy974/ai-integration
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install ai-integration

ClawHub CLI

npx clawhub@latest install ai-integration
Security Scan
VirusTotal
Benign
View report →
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name and description (Anthropic/Claude integration, structured outputs, tool calling, multi-agent) align with the SKILL.md content and code examples. However, the skill declares no required environment variables (e.g., an Anthropic API key) or primary credential, even though the examples assume an Anthropic client and multi-provider routing; that omission reduces clarity about what secrets the integration will need.
Instruction Scope
The runtime instructions include examples that define tools such as a read_file tool (with a path parameter) and web-search tools, and show agentic loops that can call tools autonomously. Those examples implicitly permit reading arbitrary local files and making external calls unless implementers add constraints; the SKILL.md does not explicitly instruct implementers to limit file paths, validate inputs, or prevent sensitive-data reads, which is scope creep relative to a simple integration guide.
Install Mechanism
This is an instruction-only skill with no install spec and no code files. That minimizes direct install risk because nothing is downloaded or written by the skill itself.
Credentials
The skill describes using Anthropic, LiteLLM, and other providers but declares no environment variables or primary credential. Real integrations will require API keys, so the absence of declared env vars makes it unclear what secrets the agent or developer must supply and how they will be used.
Persistence & Privilege
The always flag is false and the skill does not request persistent or system-wide modifications. Autonomous invocation (model invocation is not disabled) is the default; it is a concern only in combination with the instruction-scope issues above (tooling that can read local files).
What to consider before installing
This is an authored guide for building Anthropic/Claude integrations and is broadly coherent, but it leaves two practical security questions unanswered: (1) it doesn't declare required API keys or credentials (you will almost certainly need an Anthropic API key and possibly other provider credentials), and (2) its tool-calling examples include a read_file tool and open tool definitions that, if implemented without safeguards, allow agents to read arbitrary local files or call external endpoints. Before installing or enabling this skill:

  1. Confirm with the publisher what credentials are required and how they should be provided and stored.
  2. If you implement any tools the skill suggests, enforce strict input validation and path whitelisting (deny access to /etc, ~/.ssh, vaults, etc.).
  3. Avoid giving an autonomous agent unrestricted filesystem or network access; prefer manual review or tightly scoped tools.
  4. Review the full SKILL.md for other implicit behaviors (streaming, multi-provider routing) and enable only the parts you need.

If the publisher supplies an updated SKILL.md that explicitly lists required env vars and documents safe tool constraints, many of these concerns would be resolved.

Like a lobster shell, security has layers — review code before you run it.

latest: vk970pb3t34m74kbsx5gra37y7583sc08
94 downloads
0 stars
1 version
Updated 1 mo ago
v1.0.0
MIT-0

AI Integration Skill

Comprehensive patterns for integrating the Anthropic Claude API into production systems — from basic API calls to full multi-agent orchestration with state management, memory, and evaluation.

When to Use This Skill

Activate when:

  • Building a Claude API integration or wrapper
  • Implementing structured outputs, tool calling, or streaming
  • Setting up multi-provider LLM routing (LiteLLM, fallbacks)
  • Designing multi-agent orchestration or agentic loops
  • Implementing RAG or persistent agent memory
  • Evaluating LLM output quality or building evals
  • Deploying agents to Next.js, Python FastAPI, or Docker

Don't use this skill for:

  • Kubernetes/Terraform config unrelated to AI infra
  • General React/Next.js features not involving LLM calls

Core Principles

1. Single Agent vs Multi-Agent

Pattern               | When to Use                                   | Cost
Single agent          | Linear tasks, simple I/O, <5 steps            | Low
Subagent delegation   | Parallel tasks, specialized expertise needed  | Medium
Multi-agent swarm     | Complex autonomous workflows, >10 steps       | High (budget like a team)

Infrastructure math (2026): multi-agent compute costs jump roughly 3x when you move from a single agent to an orchestrated swarm. Budget before you build.
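
A back-of-envelope sketch of that budgeting advice (every number below is an assumption for illustration):

CALLS = {"single agent": 5, "orchestrated swarm": 15}   # assumed LLM calls per task
TOKENS_PER_CALL = 3_000                                  # assumed average input+output tokens per call
PRICE_PER_1K = 0.003                                     # assumed blended price in $ per 1k tokens

for pattern, calls in CALLS.items():
    print(f"{pattern}: ~${calls * TOKENS_PER_CALL / 1_000 * PRICE_PER_1K:.3f} per task")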

2. Agent Communication Patterns

Hub-and-spoke (most common): Orchestrator delegates to specialist agents.

orchestrator
  ├── researcher-agent   (web search, docs)
  ├── coder-agent        (code generation, tests)
  └── reviewer-agent     (quality, security check)

Pipeline: Output of one agent is input to next (linear, predictable).

Swarm: Agents with shared memory, no single orchestrator. Use for exploration tasks.
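
A minimal pipeline sketch, reusing the call_agent helper defined in the Multi-Agent Orchestration section below:

def pipeline(task: str) -> str:
    """Linear pipeline: each agent's output feeds the next agent."""
    notes = call_agent("researcher", task)
    code = call_agent("coder", f"Spec:\n{task}\n\nResearch notes:\n{notes}")
    return call_agent("synthesizer", f"Notes:\n{notes}\n\nCode:\n{code}")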

3. Context Window Management

import anthropic

client = anthropic.Anthropic()

def sliding_window(messages: list[dict], max_tokens: int = 150_000) -> list[dict]:
    """Drop oldest messages to stay within token budget."""
    # Rough estimate: 1 token ≈ 4 chars
    while len(messages) > 2:
        total = sum(len(m["content"]) // 4 for m in messages)
        if total <= max_tokens:
            break
        messages = messages[2:]  # drop the oldest user/assistant pair to preserve role alternation
    return messages

def summarize_history(messages: list[dict]) -> list[dict]:
    """Compress old turns into a summary to reclaim context budget."""
    if len(messages) <= 4:
        return messages
    history = "\n".join(f"{m['role']}: {m['content']}" for m in messages[:-2])
    summary = client.messages.create(
        model="claude-haiku-4-5", max_tokens=512,
        messages=[{"role": "user", "content": f"Summarize concisely:\n{history}"}],
    ).content[0].text
    return [{"role": "user", "content": f"[Prior context]\n{summary}"}] + messages[-2:]

Structured Outputs

Pydantic binding with instructor (recommended)

import anthropic
import instructor
from pydantic import BaseModel

class Entity(BaseModel):
    name: str
    type: str       # person | org | location | concept
    description: str

class ExtractionResult(BaseModel):
    entities: list[Entity]
    summary: str

# instructor patches the client — returns validated Pydantic models
client = instructor.from_anthropic(anthropic.Anthropic())

result: ExtractionResult = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"Extract all entities from:\n{text}"}],
    response_model=ExtractionResult,
)
print(result.entities[0].name)   # fully typed, validated

Schema enforcement without instructor (TypeScript)

import Anthropic from "@anthropic-ai/sdk";
import { z } from "zod";

const client = new Anthropic();

const EntitySchema = z.object({
  entities: z.array(z.object({ name: z.string(), type: z.string() })),
  summary: z.string(),
});

const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{
    role: "user",
    content: `Extract entities. Respond ONLY with valid JSON matching this schema:
{"entities": [{"name": string, "type": string}], "summary": string}

Text: ${inputText}`,
  }],
});

const parsed = EntitySchema.parse(JSON.parse(response.content[0].text));

Tool Calling (Function Calling)

Parallel tool calls + agentic loop (TypeScript)
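
The loop below assumes an executeTool helper that dispatches each tool call to your own implementation and returns a string result; it is not shown here.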

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [
  { name: "search_web", description: "Search the web",
    input_schema: { type: "object" as const, properties: { query: { type: "string" } }, required: ["query"] } },
  { name: "read_file", description: "Read a local file",
    input_schema: { type: "object" as const, properties: { path: { type: "string" } }, required: ["path"] } },
];

async function runAgentLoop(userMessage: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [{ role: "user", content: userMessage }];

  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-6", max_tokens: 4096,
      tools, tool_choice: { type: "auto" },  // or { type: "tool", name: "search_web" }
      messages,
    });

    if (response.stop_reason !== "tool_use") {
      // covers end_turn, max_tokens, etc.: return whatever text Claude produced
      return response.content.filter((b) => b.type === "text").map((b) => b.text).join("");
    }

    // Claude may call multiple tools in parallel — handle all at once
    const toolUses = response.content.filter((b) => b.type === "tool_use");
    const toolResults = await Promise.all(
      toolUses.map(async (tu) => ({
        type: "tool_result" as const,
        tool_use_id: (tu as Anthropic.ToolUseBlock).id,
        content: await executeTool((tu as Anthropic.ToolUseBlock).name, (tu as Anthropic.ToolUseBlock).input),
      }))
    );

    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: toolResults });
  }
}
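
Given the scan's warning about open-ended read_file tools, here is a hedged sketch of a path-whitelisted implementation in Python (the workspace directory is an assumption):

from pathlib import Path

ALLOWED_ROOT = Path("./workspace").resolve()   # assumption: confine reads to this directory

def read_file_tool(path: str) -> str:
    """Resolve the requested path and refuse anything outside ALLOWED_ROOT."""
    target = (ALLOWED_ROOT / path).resolve()
    if not target.is_relative_to(ALLOWED_ROOT):
        return "error: path outside allowed root"
    return target.read_text()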

Streaming Responses

Python streaming

import anthropic

client = anthropic.Anthropic()

# Stream text
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    final = stream.get_final_message()
    print(f"\n[{final.usage.input_tokens} in / {final.usage.output_tokens} out tokens]")

TypeScript streaming

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const stream = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 4096,
  stream: true,
  messages: [{ role: "user", content: prompt }],
});

for await (const chunk of stream) {
  if (chunk.type === "content_block_delta" && chunk.delta.type === "text_delta") {
    process.stdout.write(chunk.delta.text);
  }
}

Next.js streaming route (plain-text chunks)

// app/api/chat/route.ts
import Anthropic from "@anthropic-ai/sdk";
import { NextRequest } from "next/server";

const client = new Anthropic();

export async function POST(req: NextRequest) {
  const { messages } = await req.json();
  const stream = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 2048,
    stream: true,
    messages,
  });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        if (chunk.type === "content_block_delta" && chunk.delta.type === "text_delta") {
          controller.enqueue(encoder.encode(chunk.delta.text));
        }
      }
      controller.close();
    },
  });

  return new Response(readable, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}

Multi-Provider Routing (LiteLLM)

from litellm import completion, Router

# Provider-agnostic call — same interface for Claude, OpenAI, Gemini
def llm_call(messages: list[dict], model: str = "claude-sonnet-4-6") -> str:
    response = completion(
        model=model,       # "claude-sonnet-4-6" | "gpt-4o" | "gemini/gemini-1.5-pro"
        messages=messages,
        max_tokens=1024,
    )
    return response.choices[0].message.content

# Automatic fallback: try primary model, fall back on error
response = completion(
    model="claude-opus-4-6",
    messages=messages,
    fallbacks=["claude-sonnet-4-6", "gpt-4o"],
    max_tokens=1024,
)

# Cost-aware routing: route by quality tier
router = Router(model_list=[
    {"model_name": "fast",    "litellm_params": {"model": "claude-haiku-4-5"}},
    {"model_name": "smart",   "litellm_params": {"model": "claude-sonnet-4-6"}},
    {"model_name": "premium", "litellm_params": {"model": "claude-opus-4-6"}},
])

# Pick tier based on task complexity
tier = "fast" if simple_task else "smart"
response = router.completion(model=tier, messages=messages)
print(response.choices[0].message.content)

Prompt Versioning

import hashlib

# Version-pinned prompt registry — pin versions to prevent silent regressions
PROMPTS = {
    "summarize:v1": "Summarize in {max_words} words:\n{text}",
    "summarize:v2": "Create a {max_words}-word summary focusing on key decisions:\n{text}",
}

def run_prompt(key: str, **kwargs) -> str:
    template = PROMPTS[key]
    hash_id = hashlib.sha256(template.encode()).hexdigest()[:8]
    # Log key + hash for reproducibility and A/B analysis
    print(f"[prompt] key={key} hash={hash_id}")
    return llm_call([{"role": "user", "content": template.format(**kwargs)}])
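
Usage, assuming the llm_call helper from the LiteLLM section above (document_text is a placeholder):

print(run_prompt("summarize:v2", max_words=50, text=document_text))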

Multi-Agent Orchestration

Orchestrator pattern (Python)

import anthropic, asyncio, json

client = anthropic.Anthropic()

AGENTS = {
    "planner":     "Break this task into subtasks. Output JSON: {\"research_tasks\": [], \"code_tasks\": []}",
    "researcher":  "Research the provided topics. Be concise and factual.",
    "coder":       "Write clean, tested Python code for the provided specs.",
    "synthesizer": "Combine these results into a final cohesive answer.",
}

def call_agent(role: str, content: str, model: str = "claude-sonnet-4-6") -> str:
    resp = client.messages.create(
        model=model,
        max_tokens=2048,
        system=AGENTS[role],
        messages=[{"role": "user", "content": content}],
    )
    return resp.content[0].text

async def orchestrate(task: str) -> str:
    """Hub-and-spoke orchestrator: plan → parallel execute → synthesize."""
    # assumes the planner returns bare JSON; strip code fences before parsing in production
    plan = json.loads(call_agent("planner", task))

    research, code = await asyncio.gather(
        asyncio.to_thread(call_agent, "researcher", str(plan.get("research_tasks", []))),
        asyncio.to_thread(call_agent, "coder",      str(plan.get("code_tasks", []))),
    )
    return call_agent("synthesizer", f"Research:\n{research}\n\nCode:\n{code}")
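
Usage (the task string is illustrative):

result = asyncio.run(orchestrate("Research rate-limiting strategies and prototype one in Python"))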

Agent Memory Patterns

Medium-term: SQLite (cross-session)

import sqlite3, json

conn = sqlite3.connect("agent_memory.db")
conn.execute("CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT, updated_at TEXT)")

def remember(key: str, value: dict):
    conn.execute("INSERT OR REPLACE INTO memory VALUES (?, ?, datetime('now'))", [key, json.dumps(value)])
    conn.commit()

def recall(key: str) -> dict | None:
    row = conn.execute("SELECT value FROM memory WHERE key=?", [key]).fetchone()
    return json.loads(row[0]) if row else None
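
A quick round trip (the key name is illustrative):

remember("user:42:prefs", {"tone": "terse", "language": "python"})
print(recall("user:42:prefs"))   # {'tone': 'terse', 'language': 'python'}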

Long-term: Vector DB (semantic search / RAG)

from qdrant_client import QdrantClient
import anthropic

qdrant = QdrantClient(":memory:")
claude = anthropic.Anthropic()

def rag_query(query: str, context_collection: str = "memory") -> str:
    hits = qdrant.search(collection_name=context_collection,
                         query_vector=get_embedding(query), limit=5)
    context = "\n".join(h.payload["text"] for h in hits)
    response = claude.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=f"Answer using this context:\n{context}",
        messages=[{"role": "user", "content": query}],
    )
    return response.content[0].text
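
The example assumes a get_embedding helper and a pre-populated collection. A minimal sketch using sentence-transformers (an assumption; swap in any embedding provider):

from sentence_transformers import SentenceTransformer

_embedder = SentenceTransformer("all-MiniLM-L6-v2")   # small local model, 384-dim vectors

def get_embedding(text: str) -> list[float]:
    """Embed text for search; must match the model used when indexing the collection."""
    return _embedder.encode(text).tolist()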

LLM Evaluation

Basic eval harness

import anthropic

def run_eval(cases: list[tuple[str, str]], system_prompt: str) -> dict:
    """cases: list of (input, expected_output) tuples."""
    client = anthropic.Anthropic()
    results = {"pass": 0, "fail": 0, "score": 0.0, "cases": []}
    for inp, expected in cases:
        actual = client.messages.create(
            model="claude-haiku-4-5", max_tokens=512,  # use cheap model for evals
            system=system_prompt,
            messages=[{"role": "user", "content": inp}],
        ).content[0].text.strip()
        passed = actual == expected  # exact match; swap in fuzzy or semantic matching for free-form outputs
        results["pass" if passed else "fail"] += 1
        results["cases"].append({"input": inp, "actual": actual, "pass": passed})
    results["score"] = results["pass"] / len(cases)
    return results
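
Usage with toy cases (illustrative only):

cases = [("2 + 2 = ?", "4"), ("Capital of France?", "Paris")]
report = run_eval(cases, system_prompt="Answer with the bare answer only, no punctuation.")
print(f"score: {report['score']:.0%}")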

LLM-as-judge

import anthropic, json

client = anthropic.Anthropic()

def llm_judge(question: str, answer: str, rubric: str) -> dict:
    response = client.messages.create(
        model="claude-haiku-4-5",
        messages=[{"role": "user", "content": f"""Rate this answer 1-5.

Question: {question}
Answer: {answer}
Rubric: {rubric}

Output JSON: {{"score": int, "reasoning": str}}"""}],
        max_tokens=256,
    )
    return json.loads(response.content[0].text)  # assumes bare JSON; validate or strip fences in production

Production Deployment

Error handling and retries

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ maxRetries: 3, timeout: 60_000 });

try {
  const response = await client.messages.create({ /* ... */ });
} catch (err) {
  if (err instanceof Anthropic.RateLimitError) {
    const retryAfter = Number(err.headers?.["retry-after"] ?? 30) * 1000;
    await new Promise((r) => setTimeout(r, retryAfter));
  } else if (err instanceof Anthropic.APIConnectionError) {
    // network issue — SDK will auto-retry up to maxRetries
  }
}
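
The Python SDK exposes the same retry knobs; a minimal sketch:

import anthropic

client = anthropic.Anthropic(max_retries=3, timeout=60.0)

try:
    resp = client.messages.create(
        model="claude-sonnet-4-6", max_tokens=256,
        messages=[{"role": "user", "content": "ping"}],
    )
except anthropic.RateLimitError:
    pass  # back off, retry, or route to a fallback model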

Cost tracking

import anthropic

def track_cost(response: anthropic.types.Message) -> float:
    PRICES = {
        "claude-opus-4-6":    (0.015, 0.075),   # (input, output) per 1k tokens
        "claude-sonnet-4-6":  (0.003, 0.015),
        "claude-haiku-4-5":   (0.00025, 0.00125),
    }
    model = response.model
    if model not in PRICES:
        return 0.0
    in_cost  = response.usage.input_tokens  / 1000 * PRICES[model][0]
    out_cost = response.usage.output_tokens / 1000 * PRICES[model][1]
    return in_cost + out_cost
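
Usage, assuming client = anthropic.Anthropic():

resp = client.messages.create(model="claude-sonnet-4-6", max_tokens=256,
                              messages=[{"role": "user", "content": "Summarize the release notes."}])
print(f"call cost: ${track_cost(resp):.5f}")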

Prompt Engineering

  • Chain-of-thought: Prefix prompts with "Think step by step:" or enumerate reasoning steps before the answer.
  • Output format pinning: Specify format in system prompt AND show a concrete example. Never rely on defaults for structured data.
  • Temperature: 0 = deterministic (evals, extraction) | 0.3-0.7 = balanced | 1.0 = creative/diverse.
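
For example, pinning temperature for a deterministic extraction call (a sketch using the Python SDK):

import anthropic

client = anthropic.Anthropic()

resp = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=256,
    temperature=0,   # deterministic output for extraction and evals
    messages=[{"role": "user", "content": "Extract all dates from: ..."}],
)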

Related Skills

  • temporal-testing — test async agent workflows
  • browser-automation — give agents web browsing capability
  • frontend-design — build AI-powered Next.js UIs
  • data-analysis-report — agent-driven data analysis pipelines
  • llm-observability — trace and monitor LLM calls in production
