Save up to 50% on model tokens: OpenAI GPT, Claude, Gemini, Qwen, Deepseek, Grok, and more with a single key

Unified LLM Gateway - One API for 70+ AI models. Route to GPT, Claude, Gemini, Qwen, Deepseek, Grok and more with a single API key.

MIT-0 · Free to use, modify, and redistribute. No attribution required.

⭐ 5 · 1.2k · 2 current installs · 2 all-time installs

by @0xjordansg-yolo

duplicate of @AIsaDocs/openclaw-aisa-llm-router


Security Scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

✓

Purpose & Capability

The name/description, SKILL.md examples, README, and the included Python client all consistently implement a unified router to api.aisa.one and require a single AISA_API_KEY. Requested binaries (python3, curl) and the single env var are appropriate for this functionality.

ℹ

Instruction Scope

Runtime instructions and the client send user messages, images (URL or base64), and other payloads to https://api.aisa.one/v1. The instructions do not ask for unrelated files or other environment variables, but they do direct potentially sensitive data (prompts, images, function arguments) to an external service — a privacy/data-exposure consideration rather than a technical incoherence.

✓

Install Mechanism

There is no install spec (instruction-only with an included client script). Nothing in the package downloads or executes third-party code beyond the provided Python script, which uses standard library urllib. Low install risk.

✓

Credentials

Only a single credential (AISA_API_KEY) is required and is clearly the primary credential for the gateway — proportional to the described purpose. Note that this one key likely grants broad access and billing to many backend models, so leakage would be impactful.

✓

Persistence & Privilege

The skill does not request always:true, system-wide config changes, or other skills' credentials. It behaves as a normal, user-invocable skill and does not ask for elevated or persistent platform privileges.

Scan Findings in Context

[no_findings_detected] expected: Static pre-scan reported no injection signals. For an instruction-only skill with a provided client script that performs straightforward HTTP calls, the absence of findings is expected.

Assessment

This skill legitimately implements a unified LLM gateway and only needs a single AISA_API_KEY, but keep in mind: using it will send your prompts, any attached images (including base64 data), and function payloads to api.aisa.one (a third party). Do not send passwords, private keys, or highly sensitive PII through this gateway unless you trust the provider and have reviewed their data-retention and privacy policies. Use a dedicated API key, monitor billing/usage, rotate the key if leaked, and test with non-sensitive data first. If you need stronger guarantees (on-prem or contractual data handling), verify the provider (openclaw.ai / marketplace.aisa.one) and their security/privacy documentation before production use.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.1


latest: vk979z0azwapm5x5bx8420nryr180my07

License

MIT-0

Free to use, modify, and redistribute. No attribution required.

Terms: https://spdx.org/licenses/MIT-0.html

Runtime requirements

🧠 Clawdis

Bins: curl, python3

Env: AISA_API_KEY

Primary env: AISA_API_KEY

SKILL.md

OpenClaw LLM Router 🧠

Unified LLM Gateway for autonomous agents. Powered by AIsa.

One API key. 70+ models. OpenAI-compatible.

Replace 100+ API keys with one. Access GPT-4, Claude-3, Gemini, Qwen, Deepseek, Grok, and more through a unified, OpenAI-compatible API.

🔥 What Can You Do?

Multi-Model Chat

"Chat with GPT-4 for reasoning, switch to Claude for creative writing"

Model Comparison

"Compare responses from GPT-4, Claude, and Gemini for the same question"

Vision Analysis

"Analyze this image with GPT-4o - what objects are in it?"

Cost Optimization

"Route simple queries to fast/cheap models, complex queries to GPT-4"

Fallback Strategy

"If GPT-4 fails, automatically try Claude, then Gemini"

Why LLM Router?

Feature	LLM Router	Direct APIs
API Keys	1	10+
SDK Compatibility	OpenAI SDK	Multiple SDKs
Billing	Unified	Per-provider
Model Switching	Change string	Code rewrite
Fallback Routing	Built-in	DIY
Cost Tracking	Unified	Fragmented

Supported Model Families

Family	Developer	Example Models
GPT	OpenAI	gpt-4.1, gpt-4o, gpt-4o-mini, o1, o1-mini, o3-mini
Claude	Anthropic	claude-3-5-sonnet, claude-3-opus, claude-3-sonnet
Gemini	Google	gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash
Qwen	Alibaba	qwen-max, qwen-plus, qwen2.5-72b-instruct
Deepseek	Deepseek	deepseek-chat, deepseek-coder, deepseek-v3, deepseek-r1
Grok	xAI	grok-2, grok-beta

Note: Model availability may vary. Check marketplace.aisa.one/pricing for the full list of currently available models and pricing.

Quick Start

export AISA_API_KEY="your-key"
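A minimal sketch of reading that key from Python and failing early with a clear message when it is missing. The helper name `load_api_key` is hypothetical and is not part of the bundled client.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return AISA_API_KEY from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("AISA_API_KEY", "").strip()
    if not key:
        raise RuntimeError('AISA_API_KEY is not set; run: export AISA_API_KEY="your-key"')
    return key
```

Passing a mapping instead of relying on `os.environ` keeps the helper easy to test without touching the real environment.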

API Endpoints

OpenAI-Compatible Chat Completions

POST https://api.aisa.one/v1/chat/completions

Request

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'

Parameters

Parameter	Type	Required	Description
`model`	string	Yes	Model identifier (e.g., `gpt-4.1`, `claude-3-sonnet`)
`messages`	array	Yes	Conversation messages
`temperature`	number	No	Randomness (0-2, default: 1)
`max_tokens`	integer	No	Maximum response tokens
`stream`	boolean	No	Enable streaming (default: false)
`top_p`	number	No	Nucleus sampling (0-1)
`frequency_penalty`	number	No	Frequency penalty (-2 to 2)
`presence_penalty`	number	No	Presence penalty (-2 to 2)
`stop`	string/array	No	Stop sequences
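The ranges in the table can be checked on the client before a request is sent. The sketch below is one way to do that; `check_params` is a hypothetical helper, and whether the gateway itself rejects out-of-range values is not stated here.

```python
from typing import Any, Dict, List

# Numeric ranges copied from the parameter table above.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def check_params(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload looks fine."""
    problems = []
    for required in ("model", "messages"):
        if not payload.get(required):
            problems.append(f"missing required field: {required}")
    for name, (low, high) in RANGES.items():
        value = payload.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} is outside {low}..{high}")
    max_tokens = payload.get("max_tokens")
    if max_tokens is not None and (not isinstance(max_tokens, int) or max_tokens <= 0):
        problems.append("max_tokens must be a positive integer")
    return problems
```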

Message Format

{
  "role": "user|assistant|system",
  "content": "message text or array for multimodal"
}

Response

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gpt-4.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 200,
    "total_tokens": 250,
    "cost": 0.0025
  }
}
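The same request and response can be handled from Python with only the standard library. This is a sketch under the request and response shapes shown above, not the bundled client; the function names `build_chat_request`, `send`, and `extract_reply` are illustrative.

```python
import json
import urllib.request
from typing import Any, Dict, List

API_URL = "https://api.aisa.one/v1/chat/completions"


def build_chat_request(api_key: str, model: str, messages: List[Dict[str, Any]],
                       **options: Any) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> Dict[str, Any]:
    """Send the request and parse the JSON body (this performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def extract_reply(response: Dict[str, Any]) -> str:
    """Pull the assistant text out of a parsed chat.completion response."""
    return response["choices"][0]["message"]["content"]
```

Splitting request construction from sending makes it possible to inspect or log the payload before any data leaves the machine.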

Streaming Response

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet",
    "messages": [{"role": "user", "content": "Write a poem about AI."}],
    "stream": true
  }'

Streaming returns Server-Sent Events (SSE):

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"In"}}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" circuits"}}]}
...
data: [DONE]
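A sketch of turning those SSE lines into text fragments, assuming each `data:` line carries a JSON chunk shaped like the sample above and that `[DONE]` ends the stream. The name `iter_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines shaped like the sample above."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Any iterable of decoded lines works as input, for example lines read from an HTTP response and decoded as UTF-8.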

Vision / Image Analysis

Analyze images by passing image URLs or base64 data:

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
      }
    ]
  }'
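The request above uses an image URL. For base64 data, a common convention with OpenAI-compatible APIs is to place a `data:` URL in the same `image_url` field; that this gateway accepts the form is an assumption here. The helper names below are hypothetical.

```python
import base64
from typing import Any, Dict, List


def image_part_from_bytes(data: bytes, mime_type: str = "image/jpeg") -> Dict[str, Any]:
    """Wrap raw image bytes as an image_url content part using a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}}


def vision_messages(prompt: str, image_part: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build the messages array for a single-image question."""
    return [{"role": "user", "content": [{"type": "text", "text": prompt}, image_part]}]
```

Base64 inflates the payload by roughly a third, so a hosted URL is usually the lighter option when the image is already reachable online.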

Function Calling

Enable tools/functions for structured outputs:

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "functions": [
      {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
          },
          "required": ["location"]
        }
      }
    ],
    "function_call": "auto"
  }'
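When the model decides to call a function, the client has to run it and send the result back. The sketch below assumes the response follows the OpenAI-style shape in which `message.function_call` holds a `name` and a JSON-encoded `arguments` string; the gateway's exact response is not shown in this document. `get_weather` is a placeholder implementation.

```python
import json
from typing import Any, Callable, Dict, Optional


def get_weather(location: str, unit: str = "celsius") -> Dict[str, Any]:
    """Placeholder; a real implementation would query a weather service."""
    return {"location": location, "unit": unit, "forecast": "unknown"}


LOCAL_FUNCTIONS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_function_call(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Run the requested local function, or return None for a plain text reply."""
    message = response["choices"][0]["message"]
    call = message.get("function_call")
    if not call:
        return None
    func = LOCAL_FUNCTIONS[call["name"]]  # KeyError if the model names an unknown function
    arguments = json.loads(call["arguments"] or "{}")
    result = func(**arguments)
    # Follow-up message that hands the result back to the model.
    return {"role": "function", "name": call["name"], "content": json.dumps(result)}
```

Append the returned message to the conversation and send a second request so the model can phrase the final answer.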

Google Gemini Format

For Gemini models, you can also use the native format:

POST https://api.aisa.one/v1/models/{model}:generateContent

curl -X POST "https://api.aisa.one/v1/models/gemini-2.0-flash:generateContent" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Explain machine learning."}]
      }
    ],
    "generationConfig": {
      "temperature": 0.7,
      "maxOutputTokens": 1000
    }
  }'
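If a conversation is already stored as OpenAI-style messages, it can be converted to the native body shown above. This sketch handles text-only `user` and `assistant` turns and maps `assistant` to `model`, the role name Gemini's native API uses; `to_gemini_payload` is a hypothetical helper, and system prompts are deliberately rejected rather than guessed at.

```python
from typing import Any, Dict, List


def to_gemini_payload(messages: List[Dict[str, str]], temperature: float = 0.7,
                      max_output_tokens: int = 1000) -> Dict[str, Any]:
    """Convert OpenAI-style text messages into a generateContent request body."""
    role_map = {"user": "user", "assistant": "model"}
    contents = []
    for message in messages:
        role = message["role"]
        if role not in role_map:
            raise ValueError(f"unsupported role for this sketch: {role}")
        contents.append({"role": role_map[role], "parts": [{"text": message["content"]}]})
    return {
        "contents": contents,
        "generationConfig": {"temperature": temperature, "maxOutputTokens": max_output_tokens},
    }
```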

Python Client

Installation

No installation required - uses standard library only.

CLI Usage

# Basic completion
python3 {baseDir}/scripts/llm_router_client.py chat --model gpt-4.1 --message "Hello, world!"

# With system prompt
python3 {baseDir}/scripts/llm_router_client.py chat --model claude-3-sonnet --system "You are a poet" --message "Write about the moon"

# Streaming
python3 {baseDir}/scripts/llm_router_client.py chat --model gpt-4o --message "Tell me a story" --stream

# Multi-turn conversation
python3 {baseDir}/scripts/llm_router_client.py chat --model qwen-max --messages '[{"role":"user","content":"Hi"},{"role":"assistant","content":"Hello!"},{"role":"user","content":"How are you?"}]'

# Vision analysis
python3 {baseDir}/scripts/llm_router_client.py vision --model gpt-4o --image "https://example.com/image.jpg" --prompt "Describe this image"

# List supported models
python3 {baseDir}/scripts/llm_router_client.py models

# Compare models
python3 {baseDir}/scripts/llm_router_client.py compare --models "gpt-4.1,claude-3-sonnet,gemini-2.0-flash" --message "What is 2+2?"

Python SDK Usage

from llm_router_client import LLMRouterClient

client = LLMRouterClient()  # Uses AISA_API_KEY env var

# Simple chat
response = client.chat(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response["choices"][0]["message"]["content"])

# With options
response = client.chat(
    model="claude-3-sonnet",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain relativity."}
    ],
    temperature=0.7,
    max_tokens=500
)

# Streaming
for chunk in client.chat_stream(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story."}]
):
    print(chunk, end="", flush=True)

# Vision
response = client.vision(
    model="gpt-4o",
    image_url="https://example.com/image.jpg",
    prompt="What's in this image?"
)

# Compare models
results = client.compare_models(
    models=["gpt-4.1", "claude-3-sonnet", "gemini-2.0-flash"],
    message="Explain quantum computing"
)
for model, result in results.items():
    print(f"{model}: {result['response'][:100]}...")

Use Cases

1. Cost-Optimized Routing

Use cheaper models for simple tasks:

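def smart_route(message: str) -> str:
    # Short prompts -> fast/cheap model (length is only a rough proxy for complexity)
    if len(message) < 50:
        model = "gpt-4o-mini"
    # Longer prompts -> more capable model
    else:
        model = "gpt-4.1"

    # client is the LLMRouterClient instance from the SDK example above
    response = client.chat(model=model, messages=[{"role": "user", "content": message}])
    return response["choices"][0]["message"]["content"]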

2. Fallback Strategy

Automatic fallback on failure:

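def chat_with_fallback(message: str) -> str:
    models = ["gpt-4.1", "claude-3-sonnet", "gemini-2.0-flash"]
    last_error = None

    for model in models:
        try:
            response = client.chat(model=model, messages=[{"role": "user", "content": message}])
            return response["choices"][0]["message"]["content"]
        except Exception as error:
            # Remember why this model failed, then try the next one
            last_error = error

    raise RuntimeError("All models failed") from last_error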

3. Model A/B Testing

Compare model outputs:

results = client.compare_models(
    models=["gpt-4.1", "claude-3-opus"],
    message="Analyze this quarterly report..."
)

# Log for analysis (log_response is your own logging function, not part of the client)
for model, result in results.items():
    log_response(model=model, latency=result["latency"], cost=result["cost"])

4. Specialized Model Selection

Choose the best model for each task:

MODEL_MAP = {
    "code": "deepseek-coder",
    "creative": "claude-3-opus",
    "fast": "gpt-4o-mini",
    "vision": "gpt-4o",
    "chinese": "qwen-max",
    "reasoning": "gpt-4.1"
}

def route_by_task(task_type: str, message: str) -> str:
    # Unknown task types fall back to the general-purpose default
    model = MODEL_MAP.get(task_type, "gpt-4.1")
    response = client.chat(model=model, messages=[{"role": "user", "content": message}])
    return response["choices"][0]["message"]["content"]

Error Handling

Errors return JSON with an error field:

{
  "error": {
    "code": "model_not_found",
    "message": "Model 'xyz' is not available"
  }
}

Common HTTP status codes:

401 - Invalid or missing API key
402 - Insufficient credits
404 - Model not found
429 - Rate limit exceeded
500 - Server error
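One way to act on these statuses is to retry only the transient ones (429 and 500) with exponential backoff and surface the rest immediately. The sketch below is an illustration, not the bundled client's behavior; `describe_error` and `call_with_retry` are hypothetical names, and `call` stands for any function that returns a status and body.

```python
import json
import time
from typing import Callable, Tuple

RETRYABLE_STATUS = {429, 500}  # rate limit and server error, from the list above


def describe_error(status: int, body: str) -> str:
    """Format an error body shaped like the JSON example above."""
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        error = {}
    code = error.get("code", "unknown")
    message = error.get("message", body.strip() or "no details")
    return f"HTTP {status} [{code}]: {message}"


def call_with_retry(call: Callable[[], Tuple[int, str]], attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Run call() until it returns 200, retrying only retryable statuses."""
    for attempt in range(attempts):
        status, body = call()
        if status == 200:
            return body
        if status not in RETRYABLE_STATUS or attempt == attempts - 1:
            raise RuntimeError(describe_error(status, body))
        sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ... between tries
    raise RuntimeError("attempts must be at least 1")
```

Errors such as 401 and 402 are not retried because repeating the request cannot fix a bad key or an empty balance.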

Best Practices

Use streaming for long responses to improve UX
Set max_tokens to control costs
Implement fallback for production reliability
Cache responses for repeated queries
Monitor usage via response metadata
Use appropriate models - don't use GPT-4 for simple tasks
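The caching advice can be sketched as a small in-memory wrapper keyed by a hash of the model name and messages. `ResponseCache` is a hypothetical class, not part of the bundled client, and caching mainly makes sense when identical prompts should yield identical answers (for example, with a low temperature).

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

Messages = List[Dict[str, Any]]


class ResponseCache:
    """In-memory cache keyed by a hash of the model name and messages."""

    def __init__(self, send: Callable[[str, Messages], Dict[str, Any]]):
        self._send = send
        self._store: Dict[str, Dict[str, Any]] = {}

    @staticmethod
    def _key(model: str, messages: Messages) -> str:
        canonical = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def chat(self, model: str, messages: Messages) -> Dict[str, Any]:
        key = self._key(model, messages)
        if key not in self._store:  # only call the API on a cache miss
            self._store[key] = self._send(model, messages)
        return self._store[key]
```

With the Python client from this document, it could be wired up as `ResponseCache(lambda model, messages: client.chat(model=model, messages=messages))`.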

OpenAI SDK Compatibility

Just change the base URL and key:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["AISA_API_KEY"],
    base_url="https://api.aisa.one/v1"
)

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Pricing

Token-based pricing varies by model. Check marketplace.aisa.one/pricing for current rates.

Model Family	Approximate Cost
GPT-4.1 / GPT-4o	~$0.01 / 1K tokens
Claude-3-Sonnet	~$0.01 / 1K tokens
Gemini-2.0-Flash	~$0.001 / 1K tokens
Qwen-Max	~$0.005 / 1K tokens
DeepSeek-V3	~$0.002 / 1K tokens

Every response includes usage.cost and usage.credits_remaining.
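For a rough budget check before looking at the billed amount, the approximate rates in the table can be applied to a response's token count. This sketch assumes a flat per-token rate per model, which is a simplification; actual billing may differ, so treat `usage.cost` as the authoritative figure. `estimate_cost` is a hypothetical helper.

```python
from typing import Any, Dict

# Approximate USD per 1K tokens, copied from the table above.
PRICE_PER_1K = {
    "gpt-4.1": 0.01,
    "gpt-4o": 0.01,
    "claude-3-sonnet": 0.01,
    "gemini-2.0-flash": 0.001,
    "qwen-max": 0.005,
    "deepseek-v3": 0.002,
}


def estimate_cost(model: str, usage: Dict[str, Any]) -> float:
    """Estimate the USD cost of one response from its usage block."""
    return usage["total_tokens"] / 1000 * PRICE_PER_1K[model]
```

For example, 250 total tokens on `gpt-4.1` comes to about $0.0025 at the listed rate.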

Get Started

Sign up at aisa.one
Get your API key from the dashboard
Add credits (pay-as-you-go)
Set environment variable: export AISA_API_KEY="your-key"

Full API Reference

See API Reference for complete endpoint documentation.
