gemini

API key required
Data & APIs

Call the Gemini API (gemini-2.5-flash, gemini-2.5-pro, gemini-3-flash-preview, gemini-3-pro-preview, gemini-3.1-pro-preview) through RunAPI using the official OpenAI SDK or the native Google Generative AI SDK. Use when the user asks for Gemini chat, streaming completions, multimodal vision input, Google Search grounding, structured output, reasoning effort, or to point an existing OpenAI-compatible or Gemini-native client at RunAPI as the base URL.

Install

openclaw skills install runapi-gemini

Gemini on RunAPI

Gemini on RunAPI exposes two protocols:

| Protocol | Endpoint | Use when |
| --- | --- | --- |
| OpenAI-compatible | POST /v1beta/openai/chat/completions | You already use the OpenAI SDK or any OpenAI client |
| Native Gemini | POST /v1beta/models/<model>:streamGenerateContent | You use Google's @google/generative-ai SDK (currently gemini-3-flash-preview only) |

Both accept the same RunAPI API Key.

Setup

RUNAPI_TOKEN=YOUR_RUNAPI_TOKEN

Get a RunAPI API Key at https://runapi.ai/api_keys.
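
A minimal sketch of reading the key from the environment instead of hardcoding it, assuming it is exported as RUNAPI_TOKEN as shown above. The helper name load_runapi_token is hypothetical and not part of any RunAPI SDK.

```python
import os


def load_runapi_token(env=None):
    """Return the RunAPI key from RUNAPI_TOKEN, or raise if it is unset or blank.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    token = (env.get("RUNAPI_TOKEN") or "").strip()
    if not token:
        raise RuntimeError(
            "RUNAPI_TOKEN is not set; create a key at https://runapi.ai/api_keys"
        )
    return token
```

The returned value can then be passed as api_key when constructing a client, in place of the literal placeholder.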

OpenAI-compatible setup

Python:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RUNAPI_TOKEN",
    base_url="https://runapi.ai/v1beta/openai",
)

JavaScript:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_RUNAPI_TOKEN",
  baseURL: "https://runapi.ai/v1beta/openai",
});

Native Gemini setup

export GOOGLE_API_KEY=YOUR_RUNAPI_TOKEN
export GOOGLE_GENAI_BASE_URL=https://runapi.ai

Core recipe — OpenAI-compatible

Python:

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Explain quantum computing simply."}],
)
print(response.choices[0].message.content)
print(response.usage)

JavaScript:

const response = await client.chat.completions.create({
  model: "gemini-2.5-flash",
  messages: [{ role: "user", content: "Explain quantum computing simply." }],
});
console.log(response.choices[0].message.content);

curl:

curl -X POST "https://runapi.ai/v1beta/openai/chat/completions" \
  -H "x-api-key: YOUR_RUNAPI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Explain quantum computing simply."}]
  }'

Core recipe — native Gemini

curl -X POST \
  "https://runapi.ai/v1beta/models/gemini-3-flash-preview:streamGenerateContent" \
  -H "x-goog-api-key: YOUR_RUNAPI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      { "role": "user", "parts": [{ "text": "Hello!" }] }
    ]
  }'

The native protocol returns SSE chunks in Google's streamGenerateContent format — use the official @google/generative-ai SDK or Google's google-genai Python package to consume it.
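
If an SDK is not available, the stream can also be read by hand. The sketch below assumes each event arrives as a single `data: <json>` line and that each payload follows Google's GenerateContentResponse shape (candidates, content, parts, text). The function name iter_sse_text is hypothetical, and the handling of a `[DONE]` sentinel is a defensive assumption rather than documented RunAPI behavior.

```python
import json


def iter_sse_text(lines):
    """Yield text fragments from an iterable of decoded SSE lines.

    Assumes one `data: <json>` line per event, with the JSON shaped like
    {"candidates": [{"content": {"parts": [{"text": "..."}]}}]}.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue  # assumed sentinel; ignored if it ever appears
        chunk = json.loads(payload)
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                text = part.get("text")
                if text:
                    yield text
```

Feeding it the decoded lines of the HTTP response body and joining the yielded fragments reconstructs the full reply.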

Streaming (OpenAI-compatible)

stream = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Write a haiku about coding."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Streaming runs through a regional edge proxy so the request does not hold a Rails/Puma thread. Long generations should always stream.

Vision / multimodal

{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What is in this image?" },
        { "type": "image_url", "image_url": { "url": "https://example.com/img.jpg" } }
      ]
    }
  ]
}

This is the standard OpenAI multimodal content block, used with the OpenAI-compatible endpoint. For the native endpoint, embed image data as parts[].inlineData or parts[].fileData.
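
For the native endpoint, a request body with an inline image could be assembled as sketched below. The field names inlineData, mimeType, and data follow Google's REST format for Gemini; the helper names inline_image_part and native_vision_body are hypothetical, and whether RunAPI imposes its own size limit on inline data is not covered here.

```python
import base64


def inline_image_part(image_bytes, mime_type="image/jpeg"):
    """Wrap raw image bytes as a native Gemini `inlineData` part (base64-encoded)."""
    return {
        "inlineData": {
            "mimeType": mime_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        }
    }


def native_vision_body(prompt, image_bytes, mime_type="image/jpeg"):
    """Build a native request body with one text part and one image part."""
    return {
        "contents": [
            {
                "role": "user",
                "parts": [{"text": prompt}, inline_image_part(image_bytes, mime_type)],
            }
        ]
    }
```

The resulting dict can be serialized with json.dumps and sent as the body of the native :streamGenerateContent request.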

Google Search grounding

{
  "model": "gemini-2.5-pro",
  "messages": [
    { "role": "user", "content": "Latest news on Gemini 3." }
  ],
  "tools": [
    { "type": "function", "function": { "name": "googleSearch" } }
  ]
}

Google Search grounding is available on gemini-2.5-flash, gemini-2.5-pro, gemini-3.1-pro-preview, and gemini-3-pro-preview.

Structured output

{
  "model": "gemini-2.5-flash",
  "messages": [{ "role": "user", "content": "Give me one person object." }],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "person",
      "schema": {
        "type": "object",
        "properties": { "name": { "type": "string" }, "age": { "type": "integer" } },
        "required": ["name", "age"]
      }
    }
  }
}
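
On the client side, the reply still has to be parsed and checked. The sketch below assumes the structured reply arrives as a JSON string in the message content (as with `response.choices[0].message.content` in the OpenAI SDK). The helper name parse_person is hypothetical and mirrors the `person` schema above.

```python
import json


def parse_person(content):
    """Parse a model reply that should match the `person` schema and validate it."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name, age = data.get("name"), data.get("age")
    if not isinstance(name, str):
        raise ValueError("name must be a string")
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(age, bool) or not isinstance(age, int):
        raise ValueError("age must be an integer")
    return {"name": name, "age": age}
```

Validating on the client keeps downstream code safe even if a reply deviates from the requested schema.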

Reasoning effort

Supported on gemini-2.5-pro, gemini-3.1-pro-preview, gemini-3-pro-preview, and gemini-3-flash-preview — pass reasoning_effort: "low" | "medium" | "high".
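
One way to avoid sending reasoning_effort to a model that does not list support for it is to build the request arguments conditionally. This is a sketch: the model set is copied from the sentence above, the helper name chat_kwargs is hypothetical, and silently dropping the parameter for other models is a design choice rather than documented RunAPI behavior.

```python
# Models listed above as supporting reasoning_effort.
REASONING_MODELS = {
    "gemini-2.5-pro",
    "gemini-3.1-pro-preview",
    "gemini-3-pro-preview",
    "gemini-3-flash-preview",
}
VALID_EFFORTS = {"low", "medium", "high"}


def chat_kwargs(model, messages, reasoning_effort=None):
    """Build chat-completion keyword arguments, adding reasoning_effort only where listed."""
    kwargs = {"model": model, "messages": messages}
    if reasoning_effort is not None:
        if reasoning_effort not in VALID_EFFORTS:
            raise ValueError(f"reasoning_effort must be one of {sorted(VALID_EFFORTS)}")
        if model in REASONING_MODELS:
            kwargs["reasoning_effort"] = reasoning_effort
    return kwargs
```

With the OpenAI SDK the result would be used as `client.chat.completions.create(**chat_kwargs("gemini-2.5-pro", messages, "high"))`.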

List models

curl https://runapi.ai/v1beta/models -H "x-api-key: YOUR_RUNAPI_TOKEN"

Or via the OpenAI-style path:

curl https://runapi.ai/v1beta/openai/models \
  -H "Authorization: Bearer YOUR_RUNAPI_TOKEN"

Supported models

| Model ID | OpenAI endpoint | Native endpoint | Capabilities |
| --- | --- | --- | --- |
| gemini-2.5-flash | yes | no | Chat, multimodal, Google Search, structured output, thoughts |
| gemini-2.5-pro | yes | no | + reasoning effort |
| gemini-3.1-pro-preview | yes | no | + reasoning effort |
| gemini-3-pro-preview | yes | no | + reasoning effort |
| gemini-3-flash-preview | yes | :streamGenerateContent | Chat, multimodal, function calling, structured output, reasoning effort |

gemini-flash-latest resolves to gemini-3-flash-preview.

Connect Gemini CLI itself

export GOOGLE_API_KEY=YOUR_RUNAPI_TOKEN
export GOOGLE_GENAI_BASE_URL=https://runapi.ai
gemini

Agent rules

  • The native :streamGenerateContent path is currently only wired for gemini-3-flash-preview — use the OpenAI-compatible endpoint for every other Gemini model.
  • Use streaming for any response longer than a few hundred tokens. Do not hold the agent on a long blocking request.
  • Google Search grounding uses a googleSearch function tool.
  • Pricing, rate limits, quotas — link to https://runapi.ai/models/gemini.md, not this skill file.
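
The first rule above can be sketched as a small routing helper. The names endpoint_for, NATIVE_MODELS, and ALIASES are hypothetical. The native model set and the gemini-flash-latest alias come from this file, while resolving the alias on the client before choosing a path is an assumption.

```python
BASE_URL = "https://runapi.ai"

# Only this model is currently wired to the native path, per the rule above.
NATIVE_MODELS = {"gemini-3-flash-preview"}

# Alias stated in this file; client-side resolution is an assumption.
ALIASES = {"gemini-flash-latest": "gemini-3-flash-preview"}


def endpoint_for(model, prefer_native=False):
    """Return the RunAPI URL to call for `model`.

    The native URL is returned only when it is requested and the model is
    wired for it; everything else goes to the OpenAI-compatible endpoint.
    """
    resolved = ALIASES.get(model, model)
    if prefer_native and resolved in NATIVE_MODELS:
        return f"{BASE_URL}/v1beta/models/{resolved}:streamGenerateContent"
    return f"{BASE_URL}/v1beta/openai/chat/completions"
```

Defaulting to the OpenAI-compatible endpoint keeps the helper safe for every model listed in this file.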

Routing