---
name: gemini
description: Call the Gemini API (gemini-2.5-flash, gemini-2.5-pro, gemini-3-flash-preview, gemini-3-pro-preview, gemini-3.1-pro-preview) through RunAPI using the official OpenAI SDK or the native Google Generative AI SDK. Use when the user asks for Gemini chat, streaming completions, multimodal vision input, Google Search grounding, structured output, reasoning effort, or to point an existing OpenAI-compatible or Gemini-native client at RunAPI as the base URL.
documentation: https://runapi.ai/models/gemini.md
provider_page: https://runapi.ai/providers/google.md
catalog: https://runapi.ai/models.md
metadata:
  openclaw:
    homepage: https://runapi.ai/models/gemini
    primaryEnv: RUNAPI_TOKEN
    requires:
      env:
      - RUNAPI_TOKEN
    envVars:
    - name: RUNAPI_TOKEN
      required: true
      description: RunAPI API key used for Gemini requests.
    - name: GOOGLE_API_KEY
      required: false
      description: Optional alias when using native Gemini SDK examples.
    - name: GOOGLE_GENAI_BASE_URL
      required: false
      description: Optional native Gemini SDK base URL override for RunAPI.
---

# Gemini on RunAPI

Gemini on RunAPI exposes **two protocols**:

| Protocol | Endpoint | Use when |
|---|---|---|
| OpenAI-compatible | `POST /v1beta/openai/chat/completions` | You already use the OpenAI SDK or any OpenAI client |
| Native Gemini | `POST /v1beta/models/<model>:streamGenerateContent` | You use Google's `@google/generative-ai` SDK (currently `gemini-3-flash-preview` only) |

Both accept the same RunAPI API Key.

## Setup

```dotenv
RUNAPI_TOKEN=YOUR_RUNAPI_TOKEN
```

Get a RunAPI API Key at <https://runapi.ai/api_keys>.

### OpenAI-compatible setup

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RUNAPI_TOKEN",
    base_url="https://runapi.ai/v1beta/openai",
)
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_RUNAPI_TOKEN",
  baseURL: "https://runapi.ai/v1beta/openai",
});
```

### Native Gemini setup

```bash
export GOOGLE_API_KEY=YOUR_RUNAPI_TOKEN
export GOOGLE_GENAI_BASE_URL=https://runapi.ai
```

## Core recipe — OpenAI-compatible

```python
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Explain quantum computing simply."}],
)
print(response.choices[0].message.content)
print(response.usage)
```

```typescript
const response = await client.chat.completions.create({
  model: "gemini-2.5-flash",
  messages: [{ role: "user", content: "Explain quantum computing simply." }],
});
```

```bash
curl -X POST "https://runapi.ai/v1beta/openai/chat/completions" \
  -H "Authorization: Bearer YOUR_RUNAPI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Explain quantum computing simply."}]
  }'
```

## Core recipe — native Gemini

```bash
curl -X POST \
  "https://runapi.ai/v1beta/models/gemini-3-flash-preview:streamGenerateContent" \
  -H "x-goog-api-key: YOUR_RUNAPI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      { "role": "user", "parts": [{ "text": "Hello!" }] }
    ]
  }'
```

The native protocol returns SSE chunks in Google's `streamGenerateContent`
format — use the official `@google/generative-ai` SDK or Google's
`google-genai` Python package to consume it.
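If an SDK is not available, the stream can also be read by hand. The sketch below is a minimal, standard-library-only parser, assuming each SSE `data:` line carries one JSON chunk in Google's documented response shape (`candidates[].content.parts[].text`). Whether RunAPI forwards that shape unchanged is an assumption, and `iter_text_from_sse` and `sample` are hypothetical names used only for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_text_from_sse(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines that carry Gemini response chunks."""
    for raw in lines:
        line = raw.strip()
        # Only "data:" lines carry a payload; skip blanks, comments, other fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        chunk = json.loads(payload)
        # Assumed chunk shape: candidates[].content.parts[].text
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                text = part.get("text")
                if text:
                    yield text


# Hand-written sample lines standing in for a real response stream.
sample = [
    'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "Hel"}]}}]}',
    "",
    'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "lo!"}]}}]}',
]
print("".join(iter_text_from_sse(sample)))
```

In practice the `lines` argument would be the decoded lines of the HTTP response body; the official SDKs handle this parsing for you.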

## Streaming (OpenAI-compatible)

```python
stream = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Write a haiku about coding."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

Streaming runs through a regional edge proxy so the request does not hold a
Rails/Puma thread. Long generations should always stream.

## Vision / multimodal

```json
{
  "model": "gemini-2.5-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What is in this image?" },
        { "type": "image_url", "image_url": { "url": "https://example.com/img.jpg" } }
      ]
    }
  ]
}
```

This is the standard OpenAI multimodal content block, used with the
OpenAI-compatible endpoint. For the native endpoint, embed image data as
`parts[].inlineData` or `parts[].fileData` instead.
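For the native endpoint, a request body with inline image data might be built as in the sketch below. It assumes the `inlineData` part takes `mimeType` and base64-encoded `data` fields, as in Google's REST format. `build_native_image_request` is a hypothetical helper, and the three placeholder bytes are not a real image.

```python
import base64
import json


def build_native_image_request(
    prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg"
) -> dict:
    """Build a native request body with a text part and an inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": prompt},
                    # Assumed field names: inlineData.mimeType / inlineData.data
                    {"inlineData": {"mimeType": mime_type, "data": encoded}},
                ],
            }
        ]
    }


# Placeholder bytes only; read a real file with open(path, "rb").read().
body = build_native_image_request("What is in this image?", b"\xff\xd8\xff")
print(json.dumps(body, indent=2))
```

The resulting dict is what would be sent as the JSON body of the `:streamGenerateContent` request.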

## Google Search grounding

```json
{
  "model": "gemini-2.5-pro",
  "messages": [
    { "role": "user", "content": "Latest news on Gemini 3." }
  ],
  "tools": [
    { "type": "function", "function": { "name": "googleSearch" } }
  ]
}
```

Available on `gemini-2.5-flash`, `gemini-2.5-pro`, `gemini-3.1-pro-preview`,
and `gemini-3-pro-preview`.
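The same request can be assembled in Python with a guard on the model list above. This is a sketch only: `SEARCH_MODELS` and `build_grounded_request` are hypothetical names, and the model set simply mirrors the list in this section.

```python
# Models listed in this section as supporting Google Search grounding.
SEARCH_MODELS = {
    "gemini-2.5-flash",
    "gemini-2.5-pro",
    "gemini-3.1-pro-preview",
    "gemini-3-pro-preview",
}


def build_grounded_request(model: str, question: str) -> dict:
    """Return chat-completions kwargs with the googleSearch function tool attached."""
    if model not in SEARCH_MODELS:
        raise ValueError(f"{model} is not listed as supporting Google Search grounding")
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [{"type": "function", "function": {"name": "googleSearch"}}],
    }


payload = build_grounded_request("gemini-2.5-pro", "Latest news on Gemini 3.")
print(payload["tools"])
```

With the `client` from the setup section, the payload can be sent as `client.chat.completions.create(**payload)`.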

## Structured output

```json
{
  "model": "gemini-2.5-flash",
  "messages": [{ "role": "user", "content": "Give me one person object." }],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "person",
      "schema": {
        "type": "object",
        "properties": { "name": { "type": "string" }, "age": { "type": "integer" } },
        "required": ["name", "age"]
      }
    }
  }
}
```
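On the client side, the reply still arrives as text and has to be parsed. The sketch below assumes the JSON document is returned as a string in `response.choices[0].message.content`, as in the standard OpenAI response shape. `parse_person` is a hypothetical helper that re-checks the two required fields from the schema above.

```python
import json


def parse_person(content: str) -> dict:
    """Parse a structured-output reply and check it against the person schema."""
    data = json.loads(content)
    missing = [key for key in ("name", "age") if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not isinstance(data["name"], str):
        raise TypeError("name must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(data["age"], int) or isinstance(data["age"], bool):
        raise TypeError("age must be an integer")
    return data


# A hand-written reply standing in for response.choices[0].message.content.
person = parse_person('{"name": "Ada", "age": 36}')
print(person["name"], person["age"])
```

Validating locally is a defensive choice: it catches a malformed reply early instead of letting it fail further downstream.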

## Reasoning effort

Supported on `gemini-2.5-pro`, `gemini-3.1-pro-preview`, `gemini-3-pro-preview`,
and `gemini-3-flash-preview` — pass `reasoning_effort: "low" | "medium" | "high"`.
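A request with a reasoning effort might be built as in the sketch below. `REASONING_MODELS`, `EFFORT_LEVELS`, and `build_reasoning_request` are hypothetical names; the model set and effort levels are copied from the sentence above, not queried from the API.

```python
# Models and levels as listed in this section.
REASONING_MODELS = {
    "gemini-2.5-pro",
    "gemini-3.1-pro-preview",
    "gemini-3-pro-preview",
    "gemini-3-flash-preview",
}
EFFORT_LEVELS = ("low", "medium", "high")


def build_reasoning_request(model: str, prompt: str, effort: str = "medium") -> dict:
    """Return chat-completions kwargs that include a validated reasoning_effort."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    if model not in REASONING_MODELS:
        raise ValueError(f"{model} is not listed as supporting reasoning effort")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }


request = build_reasoning_request("gemini-2.5-pro", "Prove that 17 is prime.", "high")
print(request["reasoning_effort"])
```

With the `client` from the setup section, this can be sent as `client.chat.completions.create(**request)`.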

## List models

```bash
curl https://runapi.ai/v1beta/models -H "x-api-key: YOUR_RUNAPI_TOKEN"
```

Or via the OpenAI-style path:

```bash
curl https://runapi.ai/v1beta/openai/models \
  -H "Authorization: Bearer YOUR_RUNAPI_TOKEN"
```

## Supported models

| Model ID | OpenAI endpoint | Native endpoint | Capabilities |
|---|---|---|---|
| `gemini-2.5-flash` | yes | — | Chat, multimodal, Google Search, structured output, thoughts |
| `gemini-2.5-pro` | yes | — | Same as `gemini-2.5-flash`, plus reasoning effort |
| `gemini-3.1-pro-preview` | yes | — | Same as `gemini-2.5-flash`, plus reasoning effort |
| `gemini-3-pro-preview` | yes | — | Same as `gemini-2.5-flash`, plus reasoning effort |
| `gemini-3-flash-preview` | yes | `:streamGenerateContent` | Chat, multimodal, function calling, structured output, reasoning effort |

`gemini-flash-latest` resolves to `gemini-3-flash-preview`.

## Connect Gemini CLI itself

```bash
export GOOGLE_API_KEY=YOUR_RUNAPI_TOKEN
export GOOGLE_GENAI_BASE_URL=https://runapi.ai
gemini
```

## Agent rules

- The native `:streamGenerateContent` path is currently only wired for
  `gemini-3-flash-preview` — use the OpenAI-compatible endpoint for every
  other Gemini model.
- Use streaming for any response longer than a few hundred tokens. Do not
  hold the agent on a long blocking request.
- Google Search grounding uses a `googleSearch` function tool.
- For pricing, rate limits, and quotas, link to
  <https://runapi.ai/models/gemini.md> rather than quoting this skill file.
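The first rule can be sketched as a small routing helper. `endpoint_for`, `NATIVE_MODELS`, and `ALIASES` are hypothetical names. Resolving `gemini-flash-latest` on the client before building the native URL is an assumption, since this file does not say whether the native path accepts the alias directly.

```python
BASE_URL = "https://runapi.ai"

# Only this model is wired to the native path, per the rule above.
NATIVE_MODELS = {"gemini-3-flash-preview"}
# Alias documented in the Supported models section.
ALIASES = {"gemini-flash-latest": "gemini-3-flash-preview"}


def endpoint_for(model: str, prefer_native: bool = False) -> str:
    """Pick the request URL for a model, falling back to the OpenAI-compatible path."""
    resolved = ALIASES.get(model, model)
    if prefer_native and resolved in NATIVE_MODELS:
        return f"{BASE_URL}/v1beta/models/{resolved}:streamGenerateContent"
    return f"{BASE_URL}/v1beta/openai/chat/completions"


print(endpoint_for("gemini-2.5-pro", prefer_native=True))
print(endpoint_for("gemini-flash-latest", prefer_native=True))
```

Falling back silently keeps a caller that asks for the native protocol working on every model, at the cost of hiding the switch; raising an error instead would be an equally reasonable design.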

## Routing

- Model page: <https://runapi.ai/models/gemini.md>
- Provider page: <https://runapi.ai/providers/google.md>
- Catalog: <https://runapi.ai/models.md>