Gemini Voice Assistant

Voice-to-voice AI assistant built on the Gemini Live API. Speak to the AI and get spoken responses. Use it when you want natural voice conversations with an assistant powered by Google's Gemini models.

Ali Mostafa Radwan (@alimostafaradwan)

Install

openclaw skills install @alimostafaradwan/gemini-voice-assistant

A voice-to-voice AI assistant powered by Google's Gemini Live API. Speak to the AI and it replies in a natural-sounding voice.

Usage

Text Mode

```bash
cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py "Your question or message"
```

Voice Mode

```bash
cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py --audio /path/to/audio.ogg "optional context"
```
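
The two invocations above can also be driven from Python. The following is a minimal sketch, not part of the skill itself: `build_command` and `ask` are hypothetical helper names, and it assumes that `handler.py` prints its JSON response to standard output, which the skill page does not state explicitly.

```python
import json
import subprocess
from pathlib import Path
from typing import List, Optional

# Skill directory as used in the commands above.
SKILL_DIR = Path.home() / ".openclaw/agents/kashif/skills/gemini-assistant"


def build_command(message: str = "", audio: Optional[str] = None) -> List[str]:
    """Build the argument list for text mode or voice mode (hypothetical helper)."""
    cmd = ["python3", "handler.py"]
    if audio is not None:
        # Voice mode: pass the audio file, then the optional context string.
        cmd += ["--audio", audio]
    if message:
        cmd.append(message)
    return cmd


def ask(message: str = "", audio: Optional[str] = None) -> dict:
    """Run the handler and decode its output.

    Assumption: the handler writes a single JSON document to stdout.
    """
    result = subprocess.run(
        build_command(message, audio),
        cwd=SKILL_DIR,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the message contains quotes or spaces.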

Response Format

The handler returns a JSON response:

```json
{
  "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
  "text": "Text response from Gemini"
}
```
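
A caller usually needs the audio file path and the text separately. The sketch below shows one way to pull them out of the response. `parse_handler_response` is a hypothetical helper, and it assumes the `message` field always carries the path on a line starting with `MEDIA:`, as in the sample above.

```python
import json
from typing import Optional, Tuple

MEDIA_PREFIX = "MEDIA:"  # prefix seen in the sample response


def parse_handler_response(raw: str) -> Tuple[Optional[str], str]:
    """Return (audio_path, text) from the handler's JSON output.

    audio_path is None when no MEDIA: line is present.
    """
    data = json.loads(raw)
    audio_path = None
    for line in data.get("message", "").splitlines():
        if line.startswith(MEDIA_PREFIX):
            audio_path = line[len(MEDIA_PREFIX):].strip()
            break
    return audio_path, data.get("text", "")


# The sample response from the section above.
example = json.dumps({
    "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
    "text": "Text response from Gemini",
})
audio_path, text = parse_handler_response(example)
print(audio_path)
print(text)
```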

Configuration

Set your Gemini API key:

```bash
export GEMINI_API_KEY="your-api-key-here"
```

Or create a .env file in the skill directory:

```text
GEMINI_API_KEY=your-api-key-here
```
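
How `handler.py` reads these two sources is not documented here. As an illustration only, the sketch below resolves the key using just the standard library. `load_api_key` is a hypothetical name, and giving the environment variable precedence over the `.env` file is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(env_file: Path = Path(".env")) -> Optional[str]:
    """Return GEMINI_API_KEY from the environment, else from a .env file.

    Assumption: the environment variable wins when both are set.
    """
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    if env_file.is_file():
        for line in env_file.read_text().splitlines():
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "GEMINI_API_KEY":
                # Drop optional surrounding quotes.
                return value.strip().strip("\"'")
    return None
```

Returning `None` instead of raising lets the caller decide how to report a missing key.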

Model Options

The default model is gemini-2.5-flash-native-audio-preview-12-2025, chosen because it supports audio.

To use a different model, edit handler.py:

```python
MODEL = "gemini-2.0-flash-exp"  # For text-only
```

Requirements

google-genai>=1.0.0
numpy>=1.24.0
soundfile>=0.12.0
librosa>=0.10.0 (for audio input)
FFmpeg (for audio conversion)
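
Before the first run it can help to confirm that everything in the list above is present. The following sketch is a hypothetical pre-flight check, not part of the skill. It assumes `google-genai` is imported as `google.genai` and that FFmpeg is on `PATH` as `ffmpeg`, and it checks only for presence, not for the minimum versions listed.

```python
import importlib.util
import shutil
from typing import Iterable, List

# Import names for the packages listed above (google-genai imports as google.genai).
PYTHON_MODULES = ["google.genai", "numpy", "soundfile", "librosa"]


def module_available(name: str) -> bool:
    """Return True if the module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises when a parent package (e.g. "google") is missing.
        return False


def missing_requirements(
    modules: Iterable[str] = PYTHON_MODULES,
    tools: Iterable[str] = ("ffmpeg",),
) -> List[str]:
    """List the Python modules and command-line tools that were not found."""
    missing = [m for m in modules if not module_available(m)]
    missing += [t for t in tools if shutil.which(t) is None]
    return missing


if __name__ == "__main__":
    print(missing_requirements() or "All requirements found")
```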

Features

🎙️ Voice input/output support
💬 Text conversations
🔧 Configurable system instructions
⚡ Fast responses with Gemini Flash
