Install
openclaw skills install elevenlabs-toolkit
ElevenLabs voice API integration: TTS, sound effects, music generation, speech-to-text, voice isolation, and streaming. Use when building voice-enabled apps...
Programmatic access to all 7 ElevenLabs API capabilities via FastAPI endpoints or standalone Python functions.
Use ElevenLabs when:
Do NOT use ElevenLabs when:
| Criteria | ElevenLabs | Local TTS (kokoro/chatterbox) |
|---|---|---|
| Voice quality | ★★★★★ — natural, expressive | ★★★ — good but robotic edges |
| Cost | Chars deducted from monthly quota | Free, unlimited |
| Latency | ~300–800ms API round-trip | ~50–200ms local inference |
| Voice consistency | Named voices (Rachel etc.) persist | Model-dependent |
| Offline use | ❌ Requires internet + API key | ✅ Fully local |
| Best for | Final narration, published content | Drafts, testing, high-volume batch |
Rule of thumb: Use ElevenLabs for anything that will be seen/heard by a user. Use local TTS for drafts, tests, and volume work.
| Tool | Endpoint | What It Does |
|---|---|---|
| Voices | GET /api/voices | Browse available voices with metadata |
| TTS | POST /api/voice/tts | Batch text-to-speech (any voice, any language) |
| TTS Stream | WS /api/voice/stream | Real-time WebSocket TTS streaming |
| Sound Effects | POST /api/voice/sfx | Generate ambient audio from text prompts |
| Music | POST /api/voice/music | Generate background music from descriptions |
| STT (Scribe) | POST /api/voice/stt | Transcribe audio with language detection |
| Voice Isolation | POST /api/voice/isolate | Extract clean voice from noisy audio |
These are confirmed voices used in OpenClaw workflows. Always prefer these over browsing the full list:
| Voice | Voice ID | Best For |
|---|---|---|
| Rachel | 21m00Tcm4TlvDq8ikWAM | Default narration — clear, warm, American English |
| Adam | pNInz6obpgDQGcFmaJgB | Male narration, authoritative tone |
| Domi | AZnzlk1XvdvUeBnXmlld | Energetic, conversational |
| Bella | EXAVITQu4vr4xnSDxMaL | Soft, gentle narration |
Default for all narration tasks: Use Rachel (21m00Tcm4TlvDq8ikWAM) unless explicitly specified otherwise.
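Because the API expects a voice ID rather than a display name, a small lookup can keep the two from being mixed up. The sketch below is only an illustration: the IDs are copied from the table above, while `KNOWN_VOICES`, `DEFAULT_VOICE`, and `resolve_voice_id` are hypothetical names that are not part of the toolkit.

```python
# Voice IDs copied from the table above; all names here are illustrative.
KNOWN_VOICES = {
    "rachel": "21m00Tcm4TlvDq8ikWAM",
    "adam": "pNInz6obpgDQGcFmaJgB",
    "domi": "AZnzlk1XvdvUeBnXmlld",
    "bella": "EXAVITQu4vr4xnSDxMaL",
}
DEFAULT_VOICE = "rachel"


def resolve_voice_id(name=None):
    """Return the voice ID for a known voice name, defaulting to Rachel."""
    key = (name or DEFAULT_VOICE).strip().lower()
    if key not in KNOWN_VOICES:
        raise KeyError(f"Unknown voice {name!r}; known voices: {sorted(KNOWN_VOICES)}")
    return KNOWN_VOICES[key]
```

Calling `resolve_voice_id()` with no argument returns Rachel's ID, and an unrecognised name fails loudly instead of being sent to the API as if it were an ID.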
To get the full current list from the API:
curl -s -H "xi-api-key: $ELEVENLABS_API_KEY" https://api.elevenlabs.io/v1/voices | python3 -m json.tool
import os

import httpx

BASE = "http://localhost:8000"  # Your FastAPI app
KEY = os.environ["ELEVENLABS_API_KEY"]  # Raises KeyError if the key is unset

# Get voices
voices = httpx.get(f"{BASE}/api/voices").json()

# Generate speech
audio = httpx.post(f"{BASE}/api/voice/tts", json={
    "text": "Hello world",
    "voice_id": voices[0]["voice_id"],
    "model_id": "eleven_multilingual_v2",
}).content  # Returns raw audio bytes

# Generate sound effects
sfx = httpx.post(f"{BASE}/api/voice/sfx", json={
    "prompt": "ocean waves on a quiet beach at night",
}).content
TTS and SFX endpoints return raw audio bytes (not base64, not JSON).
# Correct: .content gives you bytes
audio_bytes = response.content  # type: bytes
# Save to file
with open("output.mp3", "wb") as f:
    f.write(audio_bytes)
# The file format is MP3 by default
# File size estimate: ~1 MB per minute of speech at standard quality
What you get back from each endpoint:
| Endpoint | Response type | How to handle |
|---|---|---|
| POST /api/voice/tts | bytes (MP3) | Write directly to .mp3 file |
| POST /api/voice/sfx | bytes (MP3) | Write directly to .mp3 file |
| POST /api/voice/music | bytes (MP3) | Write directly to .mp3 file |
| POST /api/voice/stt | JSON | {"text": "transcription...", "language": "en"} |
| POST /api/voice/isolate | bytes (MP3) | Write directly to .mp3 file |
| GET /api/voices | JSON | List of {voice_id, name, labels, ...} |
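The table above can be turned into a single dispatch step, so callers never have to remember which endpoints return bytes. This is a minimal sketch under the assumption that the response types are exactly as listed; `BINARY_ENDPOINTS` and `handle_response` are hypothetical names, not part of the toolkit.

```python
import json
from pathlib import Path

# Endpoints that return raw MP3 bytes, per the table above.
BINARY_ENDPOINTS = {
    "/api/voice/tts",
    "/api/voice/sfx",
    "/api/voice/music",
    "/api/voice/isolate",
}


def handle_response(endpoint, body, out_path="output.mp3"):
    """Write audio bytes to disk, or decode JSON, depending on the endpoint."""
    if endpoint in BINARY_ENDPOINTS:
        path = Path(out_path)
        path.write_bytes(body)  # raw bytes go straight into the .mp3 file
        return path
    return json.loads(body)  # /api/voice/stt and /api/voices return JSON
```

With httpx, `body` would be `response.content`; for an audio endpoint the function returns the path it wrote, and for a JSON endpoint it returns the parsed object.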
eleven_turbo_v2_5: faster, no accent bleed
eleven_multilingual_v2: supports 29 languages
ElevenLabs charges per character for TTS. Key patterns:
prompt-cache skill for SHA-256 dedup before calling TTS
Copy scripts/elevenlabs_api.py into your FastAPI app and mount the router:
from elevenlabs_api import router
app.include_router(router)
Set ELEVENLABS_API_KEY in your environment. All endpoints handle errors gracefully with proper HTTP status codes.
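Since TTS is billed per character, repeated requests for the same text are worth deduplicating before they reach the endpoint. The sketch below illustrates the SHA-256 dedup idea with a plain on-disk cache. It is not the prompt-cache skill's API: `cached_tts`, the `synthesize` callback, and the `tts_cache` directory are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def cached_tts(text, voice_id, model_id, synthesize, cache_dir="tts_cache"):
    """Return audio bytes, calling `synthesize` only on a cache miss.

    `synthesize(text, voice_id, model_id)` is any function that returns
    audio bytes, for example a wrapper around POST /api/voice/tts.
    """
    # The key covers everything that changes the audio, not just the text.
    raw_key = f"{voice_id}\n{model_id}\n{text}".encode("utf-8")
    path = Path(cache_dir) / f"{hashlib.sha256(raw_key).hexdigest()}.mp3"
    if path.exists():
        return path.read_bytes()  # cache hit: no characters spent
    audio = synthesize(text, voice_id, model_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

Including the voice and model in the hash means that switching either one produces a fresh render instead of returning stale audio.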
The Quick Start examples assume http://localhost:8000 is live. If it's not:
# Check if server is up before calling
import subprocess
import time

import httpx

try:
    httpx.get("http://localhost:8000/health", timeout=2.0)
except httpx.ConnectError:
    # Server is not running, so start it first
    subprocess.Popen(["uvicorn", "elevenlabs_api:app", "--port", "8000"])
    time.sleep(2)  # Give it a moment to bind
Or call the ElevenLabs API directly without the FastAPI wrapper — the scripts/elevenlabs_api.py functions are importable standalone:
from elevenlabs_api import generate_tts # if the module exposes standalone functions
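If the module turns out not to expose standalone functions, the ElevenLabs API can also be called directly. The following is a sketch only, assuming the public `POST /v1/text-to-speech/{voice_id}` endpoint with the same `xi-api-key` header used in the curl example; `build_tts_request` is a hypothetical helper, and it uses only the standard library so it runs without the wrapper.

```python
import json
import urllib.request

# Assumed public endpoint; check the ElevenLabs API reference before relying on it.
ELEVENLABS_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(text, voice_id, api_key, model_id="eleven_multilingual_v2"):
    """Build (but do not send) a direct TTS request to ElevenLabs."""
    payload = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        ELEVENLABS_TTS_URL.format(voice_id=voice_id),
        data=payload,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is one more line: `audio = urllib.request.urlopen(build_tts_request(...)).read()`, which yields raw audio bytes just like the wrapped endpoint.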
Missing API key:
httpx.HTTPStatusError: 401 Unauthorized
{"detail": {"status": "unauthorized", "message": "Invalid API key"}}
→ Check ELEVENLABS_API_KEY is set: echo $ELEVENLABS_API_KEY
→ Retrieve from 1Password: op read "op://OpenClaw/ElevenLabs API Credentials/credential"
Rate limited (429):
{"detail": {"status": "too_many_requests", "message": "Too many requests"}}
→ Wait and retry with exponential backoff. ElevenLabs rate limits are per-minute on the free/starter tiers.
→ On Creator tier and above, limits are much higher; check your tier in the ElevenLabs dashboard.
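Exponential backoff for 429 responses can be sketched as below. The helper is hypothetical and not part of the toolkit, and the delays and attempt count are illustrative defaults rather than values recommended by ElevenLabs. It accepts any zero-argument callable that returns an object with a `status_code` attribute, such as an `httpx.Response`.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call `call()` until it returns a status other than 429.

    Expects max_attempts >= 1. Waits base_delay, 2x, 4x, ... (capped at
    max_delay) between attempts, plus up to 10% random jitter.
    """
    for attempt in range(max_attempts):
        response = call()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    return response  # still rate limited after the final attempt
```

A typical use would be `retry_with_backoff(lambda: httpx.post(url, json=payload))`. The `sleep` parameter exists so the waits can be replaced in tests.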
Quota exhausted:
{"detail": {"status": "quota_exceeded", "message": "Quota exceeded"}}
→ Character quota for the month is used up. Either wait for monthly reset or upgrade tier.
→ Check current usage: curl -s -H "xi-api-key: $ELEVENLABS_API_KEY" https://api.elevenlabs.io/v1/user/subscription
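A pre-flight check against the subscription payload can prevent a job from failing halfway through. This sketch assumes the JSON returned by `/v1/user/subscription` contains `character_count` (used) and `character_limit` (allowance) fields; verify those names against a real response before relying on them. Both function names are hypothetical.

```python
def remaining_characters(subscription):
    """Characters left this period, given a parsed subscription payload.

    Assumes `character_count` and `character_limit` keys are present.
    """
    used = subscription["character_count"]
    limit = subscription["character_limit"]
    return max(0, limit - used)


def fits_in_quota(text, subscription):
    """True if `text` can be synthesized without exceeding the quota."""
    return len(text) <= remaining_characters(subscription)
```

Running `fits_in_quota` over a batch before the first TTS call shows whether the whole batch will fit.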
scripts/elevenlabs_api.py: FastAPI router with all 7 endpoints
Treating the response as JSON when it's bytes
Wrong: response.json() on a TTS call → JSONDecodeError
Right: response.content → raw bytes, then write to .mp3
Using the wrong voice ID
Wrong: "voice_id": "Rachel" → 404 or wrong voice
Right: "voice_id": "21m00Tcm4TlvDq8ikWAM" (Rachel's actual ID)
Calling TTS for large batches without caching
Using multilingual model for English-only content
Wrong: eleven_multilingual_v2 is slower and can produce accent artifacts on English-only text
Right: eleven_turbo_v2_5 for English-only work
Not checking the FastAPI server is running before calling
httpx.ConnectError is confusing if you forget the local server dependency
This skill uses patterns that may trigger automated security scanners: