Install
openclaw skills install tts-router
Local TTS router for Apple Silicon: pull models, serve an OpenAI-compatible API, synthesize speech, clone voices. Use when the user asks to "generate speech",...
A CLI that manages and serves multiple TTS models locally on Apple Silicon (MLX). Models are downloaded from HuggingFace Hub and served via OpenAI- and DashScope-compatible APIs.
uv installed; see https://docs.astral.sh/uv/getting-started/installation/ (e.g. brew install uv or via the official installer)
ffmpeg (brew install ffmpeg)
# From PyPI (requires --prerelease=allow due to mlx-audio upstream dep)
uvx --prerelease=allow tts-router list
# Or install with pip
pip install tts-router
tts-router list: Show available models
tts-router list
tts-router pull <model>: Download model weights
tts-router pull qwen3-tts
tts-router pull kokoro
Models are cached in ~/.cache/huggingface/hub/, so a model that has already been pulled does not need to be downloaded again.
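To see what is already in that cache before pulling, the directory can be inspected directly. The sketch below is not part of tts-router; it assumes the standard Hugging Face Hub layout, where each model repository lives in a folder named `models--<org>--<name>`. The helper name `list_cached_repos` is hypothetical, and which repository backs each short name (such as `kokoro`) is not stated here.

```python
from pathlib import Path

# Default Hugging Face Hub cache location, as stated above.
DEFAULT_CACHE = Path.home() / ".cache" / "huggingface" / "hub"


def list_cached_repos(cache_dir: Path = DEFAULT_CACHE) -> list[str]:
    """Return repo IDs (e.g. "org/name") for model repos found in the hub cache.

    Hypothetical helper: it only reads folder names and does not verify
    that the weights inside are complete.
    """
    if not cache_dir.is_dir():
        return []
    repos = []
    for entry in sorted(cache_dir.iterdir()):
        # Assumed layout: one folder per model repo, "models--<org>--<name>".
        if entry.is_dir() and entry.name.startswith("models--"):
            repo_id = entry.name[len("models--"):].replace("--", "/", 1)
            repos.append(repo_id)
    return repos


if __name__ == "__main__":
    for repo in list_cached_repos():
        print(repo)
```

Dataset folders and other non-model entries in the cache are skipped, so the output lists model repositories only.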
tts-router serve: Start the TTS API server
# Default: qwen3-tts on port 8091
tts-router serve
# Custom model and port
tts-router serve --model kokoro --port 9000
The server requires models to be pulled first.
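Because the server needs the weights to be present, a launcher script can run the pull step before starting the server. The following is a minimal sketch that shells out to the commands shown above. The function names are hypothetical, and it assumes `tts-router pull` exits with a non-zero status when the download fails, which is not confirmed here.

```python
import subprocess


def pull_command(model: str) -> list[str]:
    """Build the `tts-router pull <model>` command line."""
    return ["tts-router", "pull", model]


def serve_command(model: str = "qwen3-tts", port: int = 8091) -> list[str]:
    """Build the `tts-router serve --model ... --port ...` command line."""
    return ["tts-router", "serve", "--model", model, "--port", str(port)]


def pull_and_serve(model: str = "qwen3-tts", port: int = 8091) -> subprocess.Popen:
    """Pull the model, then start the server as a child process."""
    # Assumption: a failed pull returns a non-zero exit code, so check=True
    # raises CalledProcessError and the server is never started.
    subprocess.run(pull_command(model), check=True)
    return subprocess.Popen(serve_command(model, port))


if __name__ == "__main__":
    server = pull_and_serve("kokoro", 9000)
    try:
        server.wait()
    except KeyboardInterrupt:
        server.terminate()
```

Keeping the command builders separate from the process handling makes the exact command lines easy to inspect or log before anything is executed.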
tts-router say: Synthesize speech from CLI
tts-router say "Hello world" -o hello.wav
tts-router say "Hello" --voice Vivian --model kokoro -o out.wav
| Short Name | Features |
|---|---|
| qwen3-tts | multi-speaker, emotion, instruct (default) |
| qwen3-tts-design | free-form voice description |
| qwen3-tts-clone | voice cloning with ref audio |
| kokoro | fast, lightweight, multi-lang |
| dia | multi-speaker dialogue, laughter/emotion sounds |
| chatterbox | 23 languages, emotion control, voice cloning |
| orpheus | emotive TTS with emotion tags |
# 1. Pull the default model
tts-router pull qwen3-tts
# 2. Start the server
tts-router serve
# 3. Generate speech (OpenAI format)
curl -X POST http://localhost:8091/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"input": "Hello world", "voice": "Vivian"}' \
--output output.wav
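The same request can be sent from Python with only the standard library. This is a sketch that mirrors the curl call above: it posts the same JSON body (`input` and `voice`) to `/v1/audio/speech` and writes the response bytes to a file. The function names are hypothetical, and it assumes the response body is the raw audio, as the curl `--output` usage suggests.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8091"  # default port used by `tts-router serve`


def build_speech_request(
    text: str, voice: str = "Vivian", base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build the POST request for the OpenAI-format speech endpoint."""
    payload = json.dumps({"input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(
    text: str,
    out_path: str = "output.wav",
    voice: str = "Vivian",
    base_url: str = BASE_URL,
) -> None:
    """Send the request and save the returned audio bytes to out_path."""
    request = build_speech_request(text, voice, base_url)
    # Assumption: the response body is the audio file itself.
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())


if __name__ == "__main__":
    synthesize("Hello world")
```

Running the script with the server started as in step 2 should produce `output.wav` in the current directory.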
| Endpoint | Method | Description |
|---|---|---|
| GET / | GET | Playground UI |
| POST /v1/audio/speech | POST | OpenAI-compatible TTS |
| GET /v1/audio/voices | GET | List available voices |
| GET /health | GET | Health check |
| POST /v1/audio/clone | POST | Voice clone generation |
| POST /v1/audio/references/upload | POST | Upload reference audio |
| POST /v1/audio/references/from-url | POST | Fetch ref audio by URL |
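The two GET endpoints are convenient for scripts that need to wait for the server and then discover voices. The sketch below polls `/health` and reads `/v1/audio/voices`. It assumes that `/health` answers with HTTP 200 once the server is ready and that the voices endpoint returns JSON; neither response format is documented here, so the code does not rely on specific fields. All function names are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8091"  # default port used by `tts-router serve`


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def is_healthy(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with HTTP 200 (assumed ready signal)."""
    try:
        with urllib.request.urlopen(endpoint("/health", base_url), timeout=timeout) as r:
            return r.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


def wait_until_healthy(
    base_url: str = BASE_URL, attempts: int = 30, delay: float = 1.0
) -> bool:
    """Poll /health up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if is_healthy(base_url):
            return True
        time.sleep(delay)
    return False


def list_voices(base_url: str = BASE_URL):
    """Fetch GET /v1/audio/voices and parse the body as JSON (assumed format)."""
    with urllib.request.urlopen(endpoint("/v1/audio/voices", base_url)) as r:
        return json.loads(r.read())


if __name__ == "__main__":
    if wait_until_healthy():
        print(list_voices())
    else:
        print("Server did not become healthy in time")
```

Polling is useful here because model loading can take a while after `tts-router serve` starts, and requests sent before the server is listening would otherwise fail.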
For more complex workflows, read the relevant reference file:
references/voice-cloning.md
references/openclaw.md