Local CLI TTS
Local CLI TTS plugin for OpenClaw: use any command-line TTS tool as a speech provider.
Install
openclaw plugins install clawhub:openclaw-tts-cli
Or, without the clawhub: prefix:
openclaw plugins install openclaw-tts-cli
Configuration
There are two ways to configure this plugin, depending on your OpenClaw version.
Option 1: messages.tts.providers (recommended, OpenClaw 2026.4+)
Add to your OpenClaw config:
{
  "messages": {
    "tts": {
      "provider": "cli",
      "providers": {
        "cli": {
          "command": "/usr/local/bin/your-tts-tool",
          "args": ["--text", "{{Text}}", "--output", "{{OutputPath}}"],
          "outputFormat": "mp3",
          "timeoutMs": 120000,
          "cwd": "/tmp",
          "env": {
            "API_KEY": "your-key"
          }
        }
      }
    }
  }
}
Option 2: Plugin config (compatible with all versions)
If your OpenClaw version does not support messages.tts.providers, configure
through the plugin entry instead:
{
  "plugins": {
    "entries": {
      "tts-local-cli": {
        "enabled": true,
        "config": {
          "command": "/usr/local/bin/your-tts-tool",
          "args": ["--text", "{text}", "--output", "{outputPath}"],
          "outputFormat": "mp3",
          "timeoutMs": 120000,
          "cwd": "/tmp",
          "env": {
            "API_KEY": "your-key"
          }
        }
      }
    }
  }
}
Priority: messages.tts.providers.cli takes precedence over the plugin config. If both are set, the plugin config is ignored.
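The precedence rule can be sketched as a small lookup. This is an illustration only, not the plugin's actual code: the `CliTtsConfig` and `OpenClawConfig` types and the `resolveCliTtsConfig` function are hypothetical names, with fields taken from the config examples above.

```typescript
// Hypothetical shape of the CLI provider settings (fields mirror the README examples).
interface CliTtsConfig {
  command: string;
  args?: string[];
  outputFormat?: string;
  timeoutMs?: number;
  cwd?: string;
  env?: Record<string, string>;
}

// Hypothetical view of the two places the settings can live.
interface OpenClawConfig {
  messages?: { tts?: { provider?: string; providers?: { cli?: CliTtsConfig } } };
  plugins?: { entries?: Record<string, { enabled?: boolean; config?: CliTtsConfig }> };
}

// Returns messages.tts.providers.cli when it exists; otherwise falls back to
// the plugin entry. The two sources are never merged: if both are set, the
// plugin config is ignored entirely.
function resolveCliTtsConfig(cfg: OpenClawConfig): CliTtsConfig | undefined {
  const fromProviders = cfg.messages?.tts?.providers?.cli;
  if (fromProviders) return fromProviders;
  return cfg.plugins?.entries?.["tts-local-cli"]?.config;
}
```

Because there is no merging in this sketch, a key such as timeoutMs set only in the plugin config would not carry over once messages.tts.providers.cli is present.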
Template placeholders
| Placeholder | Description |
|---|---|
| {{Text}} or {text} | The text to synthesize (with emoji stripped) |
| {{OutputPath}} or {outputPath} | Full path for the output audio file |
| {{OutputDir}} or {outputDir} | Directory for output files |
| {{OutputBase}} or {outputBase} | Base filename prefix (no extension) |
| {filePrefix} | Alias for OutputBase |
Placeholders are case-insensitive. Double-brace style allows spaces: {{ text }} works too.
If no {{Text}} or {text} placeholder is present in args, the text is piped to the CLI via stdin.
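The substitution rules above can be approximated as follows. This is a minimal sketch under stated assumptions, not the plugin's implementation: `TemplateValues`, `PLACEHOLDER`, and `expandArgs` are hypothetical names, unknown placeholders are assumed to be left untouched, and emoji stripping is omitted.

```typescript
// Values available to the templates (hypothetical type for illustration).
interface TemplateValues {
  text: string;
  outputPath: string;
  outputDir: string;
  outputBase: string;
}

// Matches {{ name }} (inner spaces allowed) or {name} (no spaces).
const PLACEHOLDER = /\{\{\s*([A-Za-z]+)\s*\}\}|\{([A-Za-z]+)\}/g;

// Expands placeholders in every argument. Names are compared in lower case,
// so {{Text}}, {text}, and {{ TEXT }} are equivalent. If no text placeholder
// was seen, the caller should pipe the text to the command via stdin.
function expandArgs(
  args: string[],
  values: TemplateValues,
): { args: string[]; textViaStdin: boolean } {
  const table = new Map<string, string>([
    ["text", values.text],
    ["outputpath", values.outputPath],
    ["outputdir", values.outputDir],
    ["outputbase", values.outputBase],
    ["fileprefix", values.outputBase], // alias for OutputBase
  ]);
  let sawText = false;
  const expanded = args.map((arg) =>
    arg.replace(PLACEHOLDER, (match: string, doubleName?: string, singleName?: string) => {
      const key = (doubleName ?? singleName ?? "").toLowerCase();
      const value = table.get(key);
      if (value === undefined) return match; // assumption: unknown names stay as written
      if (key === "text") sawText = true;
      return value;
    }),
  );
  return { args: expanded, textViaStdin: !sawText };
}
```

Each argument is expanded independently and the result is passed as an argument array, so text containing spaces or quotes needs no shell escaping in this sketch.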
How it works
- The plugin spawns your configured CLI command with template-substituted args.
- It looks for an audio file in the output directory (wav, mp3, opus, ogg, m4a).
- If no file is found, it reads stdout as audio data.
- The audio is converted to the desired format using ffmpeg (via OpenClaw's SDK).
- For telephony, audio is converted to raw 16 kHz mono PCM.
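The steps above can be sketched with Node's built-in modules. Treat this as an illustration under assumptions rather than the plugin's code: `findAudioFile`, `runTtsCommand`, and `telephonyFfmpegArgs` are hypothetical names, matching output files by filename prefix is an assumption, and so is the 16-bit little-endian sample format. The real conversion goes through OpenClaw's SDK, so the ffmpeg arguments only show a roughly equivalent direct invocation.

```typescript
import { spawnSync } from "node:child_process";
import { readdirSync, readFileSync } from "node:fs";
import { extname, join } from "node:path";

// Extensions the plugin is documented to look for.
const AUDIO_EXTENSIONS = new Set([".wav", ".mp3", ".opus", ".ogg", ".m4a"]);

// Assumption: the first audio file (sorted by name) whose name starts with
// the base prefix is the one the CLI just produced.
function findAudioFile(outputDir: string, outputBase: string): string | undefined {
  const name = readdirSync(outputDir)
    .sort()
    .find((f) => f.startsWith(outputBase) && AUDIO_EXTENSIONS.has(extname(f).toLowerCase()));
  return name === undefined ? undefined : join(outputDir, name);
}

// Runs the CLI, then returns audio bytes from the output file if one exists,
// otherwise from stdout. stdinText is only set when args had no text placeholder.
function runTtsCommand(
  command: string,
  args: string[],
  outputDir: string,
  outputBase: string,
  stdinText?: string,
  timeoutMs = 120000,
): Buffer {
  const result = spawnSync(command, args, {
    input: stdinText,
    timeout: timeoutMs,
    encoding: "buffer",
    maxBuffer: 64 * 1024 * 1024, // room for audio on stdout
  });
  if (result.error) throw result.error;
  if (result.status !== 0) throw new Error(`TTS command exited with status ${result.status}`);
  const file = findAudioFile(outputDir, outputBase);
  if (file !== undefined) return readFileSync(file);
  if (result.stdout.length === 0) throw new Error("TTS command produced no audio");
  return result.stdout;
}

// ffmpeg arguments for raw 16 kHz mono PCM (assumed signed 16-bit little-endian).
function telephonyFfmpegArgs(inputPath: string, outputPath: string): string[] {
  return ["-y", "-i", inputPath, "-f", "s16le", "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", outputPath];
}
```

The sketch uses spawnSync to keep the control flow linear; a long-running provider would more likely use the asynchronous spawn so that synthesis does not block the event loop.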
Requirements
- ffmpeg must be available (OpenClaw typically bundles or resolves it)
- Node 22+
- A TTS CLI tool of your choice (e.g., piper, espeak, say, mlx-audio)
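A quick preflight check for these requirements might look like the sketch below. `commandRuns` and `checkRequirements` are hypothetical helpers, not part of the plugin. Because OpenClaw may bundle or resolve its own ffmpeg, a missing ffmpeg on PATH is only a hint here, not proof that conversion will fail.

```typescript
import { spawnSync } from "node:child_process";

// True if the command can be started and exits with status 0.
function commandRuns(command: string, args: string[]): boolean {
  const result = spawnSync(command, args, { stdio: "ignore", timeout: 10000 });
  return result.error === undefined && result.status === 0;
}

// Returns a list of human-readable problems; an empty list means all checks passed.
// ttsProbeArgs should be a harmless invocation of your tool, such as a version flag.
function checkRequirements(ttsCommand: string, ttsProbeArgs: string[]): string[] {
  const problems: string[] = [];
  const nodeMajor = Number(process.versions.node.split(".")[0]);
  if (nodeMajor < 22) problems.push(`Node 22+ required, found ${process.versions.node}`);
  if (!commandRuns("ffmpeg", ["-version"])) problems.push("ffmpeg not found on PATH");
  if (!commandRuns(ttsCommand, ttsProbeArgs)) problems.push(`TTS command failed to run: ${ttsCommand}`);
  return problems;
}
```

For example, `checkRequirements("espeak", ["--version"])` would report whether espeak can be launched from the current environment.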
Examples
macOS say command
{
  "messages": {
    "tts": {
      "provider": "cli",
      "providers": {
        "cli": {
          "command": "say",
          "args": ["-o", "{{OutputPath}}", "{{Text}}"],
          "outputFormat": "wav"
        }
      }
    }
  }
}
Piper TTS
{
  "messages": {
    "tts": {
      "provider": "cli",
      "providers": {
        "cli": {
          "command": "piper",
          "args": ["--model", "/path/to/model.onnx", "--output_file", "{{OutputPath}}", "--text", "{{Text}}"],
          "outputFormat": "wav"
        }
      }
    }
  }
}
MLX-Audio (Qwen3-TTS)
Uses Apple Silicon GPU acceleration via MLX for local neural TTS.
Plugin config style (compatible with all versions):
{
  "plugins": {
    "entries": {
      "tts-local-cli": {
        "enabled": true,
        "config": {
          "command": "python3 -m mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-bf16 --voice serena --lang_code zh --audio_format wav",
          "args": ["--output_path", "{outputDir}", "--file_prefix", "{filePrefix}", "--text", "{text}"],
          "outputFormat": "opus"
        }
      }
    }
  }
}
Or with messages.tts.providers (OpenClaw 2026.4+):
{
  "messages": {
    "tts": {
      "provider": "cli",
      "providers": {
        "cli": {
          "command": "python3 -m mlx_audio.tts.generate --model mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-bf16 --voice serena --lang_code zh --audio_format wav",
          "args": ["--output_path", "{outputDir}", "--file_prefix", "{filePrefix}", "--text", "{text}"],
          "outputFormat": "opus"
        }
      }
    }
  }
}
License
MIT
