Install

```
openclaw skills install flow-voice
```

Clone any voice from a 3–30 second audio sample and generate speech from text. Powered by LuxTTS — 150x realtime, runs locally, fits in 1GB VRAM, works on CPU and Apple Silicon MPS. No API key, no cloud, no cost.
Output directory: ~/clawd/output/voice/
| What you say | What it does |
|---|---|
| "clone this voice [audio file]" | Encode a voice profile from a sample |
| "speak as [name]: [text]" | Generate speech using a saved voice profile |
| "add voiceover to [video]: [text]" | Generate speech + bake into video with ffmpeg |
| "list voices" | Show saved voice profiles |
| "clone voice from URL [url]" | Download audio from URL, then clone |
```
uv run ~/clawd/skills/flow-voice/scripts/clone.py \
  --sample /path/to/sample.wav \
  --name "eric"
```
Saves encoded profile to ~/clawd/output/voice/profiles/eric.pkl.
Requires at least 3 seconds of clean audio. 10–30 seconds is ideal.
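Before cloning, it can help to verify that a sample meets the 3-second minimum. A minimal sketch using Python's standard `wave` module; the helper names here are illustrative, not part of the skill:

```python
import wave

def wav_duration_seconds(path: str) -> float:
    """Duration of a WAV file, computed from frame count and sample rate."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / float(w.getframerate())

def is_usable_sample(path: str, min_seconds: float = 3.0) -> bool:
    """True if the sample is long enough to clone from (3 s minimum)."""
    return wav_duration_seconds(path) >= min_seconds
```

For best results, aim for the 10–30 second range rather than the bare minimum.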
```
uv run ~/clawd/skills/flow-voice/scripts/speak.py \
  --voice "eric" \
  --text "Hello, this is a test of voice cloning." \
  --output ~/clawd/output/voice/output.wav
```
Outputs 48 kHz WAV. Use --speed to adjust pace (default 1.0).
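If a downstream tool is picky about sample rate, you can confirm the generated file really is 48 kHz with the standard `wave` module. A quick check, not part of the skill:

```python
import wave

def wav_sample_rate(path: str) -> int:
    """Sample rate of a WAV file in Hz; the skill's output should be 48000."""
    with wave.open(path, "rb") as w:
        return w.getframerate()
```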
```
uv run ~/clawd/skills/flow-voice/scripts/speak.py \
  --voice "eric" \
  --text "Your agent can think. Now teach it to draw." \
  --output /tmp/vo.wav

ffmpeg -i input.mp4 -i /tmp/vo.wav \
  -c:v copy -c:a aac -shortest output_with_voice.mp4
```
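The ffmpeg invocation above copies the video stream untouched (`-c:v copy`), re-encodes the voiceover as AAC, and stops at the end of the shorter input (`-shortest`). If you script this step, building the argv as a list avoids shell-quoting issues; the helper below is a hypothetical sketch:

```python
def mux_voiceover_cmd(video: str, audio: str, output: str) -> list[str]:
    """Argv for the ffmpeg mux above: copy the video stream, encode the
    voiceover as AAC, and trim to the shorter of the two streams."""
    return [
        "ffmpeg", "-i", video, "-i", audio,
        "-c:v", "copy", "-c:a", "aac", "-shortest",
        output,
    ]
```

Usage: `subprocess.run(mux_voiceover_cmd("input.mp4", "/tmp/vo.wav", "out.mp4"), check=True)`.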
```
uv run ~/clawd/skills/flow-voice/scripts/flow_voice.py \
  --sample /path/to/sample.wav \
  --text "Beautiful diagrams, from a single prompt." \
  --output ~/clawd/output/voice/result.wav
```
No profile saving — just clone and speak immediately.
```
uv run ~/clawd/skills/flow-voice/scripts/flow_voice.py \
  --sample /path/to/sample.wav \
  --text "Your agent can think. Now teach it to draw." \
  --video /path/to/animation.mp4 \
  --output ~/clawd/output/voice/final_with_voice.mp4
```
| Flag | Default | Description |
|---|---|---|
| --sample | required | Reference audio file (wav/mp3, min 3s) |
| --text | required | Text to speak |
| --output | auto-named | Output file path |
| --video | none | If set, bakes audio into this video |
| --voice | none | Use saved profile instead of --sample |
| --name | none | Save cloned profile with this name |
| --speed | 1.0 | Speech speed (0.8 = slower, 1.2 = faster) |
| --steps | 4 | Inference steps (3–4 recommended) |
| --t-shift | 0.9 | Sampling param (higher = potentially better quality) |
| --smooth | false | Add smoothing (reduces metallic artifacts) |
| --device | auto | Force cpu / mps / cuda |
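For scripting around these flags, the table maps naturally onto an argparse parser. This is an illustrative sketch of the interface with defaults taken from the table; the real scripts' parsers may differ:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Mirror the flag table above (a sketch, not the skill's actual parser)."""
    p = argparse.ArgumentParser(prog="flow_voice.py")
    p.add_argument("--sample", help="Reference audio file (wav/mp3, min 3s)")
    p.add_argument("--text", help="Text to speak")
    p.add_argument("--output", help="Output file path (auto-named if omitted)")
    p.add_argument("--video", help="If set, bakes audio into this video")
    p.add_argument("--voice", help="Use saved profile instead of --sample")
    p.add_argument("--name", help="Save cloned profile with this name")
    p.add_argument("--speed", type=float, default=1.0)
    p.add_argument("--steps", type=int, default=4)
    p.add_argument("--t-shift", type=float, default=0.9, dest="t_shift")
    p.add_argument("--smooth", action="store_true")
    p.add_argument("--device", default="auto",
                   choices=["auto", "cpu", "mps", "cuda"])
    return p
```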
The device is selected automatically (see --device), and model weights are cached under ~/.cache/huggingface/.

Clone Eric's voice from a recording:
```
uv run ~/clawd/skills/flow-voice/scripts/flow_voice.py \
  --sample ~/recordings/eric-30s.wav \
  --name eric \
  --text "FlowStay is live. Book your room with AI." \
  --output ~/clawd/output/voice/flowstay-promo.wav
```
Add voiceover to a Flow Visual Explainer animation:
```
uv run ~/clawd/skills/flow-voice/scripts/flow_voice.py \
  --voice eric \
  --text "Your agent can think. Now teach it to draw." \
  --video ~/clawd/2026-03-10-flowvisual-c3-magic-wand-comp.mp4 \
  --output ~/clawd/output/voice/flowvisual-voiced.mp4
```
Quick one-shot from a downloaded audio clip:
```
yt-dlp -x --audio-format wav -o /tmp/ref.wav "https://www.instagram.com/reel/..."

uv run ~/clawd/skills/flow-voice/scripts/flow_voice.py \
  --sample /tmp/ref.wav \
  --text "Hello from OpenClaw." \
  --output ~/clawd/output/voice/test.wav
```
Powered by LuxTTS (ysharma3501/LuxTTS, ZipVoice-based): free, local, no API key required. Packaged for OpenClaw by Flow, March 2026.