Install
openclaw skills install gptsovits-ttsHigh-quality Chinese TTS using GPT-SoVITS v2 Pro+ — convert text to natural-sounding speech with voice cloning support.
openclaw skills install gptsovits-ttsA production-ready text-to-speech skill that connects to a local GPT-SoVITS v2 Pro+ API server. Converts Chinese text to natural-sounding speech with a cloned reference voice. Designed for voice response automation, content narration, and AI voice applications.
http://127.0.0.1:9880 (or set GPT_SOVITS_API_URL)axios| Component | File | Size |
|---|---|---|
| s1 | s1v3.ckpt | 148MB |
| s2 | s2Gv2ProPlus.pth | 191MB |
| BERT | chinese-roberta-wwm-ext-large | 621MB |
| CNHuBERT | chinese-hubert-base | 180MB |
| Speaker Verification | pretrained_eres2netv2w24s4ep4.ckpt | 103MB |
| Reference Audio | ref_audio.wav | ~10-30s clean recording |
cd /path/to/GPT-SoVITS-CPUFast
conda activate GPTSoVits
python api_v2.py -a 127.0.0.1 -p 9880
Place a clean .wav file (10-30 seconds of the target voice) at:
voice-clone/ref_audio.wav
const tts = require('./skills/voice-clone');
const mp3 = await tts.speak("你好,欢迎使用GPT-SoVITS语音合成。", "output.mp3");
// Returns: "output.mp3"
speak(text, outputPath, opts?)| Param | Type | Default | Description |
|---|---|---|---|
| text | string | required | Chinese text to synthesize |
| outputPath | string | required | Output .mp3 file path |
| opts.topK | number | 15 | Top-K sampling |
| opts.topP | number | 0.7 | Top-P sampling |
| opts.temperature | number | 0.5 | Sampling temperature |
| opts.speed | number | 1.0 | Speed factor |
| opts.seed | number | -1 | Random seed (-1 = random) |
Returns: Promise<string> — path to the generated MP3 file.
| Variable | Default | Description |
|---|---|---|
| GPT_SOVITS_API_URL | http://127.0.0.1:9880 | GPT-SoVITS API base URL |
| GPT_SOVITS_API_TIMEOUT | 300000 | API request timeout (ms) |
This skill is designed to be called from automation workflows:
MIT