Install

openclaw skills install voiceclone

Voice cloning and TTS using MiniMax API. User must provide a voice name when cloning; after success, voice_name->voice_id is written back to this skill doc f...

This skill is narrowly scoped to: (1) uploading clone audio to MiniMax, (2) creating a cloned voice, (3) TTS with cloned or existing voices, and (4) updating the cloned-voice mapping block in this SKILL.md. The script only reads/writes this skill’s SKILL.md; it does not read unrelated system files or other environment variables beyond the MiniMax API key(s) above.
- Upload clone audio: POST /v1/files/upload, purpose=voice_clone, multipart/form-data.
- Create cloned voice: POST /v1/voice_clone.
- TTS: POST /v1/t2a_v2.
- Clone audio: mp3/m4a/wav, duration 10 seconds to 5 minutes, file size <=20MB.

Requires the requests library. Install with:
pip install -r requirements.txt
or pip install requests.

The script reads and writes SKILL.md to update the cloned-voice mapping block.

At least one of the following must be set for MiniMax API authentication (see frontmatter requiredEnv / optionalEnv):
| Variable | Required | Notes |
|---|---|---|
| MINIMAX_API_KEY | preferred | Primary API key |
| MINIMAX_KEY | alternative | Accepted if set |
| MINIMAX_GROUP_API_KEY | alternative | Accepted if set |
The script will fail with a clear error if none are set.
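
The key lookup described by the table can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper name `resolve_api_key` is hypothetical, and the precedence order (MINIMAX_API_KEY first, then the alternatives in table order) is an assumption based on the "preferred" and "alternative" labels.

```python
import os

# Candidate variables in assumed precedence order (preferred key first).
API_KEY_VARS = ("MINIMAX_API_KEY", "MINIMAX_KEY", "MINIMAX_GROUP_API_KEY")


def resolve_api_key(env=None):
    """Return the first non-empty MiniMax key found in env (hypothetical helper)."""
    env = os.environ if env is None else env
    for name in API_KEY_VARS:
        value = (env.get(name) or "").strip()
        if value:
            return value
    # Mirror the documented behavior: fail clearly when no key is set.
    raise RuntimeError(
        "No MiniMax API key found; set one of: " + ", ".join(API_KEY_VARS)
    )
```

Passing a plain dict instead of `os.environ` makes the lookup easy to exercise without touching the real environment.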
Example voice name: liuyang_narration_v1.

cd workspace/skills/voice-clone-tts
python scripts/minimax_voice_clone_tts.py \
--audio "/absolute/path/to/voice.wav" \
--voice-name "yangtuo_demo_v1" \
--display-name "Alpaca Demo" \
--text "Hello, this is a cloned voice test." \
--output "./output/voice_test.mp3"
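
Before running a clone like the one above, the stated input limits (mp3/m4a/wav, at most 20MB, voice name made of letters, digits, and underscores) can be checked locally. The sketch below is an assumption about how such a pre-flight check might look, not part of the skill's script: `check_clone_inputs` is a hypothetical helper, "20MB" is interpreted as 20 MiB, and the 10 seconds to 5 minutes duration limit is not checked because that would require decoding the audio.

```python
import re
from pathlib import Path

ALLOWED_SUFFIXES = {".mp3", ".m4a", ".wav"}
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB
VOICE_NAME_RE = re.compile(r"[A-Za-z0-9_]+")  # letters, digits, underscores


def check_clone_inputs(audio_path, voice_name):
    """Return a list of problems; an empty list means the inputs look fine."""
    problems = []
    path = Path(audio_path)
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported audio format: {path.suffix or '(none)'}")
    if not path.is_file():
        problems.append(f"audio file not found: {path}")
    elif path.stat().st_size > MAX_BYTES:
        problems.append("audio file is larger than 20MB")
    if not VOICE_NAME_RE.fullmatch(voice_name):
        problems.append("voice name may only contain letters, digits, underscores")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem in one pass.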
To clone only, omit --text.

# Resolve by display name
python scripts/minimax_voice_clone_tts.py \
--voice "voice_v2" \
--text "This is TTS using an existing cloned voice." \
--output "./output/reuse_voice.mp3"
# Or specify voice_id directly
python scripts/minimax_voice_clone_tts.py \
--voice-id "yangtuo_demo_v1" \
--text "This is TTS using an existing cloned voice." \
--output "./output/reuse_voice.mp3"
- --audio: Path to clone audio (required for cloning).
- --voice-name: Required when cloning; API voice ID (letters, digits, underscores, e.g. yangtuo_demo_v1).
- --display-name: Optional when cloning; display name written to SKILL.md (e.g. Alpaca Demo). Defaults to --voice-name if omitted.
- --voice-id: For synthesis, specify API voice_id directly (skips mapping table).
- --voice: For synthesis, specify display name or voice_id; resolved from the mapping table below (e.g. voice_v2 or yangtuo_demo_v1).
- --text: Text to synthesize (omit for clone-only).
- --output: Output audio path (default ./output/minimax_tts.mp3).
- --model: Speech model (default speech-2.8-turbo).
- --format: Output format (mp3/pcm/flac/wav).
- --speed, --vol, --pitch, --emotion: Speech expression parameters.

After a successful clone, synthesize with --voice "display name" so you don't need to remember voice_id.

For synthesis, pass either --voice "display name" or --voice-id voice_id.

Cloned-voice mapping (display name: voice_id):

- test_voice_1772187110: test_voice_1772187110 (updated: 2026-02-27 18:12:00)
- voice_v1: shuangyue_test (updated: 2026-02-28 16:47:01)
- voice_v2: yangtuo_demo_v1 (updated: 2026-02-27 18:19:39)
- voice_v3: dong_yuhui_voice_v1 (updated: 2026-03-02 19:51:44)

If something fails, check that:

- MINIMAX_API_KEY is correct.
- The voice_name rules, audio format/size, and text length are satisfied.
- SKILL.md exists and contains the write-back marker block.
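
The way a --voice value could be resolved against mapping entries of the form `display name: voice_id (updated: timestamp)` is sketched below. This is an illustration only: the skill's real parser and its write-back markers are not shown in this document, so `parse_mapping`, `resolve_voice`, and the entry regex are all assumptions.

```python
import re

# Assumed entry shape: "<display name>: <voice_id> (updated: <timestamp>)",
# optionally preceded by a list bullet.
ENTRY_RE = re.compile(r"^\s*(?:[-*]\s+)?(.+?):\s+(\S+)\s+\(updated:\s*(.+?)\)\s*$")


def parse_mapping(block):
    """Build {display_name: voice_id} from the text of the mapping block."""
    mapping = {}
    for line in block.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            mapping[match.group(1).strip()] = match.group(2)
    return mapping


def resolve_voice(value, mapping):
    """Resolve a --voice value: display name first, then a known voice_id."""
    if value in mapping:
        return mapping[value]
    if value in mapping.values():
        return value
    raise KeyError(f"voice not found in mapping: {value}")


block = """\
- voice_v1: shuangyue_test (updated: 2026-02-28 16:47:01)
- voice_v2: yangtuo_demo_v1 (updated: 2026-02-27 18:19:39)
"""
mapping = parse_mapping(block)
print(resolve_voice("voice_v2", mapping))         # yangtuo_demo_v1
print(resolve_voice("yangtuo_demo_v1", mapping))  # yangtuo_demo_v1
```

Checking display names before voice IDs matches the parameter description for --voice, which accepts either form.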