Install
openclaw skills install alicloud-ai-audio-tts
Generate human-like speech audio with Model Studio DashScope Qwen TTS models (qwen3-tts-flash, qwen3-tts-instruct-flash). Use when converting text to speech, producing voice lines for short drama/news videos, or documenting TTS request/response fields for DashScope.
Category: provider
mkdir -p output/alicloud-ai-audio-tts
python -m py_compile skills/ai/audio/alicloud-ai-audio-tts/scripts/generate_tts.py && echo "py_compile_ok" > output/alicloud-ai-audio-tts/validate.txt
Pass criteria: command exits 0 and output/alicloud-ai-audio-tts/validate.txt is generated.
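The same check can be run from Python. This is a minimal sketch, assuming the script path and output directory from the commands above; the helper name validate_script is hypothetical and not part of the skill.

```python
import py_compile
from pathlib import Path


def validate_script(script_path: str, out_dir: str) -> bool:
    """Byte-compile script_path; write validate.txt into out_dir on success."""
    try:
        # doraise=True turns syntax errors into PyCompileError instead of
        # printing them; a missing file surfaces as OSError.
        py_compile.compile(script_path, doraise=True)
    except (py_compile.PyCompileError, OSError):
        return False
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "validate.txt").write_text("py_compile_ok\n")
    return True
```

Calling validate_script("skills/ai/audio/alicloud-ai-audio-tts/scripts/generate_tts.py", "output/alicloud-ai-audio-tts") mirrors the shell commands: a True result corresponds to exit code 0 with validate.txt written.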
Save outputs under output/alicloud-ai-audio-tts/.
Use one of the recommended models:
  qwen3-tts-flash
  qwen3-tts-instruct-flash
  qwen3-tts-instruct-flash-2026-01-26
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials (env takes precedence).
Request fields:
  text (string, required)
  voice (string, required)
  language_type (string, optional; default Auto)
  instruction (string, optional; recommended for instruct models)
  stream (bool, optional; default false)
Response fields:
  audio_url (string, when stream=false)
  audio_base64_pcm (string, when stream=true)
  sample_rate (int, 24000)
  format (string, wav or pcm depending on mode)
import os
import dashscope
# Prefer env var for auth: export DASHSCOPE_API_KEY=...
# Or use ~/.alibabacloud/credentials with dashscope_api_key under [default].
# Beijing region; for Singapore use: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"
text = "Hello, this is a short voice line."
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-instruct-flash",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English",
    instruction="Warm and calm tone, slightly slower pace.",
    stream=False,
)
audio_url = response.output.audio.url
print(audio_url)
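With stream=True the response carries Base64-encoded PCM chunks rather than a URL. The sketch below shows one way to collect such chunks into a WAV file. The 24000 Hz sample rate comes from the field list; 16-bit mono samples are an assumption. pcm_chunks_to_wav is a hypothetical helper, and extracting the chunks from each streamed response is left out (references/api_reference.md has the streaming example).

```python
import base64
import wave
from typing import Iterable


def pcm_chunks_to_wav(chunks_b64: Iterable[str], path: str,
                      sample_rate: int = 24000) -> int:
    """Decode Base64 PCM chunks and write them to a WAV file.

    Returns the number of PCM bytes written.
    Assumes 16-bit mono PCM; adjust if the API reports otherwise.
    """
    total = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # assumption: mono
        wav.setsampwidth(2)            # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        for chunk in chunks_b64:
            if not chunk:
                continue               # skip empty chunks
            pcm = base64.b64decode(chunk)
            wav.writeframes(pcm)
            total += len(pcm)
    return total
```

Passing a generator that yields each chunk as it arrives keeps memory use flat for long utterances.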
stream=True returns Base64-encoded PCM chunks at 24 kHz.
Expect finish_reason == "stop" when the stream ends.
Keep language_type consistent with the text to improve pronunciation.
Use instruction only when you need explicit style/tone control.
Cache by (text, voice, language_type) to avoid repeat costs.
Default output: output/alicloud-ai-audio-tts/audio/
Override base dir with OUTPUT_DIR.
See references/api_reference.md for parameter mapping and streaming example.
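Caching by (text, voice, language_type) can be sketched as a stable key over that triple. This is one possible scheme, assuming SHA-256 over a JSON encoding; tts_cache_key, cached_audio_path, and the file layout are hypothetical and not part of the skill.

```python
import hashlib
import json
from pathlib import Path


def tts_cache_key(text: str, voice: str, language_type: str = "Auto") -> str:
    """Return a stable hex digest for one (text, voice, language_type) triple."""
    # A JSON list keeps field boundaries unambiguous, so ("ab", "c")
    # and ("a", "bc") never produce the same payload.
    payload = json.dumps([text, voice, language_type], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_audio_path(cache_dir: str, text: str, voice: str,
                      language_type: str = "Auto") -> Path:
    """Path where audio for this request would be cached (hypothetical layout)."""
    return Path(cache_dir) / f"{tts_cache_key(text, voice, language_type)}.wav"
```

Before calling the API, check whether cached_audio_path(...) exists and reuse the file if so. When instruction is set it probably belongs in the key as well, since it changes the resulting audio.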
Realtime mode is provided by skills/ai/audio/alicloud-ai-audio-tts-realtime/.
Voice cloning/design are provided by skills/ai/audio/alicloud-ai-audio-tts-voice-clone/ and skills/ai/audio/alicloud-ai-audio-tts-voice-design/.
Source list: references/sources.md