Install
openclaw skills install azure-speech-tts

Azure Speech TTS skill for generating local audio files from text or SSML with Azure Speech. Use when the user asks to use Azure Speech / Azure TTS / Microsoft TTS / speech synthesis / text-to-speech / SSML, choose voices, control speaking rate/pitch/style, or export MP3/WAV/OGG/PCM audio.

Use Azure Speech to turn text or SSML into a local audio file under download/.
This skill uses a small default config file plus environment variables.
File:
config.json

Default values:
default_voice: zh-CN-Yunqi:DragonHDOmniLatestNeural
default_format: mp3
default_output_dir: download
default_timeout_seconds: 60

Set these in the local shell environment:
AZURE_SPEECH_KEY
AZURE_SPEECH_REGION
AZURE_SPEECH_VOICE
AZURE_SPEECH_FORMAT

Use this order:
config.json

python3 scripts/azure_tts.py \
--text "你好,这是一段测试语音。" \
--voice zh-CN-Yunqi:DragonHDOmniLatestNeural \
--format mp3 \
--output download/test.mp3
For SSML:
python3 scripts/azure_tts.py \
--ssml-file temp/input.ssml \
--format wav \
--output download/test.wav
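
The defaults in config.json and the AZURE_SPEECH_* variables have to be combined somewhere before a command like the ones above runs. The following is a minimal sketch of one way to do that, not the actual logic of scripts/azure_tts.py. The precedence used here (command-line value, then environment variable, then config.json) is an assumption, and `load_config` and `resolve_setting` are hypothetical helper names.

```python
import json
import os
from pathlib import Path

# Fallbacks copied from the default values listed for config.json.
BUILTIN_DEFAULTS = {
    "default_voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural",
    "default_format": "mp3",
    "default_output_dir": "download",
    "default_timeout_seconds": 60,
}


def load_config(path="config.json"):
    """Return config.json merged over the built-in defaults (the file is optional)."""
    merged = dict(BUILTIN_DEFAULTS)
    config_file = Path(path)
    if config_file.is_file():
        merged.update(json.loads(config_file.read_text(encoding="utf-8")))
    return merged


def resolve_setting(cli_value, env_name, config, config_key, env=None):
    """Pick one setting. ASSUMED order: CLI value, then env var, then config."""
    env = os.environ if env is None else env
    if cli_value:
        return cli_value
    if env.get(env_name):
        return env[env_name]
    return config[config_key]


if __name__ == "__main__":
    config = load_config()
    # No --voice given here, so the environment or config.json decides.
    print(resolve_setting(None, "AZURE_SPEECH_VOICE", config, "default_voice"))
```

Passing `env` explicitly keeps the function easy to test without touching the real shell environment.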
Use --text / --text-file for normal narration.
Use --ssml / --ssml-file only when the payload already contains a complete <speak> document.
Let environment variables and config.json supply the defaults.
Run scripts/azure_tts.py.
--ssml input must include a full <speak> root element.
The default voice is zh-CN-Yunqi:DragonHDOmniLatestNeural if nothing else is set.
Output files are written under download/.
config.json.

See references/azure-speech-cheatsheet.md for the format map and examples.
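
Because --ssml input must carry a full <speak> root element, it can help to check a payload before passing it to the script. The sketch below shows one possible check and a small builder that wraps plain text in a complete document. Both `build_ssml` and `has_speak_root` are hypothetical helpers written for illustration; how scripts/azure_tts.py builds or validates SSML internally is not stated here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Standard SSML namespace used on the <speak> root.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice, language, rate=None, pitch=None):
    """Wrap plain text in a complete <speak> document for a single voice."""
    body = escape(text)  # escape &, < and > so the text cannot break the XML
    if rate or pitch:
        attrs = ""
        if rate:
            attrs += f" rate={quoteattr(rate)}"
        if pitch:
            attrs += f" pitch={quoteattr(pitch)}"
        body = f"<prosody{attrs}>{body}</prosody>"
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(language)}>'
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


def has_speak_root(payload):
    """Return True if payload parses as XML whose root element is <speak>."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return False
    # ElementTree reports namespaced tags as "{namespace}speak".
    return root.tag.rsplit("}", 1)[-1] == "speak"


if __name__ == "__main__":
    doc = build_ssml("Hello from Azure Speech.", "en-US-AriaNeural", "en-US", rate="+10%")
    print(has_speak_root(doc))
    print(doc)
```

A payload that fails `has_speak_root` is plain text or a fragment, so it belongs with --text / --text-file instead.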
Short aliases supported by the script:
mp3
wav
pcm
ogg

--voice: Azure voice name, for example en-US-AriaNeural
--language: SSML xml:lang for plain-text mode
--rate: speaking rate, for example +10%
--pitch: pitch adjustment, for example +2st
--style: expressive style such as cheerful, sad, chat
--style-degree: strength of the expressive style
--role: voice role when supported
--save-ssml: write the generated SSML to a file for inspection
--dry-run: print the generated SSML without calling Azure

The helper script writes the audio file and prints JSON like:
{
"ok": true,
"output_path": "download/test.mp3",
"format": "audio-24khz-48kbitrate-mono-mp3",
"voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural",
"language": "zh-CN",
"bytes": 123456
}
Use the printed output_path as the deliverable path.
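
A caller that wraps the helper script could read output_path from the printed JSON roughly as follows. This is a sketch under two assumptions that the text above does not confirm: that the JSON object is the only thing written to stdout, and that a failed run is signalled by "ok" being false or missing. The names `parse_result` and `synthesize` are hypothetical.

```python
import json
import subprocess


def parse_result(stdout):
    """Return output_path from the helper's JSON, or raise if it reports failure."""
    result = json.loads(stdout)
    if not result.get("ok"):
        raise RuntimeError(f"azure_tts.py reported failure: {result}")
    return result["output_path"]


def synthesize(text, output):
    """Run the helper script with --text and --output, then return the saved path."""
    completed = subprocess.run(
        ["python3", "scripts/azure_tts.py", "--text", text, "--output", output],
        capture_output=True,  # collect stdout so the JSON can be parsed
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_result(completed.stdout)


if __name__ == "__main__":
    # Requires AZURE_SPEECH_KEY and AZURE_SPEECH_REGION in the environment.
    print(synthesize("Hello from Azure Speech.", "download/test.mp3"))
```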