Install
openclaw skills install voice-edge-tts
Text-to-speech skill using the Microsoft Edge TTS engine, with real-time streaming playback, customizable voice settings, and support for multiple languages.
Install the edge-tts Python package:
pip install edge-tts
Install FFmpeg:
Windows:
Download from: https://github.com/GyanD/codexffmpeg/releases
Extract and add bin folder to PATH
macOS:
brew install ffmpeg
Linux:
sudo apt install ffmpeg
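Once both dependencies are installed, it can help to confirm they are visible on PATH before invoking the skill. The following is a minimal sketch and not part of the skill itself: `findOnPath` and `checkDependencies` are hypothetical helper names, and it assumes that `pip install edge-tts` placed an `edge-tts` command on PATH next to `ffmpeg`.

```javascript
const fs = require('fs');
const path = require('path');

// Return the full path of `command` if it exists in one of the PATH
// directories, or null otherwise. `pathEnv` defaults to the real PATH.
function findOnPath(command, pathEnv = process.env.PATH || '') {
  // On Windows, executables usually carry an extension such as .exe.
  const extensions = process.platform === 'win32' ? ['.exe', '.cmd', '.bat', ''] : [''];
  for (const dir of pathEnv.split(path.delimiter).filter(Boolean)) {
    for (const ext of extensions) {
      const candidate = path.join(dir, command + ext);
      if (fs.existsSync(candidate)) return candidate;
    }
  }
  return null;
}

// Report which of the two tools (assumed command names) are visible on PATH.
function checkDependencies(pathEnv) {
  return {
    'edge-tts': findOnPath('edge-tts', pathEnv) !== null,
    ffmpeg: findOnPath('ffmpeg', pathEnv) !== null,
  };
}

console.log(checkDependencies());
```

If either entry is `false`, the usual cause is that the install directory (for example the extracted FFmpeg `bin` folder on Windows) has not been added to PATH, or that the terminal was not restarted after the change.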
Real-time audio generation and playback:
// Basic usage
await skill.execute({
  action: 'stream',
  text: '你好,我是小九' // "Hello, I am Xiaojiu"
});
// With custom voice
await skill.execute({
  action: 'stream',
  text: 'Hello, how are you?',
  options: {
    voice: 'en-US-Standard-A',
    rate: '+10%',
    volume: '+0%',
    pitch: '+0Hz'
  }
});
// Generate an audio file
await skill.execute({
  action: 'tts',
  text: 'Hello, how are you today?',
  options: {
    voice: 'zh-CN-XiaoxiaoNeural'
  }
});
// Returns: { success: true, media: 'MEDIA: /path/to/file.mp3' }
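The `media` field in the result above carries a `MEDIA:` prefix in front of the file path. A caller that needs the bare path could strip the prefix as sketched here. `extractMediaPath` is a hypothetical helper, and the result shape is assumed to match the one shown in the comment above; the skill may return additional fields.

```javascript
// Hypothetical helper: pull the file path out of a result shaped like
// { success: true, media: 'MEDIA: /path/to/file.mp3' }.
// Returns null when the call failed or the field has an unexpected shape.
function extractMediaPath(result) {
  if (!result || result.success !== true || typeof result.media !== 'string') {
    return null;
  }
  const prefix = 'MEDIA:';
  if (!result.media.startsWith(prefix)) return null;
  const filePath = result.media.slice(prefix.length).trim();
  return filePath.length > 0 ? filePath : null;
}

console.log(extractMediaPath({ success: true, media: 'MEDIA: /path/to/file.mp3' }));
// prints: /path/to/file.mp3
```

Returning `null` rather than throwing keeps the caller's error handling in one place: a single check covers both a failed call and a malformed result.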
// Speak text aloud
await skill.execute({
  action: 'speak',
  text: 'Hello!'
});
// List available voices
await skill.execute({
  action: 'voices'
});
| Language | Voice ID |
|---|---|
| Chinese (Female) | zh-CN-XiaoxiaoNeural |
| Chinese (Male) | zh-CN-YunxiNeural |
| Chinese (Male) | zh-CN-YunyangNeural |
| English (US Female) | en-US-Standard-A |
| English (US Male) | en-US-Standard-D |
| English (UK) | en-GB-Standard-A |
| Japanese | ja-JP-NanamiNeural |
| Korean | ko-KR-SunHiNeural |
| Option | Default | Description |
|---|---|---|
| voice | zh-CN-XiaoxiaoNeural | Voice ID |
| rate | +0% | Speech rate (-50% to +100%) |
| volume | +0% | Volume adjustment (-50% to +50%) |
| pitch | +0Hz | Pitch adjustment |
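The defaults and ranges in the options table can be applied before a request is sent. The sketch below merges caller options over the documented defaults and falls back to the default whenever a value is malformed or out of range. `normalizeOptions` and `parseSigned` are hypothetical names, and the fall-back behaviour is an assumption: the skill itself may instead reject or clamp such values.

```javascript
// Defaults copied from the options table above.
const DEFAULTS = { voice: 'zh-CN-XiaoxiaoNeural', rate: '+0%', volume: '+0%', pitch: '+0Hz' };

// Parse a signed value such as '+10%' or '-5Hz'.
// Returns the numeric part, or null if the format does not match.
function parseSigned(value, unit) {
  const match = new RegExp(`^([+-]\\d+)${unit}$`).exec(String(value));
  return match ? Number(match[1]) : null;
}

// Merge caller options over the defaults, then replace any value that is
// malformed or outside the documented range with its default (assumed policy).
function normalizeOptions(options = {}) {
  const merged = { ...DEFAULTS, ...options };
  const rate = parseSigned(merged.rate, '%');
  const volume = parseSigned(merged.volume, '%');
  if (rate === null || rate < -50 || rate > 100) merged.rate = DEFAULTS.rate;
  if (volume === null || volume < -50 || volume > 50) merged.volume = DEFAULTS.volume;
  if (parseSigned(merged.pitch, 'Hz') === null) merged.pitch = DEFAULTS.pitch;
  return merged;
}

console.log(normalizeOptions({ rate: '+10%', volume: '-80%' }));
```

In this call the rate is kept because it lies within -50% to +100%, while the volume is reset to `+0%` because -80% is below the documented minimum of -50%.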
This skill applies the following security measures:
| Feature | Implementation |
|---|---|
| Input Validation | Voice parameter whitelist validation - only allowed voices can be used |
| No Shell Execution | Uses spawn() with array arguments instead of shell command concatenation |
| Command Injection Prevention | User input is validated and passed as separate arguments, never interpolated into a shell command string |
| Path Safety | Fixed script path prevents path traversal |
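// exec and spawn both come from Node's built-in child_process module.
// ❌ UNSAFE: exec() passes the whole string to a shell, so quotes or semicolons in user text can inject commands
exec(`py script.py "${userText}" --voice ${userVoice}`);
// ✅ SAFE: spawn() with an argument array and shell: false hands each value to the process as one literal argument
spawn('py', [scriptPath, text, '--voice', voice], { shell: false });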
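The effect of the array form can be checked directly. The sketch below is illustrative only: `buildArgs` and `echoFirstArg` are hypothetical helpers, and a Node one-liner stands in for the Python script so the example runs without Python installed. It passes text containing shell metacharacters through `spawnSync` with `shell: false` and confirms the child receives it unchanged as a single argument.

```javascript
const { spawnSync } = require('child_process');

// Build the argument array in the same shape as the safe call above.
function buildArgs(scriptPath, text, voice) {
  return [scriptPath, text, '--voice', voice];
}

// Stand-in for the Python script: a Node one-liner that writes its first
// argument back to stdout verbatim.
function echoFirstArg(text) {
  const result = spawnSync(
    process.execPath,
    ['-e', 'process.stdout.write(process.argv[1])', text],
    { shell: false, encoding: 'utf8' }
  );
  return result.stdout;
}

const hostile = 'hello"; echo injected; #';
console.log(buildArgs('script.py', hostile, 'zh-CN-XiaoxiaoNeural'));
// No shell parses the string, so the quote and semicolons stay literal text.
console.log(echoFirstArg(hostile) === hostile);
```

With `exec` and string concatenation, the same input would close the opening quote and run `echo injected` as a separate shell command.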
Only these voices are allowed:
const allowedVoices = [
'zh-CN-XiaoxiaoNeural', 'zh-CN-YunxiNeural', 'zh-CN-YunyangNeural',
'zh-CN-YunyouNeural', 'zh-CN-XiaomoNeural',
'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Wavenet-F',
'en-GB-Standard-A', 'en-GB-Wavenet-A',
'ja-JP-NanamiNeural', 'ko-KR-SunHiNeural'
];
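Any voice not in this list is rejected and replaced with the default voice, zh-CN-XiaoxiaoNeural. Note that en-US-Standard-A, which appears in the streaming example and in the voice table, is not in this list, so a request for it falls back to the default.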
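The whitelist-and-fallback rule described here can be sketched as a single lookup. `resolveVoice` is a hypothetical name, and the skill's real implementation may differ; the voice list is copied from `allowedVoices` above and the default from the options table.

```javascript
const DEFAULT_VOICE = 'zh-CN-XiaoxiaoNeural';

// Same entries as the allowedVoices array in this section.
const ALLOWED_VOICES = new Set([
  'zh-CN-XiaoxiaoNeural', 'zh-CN-YunxiNeural', 'zh-CN-YunyangNeural',
  'zh-CN-YunyouNeural', 'zh-CN-XiaomoNeural',
  'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Wavenet-F',
  'en-GB-Standard-A', 'en-GB-Wavenet-A',
  'ja-JP-NanamiNeural', 'ko-KR-SunHiNeural',
]);

// Return the requested voice only if it is whitelisted; otherwise the default.
function resolveVoice(voice) {
  return ALLOWED_VOICES.has(voice) ? voice : DEFAULT_VOICE;
}

console.log(resolveVoice('ja-JP-NanamiNeural')); // prints: ja-JP-NanamiNeural
console.log(resolveVoice('not-a-voice; rm -rf /')); // prints: zh-CN-XiaoxiaoNeural
```

Because the check is an exact match against a fixed set, a crafted voice string can never reach the command line; it is simply swapped for the default.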