Speech Synthesizer | 语音合成器
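v1.0.2
Text-to-speech (TTS) tool. Supports edge-tts (Microsoft neural TTS; free and needs no API key, but requires network access to Microsoft's service) and OpenAI-compatible TTS APIs. Trigger phrases: 语音回复 (voice reply), TTS, 文字转语音 (text to speech), 语音合成 (speech synthesis), 语音对话 (voice conversation). Supported platforms: Linux / Windows / macOS.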
MIT-0
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The code and SKILL.md match the stated purpose (edge-tts + OpenAI-compatible API TTS). However the registry metadata does not declare environment variables or a primary credential even though the documentation and scripts expect OPENCLAW_WORKSPACE, TTS_API_URL and TTS_API_KEY. This mismatch is an incoherence in packaging/metadata.
Instruction Scope
Runtime instructions and scripts stay within the TTS purpose: they accept text, call edge-tts or an API, convert formats, and write outputs under the workspace. The SKILL.md does not instruct the agent to read unrelated files or exfiltrate data. It does, however, instruct network access to the API URL you supply and to Microsoft endpoints when using edge-tts (expected for these engines).
Install Mechanism
There is no install spec beyond 'pip install -r requirements.txt'. requirements.txt lists edge-tts, openai, aiohttp but the code imports 'av' (PyAV) for conversion which is not listed and requires native libs (ffmpeg/libav, libopus) to be installed. The lack of an explicit install spec and missing dependency entries increases risk of user error and hidden native dependencies.
Credentials
The skill uses an API key for API mode (TTS_API_KEY / --api-key) and reads OPENCLAW_WORKSPACE, but the registry metadata declared no required env vars or primary credential. Asking for an API key is appropriate for API mode, but the packaging omission is a red flag: the skill will send provided text (and any API key) to whatever api-url you pass, so only provide keys for endpoints you trust.
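One way to act on this advice is to check the API URL against an explicit allowlist before any key is sent. The sketch below is illustrative only: `is_trusted_endpoint` and the contents of `TRUSTED_HOSTS` are hypothetical and are not part of the skill's scripts.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts you actually trust.
TRUSTED_HOSTS = {"api.openai.com"}


def is_trusted_endpoint(api_url: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    parsed = urlparse(api_url)
    host = (parsed.hostname or "").lower()
    # Exact host match, so "api.openai.com.evil.example" is rejected.
    return parsed.scheme == "https" and host in trusted_hosts
```

Matching the full hostname rather than a substring avoids accepting look-alike domains that merely contain a trusted name.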
Persistence & Privilege
The skill does not request permanent inclusion (always:false) and does not modify other skills' configuration or system-wide settings. It runs as user-invoked/agent-invoked normally; no unusual privileges detected.
What to consider before installing
This skill appears to do what it says (edge-tts + OpenAI-style TTS), but the packaging is sloppy in ways you should address before trusting it with secrets or installing widely:
- The registry metadata does NOT declare the environment variables the SKILL.md and scripts use (OPENCLAW_WORKSPACE, TTS_API_URL, TTS_API_KEY). Treat that as an oversight and be cautious providing keys.
- The requirements.txt is incomplete: the scripts import 'av' (PyAV) and conversion to OGG/Opus requires native codecs (ffmpeg/libav, libopus). Install and test in a virtualenv or sandbox first, and install system packages (ffmpeg) separately.
- API mode will send text (and any API key you provide) to the api-url you supply — only use keys for endpoints you control or trust. The sample shows sk-xxx as an example; never paste real secrets into examples or chat without verifying context.
- edge-tts will contact Microsoft endpoints (downloads/voice data). If you need offline-only behavior, verify the voice assets are cached and acceptable for your environment.
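A quick preflight check can surface the dependency gaps described above before the scripts are run. This is a minimal sketch, assuming the import names `edge_tts`, `openai`, `aiohttp` and `av`; the helper names are hypothetical, and whether a system ffmpeg is actually needed depends on how PyAV was installed.

```python
import importlib.util
import shutil

# Three packages from requirements.txt plus PyAV ("av"), which the scan found unlisted.
REQUIRED_MODULES = ["edge_tts", "openai", "aiohttp", "av"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_on_path() -> bool:
    """Report whether an ffmpeg executable is discoverable on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    print("missing Python modules:", missing_modules() or "none")
    print("ffmpeg on PATH:", ffmpeg_on_path())
```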
If you plan to use this skill:
1) Review the two Python scripts locally to confirm behavior (they are short and readable).
2) Run pip install in an isolated virtualenv and manually install system deps (ffmpeg) before use.
3) Prefer passing API keys via CLI per-run rather than storing them in shared environment variables.
4) Consider contacting the skill owner or updating the registry metadata to declare required env vars and any native requirements.
These steps will reduce risk; the current inconsistencies make the package suspicious but not evidently malicious.
Like a lobster shell, security has layers: review code before you run it.
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
SKILL.md
Speech Synthesizer | 语音合成器 🔊
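Converts text to speech. Supports Microsoft neural TTS via edge-tts (free and needs no API key, but requires network access to Microsoft's service) and OpenAI-compatible APIs.
Overview
⚠️ Note: OpenClaw ships a built-in tts tool, but its output formats (MP3/WebM) are not suitable for sending directly as Feishu voice messages. For Feishu voice messages, use tts_simple.py, which automatically outputs OGG/Opus.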
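Supported TTS engines
| Engine | Description | Pros | Cons |
|---|---|---|---|
| edge ⭐ | Microsoft neural TTS | Free, high audio quality, supports Chinese, no API key | Requires network access to Microsoft's service |
| api | OpenAI-compatible API | High quality, wide choice of voices | Requires an API key |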
edge-tts voices (common selection)
Chinese (Mainland):
| Voice | Description |
|---|---|
| zh-CN-XiaoxiaoNeural | Xiaoxiao (female, default) |
| zh-CN-YunxiNeural | Yunxi (male) |
| zh-CN-YunyangNeural | Yunyang (male) |
| zh-CN-XiaoyiNeural | Xiaoyi (female) |
Chinese (Taiwan):
| Voice | Description |
|---|---|
| zh-TW-HsiaoYuNeural | HsiaoYu (female) |
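English:
| Voice | Description |
|---|---|
| en-US-JennyNeural | US English, female |
| en-US-GuyNeural | US English, male |
| en-GB-SoniaNeural | British English, female |
| en-GB-RyanNeural | British English, male |
| en-AU-NatashaNeural | Australian English, female |
| en-IN-NeerjaNeural | Indian English, female |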
Japanese:
| Voice | Description |
|---|---|
| ja-JP-NanamiNeural | Japanese, female |
| ja-JP-KeitaNeural | Japanese, male |
Korean:
| Voice | Description |
|---|---|
| ko-KR-SunHiNeural | Korean, female |
| ko-KR-InJoonNeural | Korean, male |
Other common voices:
| Voice | Language |
|---|---|
| fr-FR-DeniseNeural | French, female |
| de-DE-KatjaNeural | German, female |
| es-ES-ElviraNeural | Spanish, female |
| ru-RU-SvetlanaNeural | Russian, female |
| pt-BR-FranciscaNeural | Portuguese (Brazil), female |
💡 To see the full list:
python3 scripts/tts_edge.py "test" --list-voices
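The same voice list can also be queried from Python through the edge-tts library. This is a hedged sketch, not how tts_edge.py is known to work: `filter_voices` and `fetch_voices` are hypothetical helpers, and the `ShortName`, `Gender` and `Locale` keys are the fields that `edge_tts.list_voices()` returns at the time of writing.

```python
import asyncio


def filter_voices(voices, locale_prefix):
    """Keep (ShortName, Gender) pairs whose Locale starts with locale_prefix."""
    return sorted(
        (voice["ShortName"], voice["Gender"])
        for voice in voices
        if voice["Locale"].startswith(locale_prefix)
    )


async def fetch_voices():
    # Imported lazily so the filter works without edge-tts installed.
    import edge_tts

    return await edge_tts.list_voices()  # requires network access


if __name__ == "__main__":
    for name, gender in filter_voices(asyncio.run(fetch_voices()), "zh-"):
        print(name, gender)
```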
Quick start
1. Install dependencies
cd ~/.openclaw/workspace/skills/speech-synthesizer
pip install -r requirements.txt
2. Run (use this to generate Feishu voice messages)
# ⭐ tts_simple.py: outputs OGG/Opus, which can be sent directly as a Feishu voice message
python3 scripts/tts_simple.py "你好,这是测试语音"
# Use a specific voice (see the voice list)
python3 scripts/tts_simple.py "你好" --voice zh-CN-YunxiNeural
# Use the API
python3 scripts/tts_simple.py "你好" --engine api \
  --api-url https://api.openai.com/v1 \
  --api-key sk-xxx \
  --voice alloy
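For orientation, an API-mode request can be sketched with the official openai Python package. This is an assumption about what API mode amounts to, not the code in tts_simple.py: `build_speech_request` and `synthesize` are hypothetical names, and whether a given OpenAI-compatible server honours `response_format="opus"` is not guaranteed.

```python
def build_speech_request(text, voice="alloy", model="tts-1"):
    """Assemble keyword arguments for an OpenAI-style speech request."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # response_format="opus" asks the server for Opus audio directly.
    return {"model": model, "voice": voice, "input": text, "response_format": "opus"}


def synthesize(text, out_path, api_url, api_key, **kwargs):
    """Send the request and stream the audio to out_path (needs network access)."""
    from openai import OpenAI  # lazy import; requires the openai package

    client = OpenAI(base_url=api_url, api_key=api_key)
    request = build_speech_request(text, **kwargs)
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(out_path)
```

Keeping the request construction separate from the network call makes the parameters easy to inspect before any text or key leaves the machine.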
3. Adjust rate and pitch
# Rate +10% (slightly faster)
python3 scripts/tts_simple.py "快速播报" --rate "+10%"
# Rate -10% (slightly slower)
python3 scripts/tts_simple.py "慢速播报" --rate "-10%"
# Raise the pitch
python3 scripts/tts_simple.py "音调较高" --pitch "+5Hz"
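The rate and pitch values are signed strings, and a small helper can build and validate them before they reach edge-tts. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the patterns reflect the signed-percent and signed-Hz forms used in the examples above, and `speak` writes edge-tts's native output without converting it to OGG/Opus.

```python
import re

RATE_RE = re.compile(r"^[+-]\d+%$")    # e.g. "+10%", "-5%"
PITCH_RE = re.compile(r"^[+-]\d+Hz$")  # e.g. "+5Hz", "-3Hz"


def format_rate(percent: int) -> str:
    """Turn an integer offset into a signed percent string."""
    return f"{percent:+d}%"


def format_pitch(hz: int) -> str:
    """Turn an integer offset into a signed Hz string."""
    return f"{hz:+d}Hz"


def is_valid_rate(value: str) -> bool:
    return bool(RATE_RE.match(value))


def is_valid_pitch(value: str) -> bool:
    return bool(PITCH_RE.match(value))


async def speak(text, out_path, voice="zh-CN-XiaoxiaoNeural", rate="+0%", pitch="+0Hz"):
    """Synthesize with edge-tts (needs network access); run via asyncio.run(...)."""
    import edge_tts  # lazy import; requires the edge-tts package

    if not (is_valid_rate(rate) and is_valid_pitch(pitch)):
        raise ValueError("rate must look like +10% and pitch like +5Hz")
    await edge_tts.Communicate(text, voice, rate=rate, pitch=pitch).save(out_path)
```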
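Scripts
scripts/tts_simple.py ⭐ Recommended
General-purpose text-to-speech script. It automatically outputs OGG/Opus, which suits Feishu voice messages.
python3 scripts/tts_simple.py "text to convert" [options]
Parameters:
| Parameter | Description |
|---|---|
| text | Text to convert, or the path to a .txt file |
| --output, -o | Output file path |
| --engine, -e | Engine: edge (default) or api |
| --voice, -v | Voice name |
| --rate, -r | Speech rate, e.g. +10%, -5% (edge only) |
| --pitch, -p | Pitch, e.g. +5Hz, -3Hz (edge only) |
| --api-url | API URL (api mode) |
| --api-key | API key (api mode) |
| --api-model | API model (default: tts-1) |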
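The parameter table maps naturally onto an argparse parser. The sketch below mirrors the documented options only; it is an assumption about the interface, not the parser in tts_simple.py, and `build_parser` and `resolve_text` are hypothetical names.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser that mirrors the documented options."""
    parser = argparse.ArgumentParser(description="Text-to-speech (illustrative)")
    parser.add_argument("text", help="text to convert, or path to a .txt file")
    parser.add_argument("--output", "-o", help="output file path")
    parser.add_argument("--engine", "-e", choices=["edge", "api"], default="edge")
    parser.add_argument("--voice", "-v", help="voice name")
    parser.add_argument("--rate", "-r", help="speech rate, e.g. +10%% (edge only)")
    parser.add_argument("--pitch", "-p", help="pitch, e.g. +5Hz (edge only)")
    parser.add_argument("--api-url", help="API URL (api mode)")
    parser.add_argument("--api-key", help="API key (api mode)")
    parser.add_argument("--api-model", default="tts-1", help="API model")
    return parser


def resolve_text(arg: str) -> str:
    """Read the argument as a file only if it ends in .txt and exists."""
    path = Path(arg)
    if arg.lower().endswith(".txt") and path.is_file():
        return path.read_text(encoding="utf-8")
    return arg
```

With a parser built this way, a negative value passed as a separate token (for example `--rate "-10%"`) is read by argparse as an unknown option; the `--rate=-10%` form avoids that. Whether the real script is affected depends on how it parses its arguments.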
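scripts/tts_edge.py
An edge-tts-only script that outputs MP3 (not suitable for sending directly as a Feishu voice message).
# List all voices
python3 scripts/tts_edge.py "test" --list-voices
# Generate speech
python3 scripts/tts_edge.py "你好" -o output.mp3 --voice zh-CN-XiaoxiaoNeural
💡 To send a Feishu voice message, use tts_simple.py.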
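Voice list
edge-tts supports more than 100 voices. List them with:
python3 scripts/tts_edge.py "test" --list-voices
Quick reference for common voices:
| Language | Code | Voice |
|---|---|---|
| Chinese, female | zh-CN-XiaoxiaoNeural | Xiaoxiao (default) |
| Chinese, male | zh-CN-YunxiNeural | Yunxi |
| US English, female | en-US-JennyNeural | Jenny |
| US English, male | en-US-GuyNeural | Guy |
| British English, female | en-GB-SoniaNeural | Sonia |
| Japanese, female | ja-JP-NanamiNeural | Nanami |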
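Output format
Important: Feishu voice messages require OGG/Opus, so you must use tts_simple.py.
| Script | Output format | Use case |
|---|---|---|
| tts_simple.py ⭐ | OGG/Opus | Feishu voice messages (send directly) |
| tts_edge.py | MP3 | General use (must be converted before sending as a Feishu voice message) |
tts_simple.py automatically converts the webm output from edge-tts to OGG/Opus, specifically to suit Feishu voice messages.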
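Because the container format matters here, it can help to confirm that a generated file really is OGG/Opus before sending it. The sketch below is a heuristic, not part of the skill: an Ogg stream starts with the bytes `OggS`, and an Opus stream carries an `OpusHead` packet in its first page. The function names are hypothetical.

```python
def looks_like_ogg_opus(data: bytes) -> bool:
    """Heuristic: Ogg capture pattern at offset 0 and OpusHead within the first page."""
    return data[:4] == b"OggS" and b"OpusHead" in data[:64]


def file_is_ogg_opus(path) -> bool:
    """Apply the heuristic to the first 64 bytes of a file."""
    with open(path, "rb") as handle:
        return looks_like_ogg_opus(handle.read(64))
```

An MP3 file (often starting with `ID3`) or an Ogg Vorbis file fails this check, which catches the most likely mix-ups.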
Output directory
Results are saved under projects/tts/ in the workspace:
~/.openclaw/workspace/projects/tts/
└── output/
    └── tts_20260401_193000.ogg  # OGG/Opus (use this for Feishu voice messages)
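The file name in the tree follows a timestamp pattern, which can be reproduced as follows. This is a sketch under assumptions: the `tts_YYYYMMDD_HHMMSS.ogg` pattern is inferred from the single example name, the fallback to `~/.openclaw/workspace` is inferred from the path shown, and `default_output_path` is a hypothetical helper.

```python
import os
from datetime import datetime
from pathlib import Path


def default_output_path(now=None, workspace=None) -> Path:
    """Build <workspace>/projects/tts/output/tts_YYYYMMDD_HHMMSS.ogg."""
    root = Path(
        workspace
        or os.environ.get("OPENCLAW_WORKSPACE")
        or Path.home() / ".openclaw" / "workspace"
    )
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return root / "projects" / "tts" / "output" / f"tts_{stamp}.ogg"
```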
Environment variables
| Variable | Description |
|---|---|
| OPENCLAW_WORKSPACE | Workspace root directory |
| TTS_API_URL | OpenAI-compatible API URL |
| TTS_API_KEY | API key |
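These variables and the corresponding command-line options can be combined with a simple precedence rule. The sketch assumes that explicit CLI values win over the environment; the document does not state the actual precedence, and `resolve_api_config` is a hypothetical helper.

```python
import os


def resolve_api_config(cli_url=None, cli_key=None, env=None):
    """Prefer explicit CLI values, then fall back to TTS_API_URL / TTS_API_KEY."""
    env = os.environ if env is None else env
    url = cli_url or env.get("TTS_API_URL")
    key = cli_key or env.get("TTS_API_KEY")
    if not url or not key:
        raise ValueError("API mode needs both an API URL and an API key")
    return url, key
```

Passing `env` explicitly makes the function easy to exercise without touching the real environment.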
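Troubleshooting
edge-tts download fails
# Check network connectivity
curl -I https://www.bing.com
# edge-tts needs to reach Microsoft's service
# If a proxy interferes, clear the proxy environment variables
unset all_proxy ALL_PROXY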
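Before unsetting anything, it can be useful to see which proxy variables are actually set. This is a small illustrative helper; the list of variable names beyond all_proxy and ALL_PROXY is an assumption about what commonly affects HTTP clients.

```python
import os

# all_proxy / ALL_PROXY come from the command above; the others are common additions.
PROXY_VARS = (
    "all_proxy", "ALL_PROXY",
    "http_proxy", "HTTP_PROXY",
    "https_proxy", "HTTPS_PROXY",
)


def active_proxy_vars(env=None):
    """Return the proxy-related variables that are set to a non-empty value."""
    env = os.environ if env is None else env
    return {name: env[name] for name in PROXY_VARS if env.get(name)}


if __name__ == "__main__":
    print(active_proxy_vars() or "no proxy variables set")
```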
API mode errors
- Confirm that the API URL is correct (it should include /v1)
- Confirm that the API key is valid
- Check the account balance
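The first check in the list can be automated. The sketch below only inspects the shape of the URL; it cannot tell whether the key is valid or the account has balance, and `check_base_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def check_base_url(api_url: str):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    parsed = urlparse(api_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("URL should look like https://host/v1")
    if not parsed.path.rstrip("/").endswith("/v1"):
        problems.append("path does not end with /v1")
    return problems
```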
Voice sounds unnatural
- Recommended for Chinese: zh-CN-XiaoxiaoNeural (Xiaoxiao)
- Adjust the rate: --rate "+10%" or --rate "-10%"
- Adjust the pitch: --pitch "+3Hz" or --pitch "-3Hz"
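Platform notes
edge-tts vs. other TTS options
| Option | Cost | Audio quality | Chinese support | Offline | API key |
|---|---|---|---|---|---|
| edge-tts | Free | High | Very good | No (needs Microsoft's service) | Not required |
| OpenAI TTS | Pay per use | Very high | Fair | No | Required |
| pyttsx3 | Free | Low | Fair | Yes | Not required |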
Known limitations
- edge-tts depends on Microsoft's service and needs to be able to reach edge.microsoft.com
- Some voices may be unavailable in certain regions
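Directory structure
speech-synthesizer/
├── SKILL.md              # This document
├── requirements.txt      # Python dependencies
├── scripts/
│   ├── tts_simple.py     # General-purpose TTS script (⭐ recommended)
│   └── tts_edge.py       # edge-tts-only script
└── models/               # Model directory (for local models, if needed)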
