Install
openclaw skills install sst-simple
Local speech-to-text using OpenAI Whisper. Use when the user needs to: (1) transcribe audio files to text, (2) convert voice messages to written content, (3)...
/root/.openclaw/venv/stt-simple/bin/whisper --version
If Whisper is not installed, run the install script first:
/root/.openclaw/workspace/skills/stt-simple/scripts/install.sh
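The check-then-install step above can also be scripted. The following is a minimal sketch that only tests whether the Whisper executable exists at the path given in this section; `needs_install` is a hypothetical helper name and not part of the skill.

```python
from pathlib import Path

# Paths copied from this section; adjust them if your install differs.
WHISPER_BIN = Path("/root/.openclaw/venv/stt-simple/bin/whisper")
INSTALL_SCRIPT = Path("/root/.openclaw/workspace/skills/stt-simple/scripts/install.sh")


def needs_install(whisper_bin=WHISPER_BIN):
    """Return True when the Whisper executable is missing (hypothetical helper)."""
    return not Path(whisper_bin).is_file()


if __name__ == "__main__":
    # Only report what to do next; nothing is executed here.
    if needs_install():
        print(f"Whisper not found, run: {INSTALL_SCRIPT}")
    else:
        print(f"Whisper found at: {WHISPER_BIN}")
```

The sketch prints the next step rather than running the install script, so it is safe to try on any machine.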
| Model | Parameters | Speed | Accuracy | Recommended For |
|---|---|---|---|---|
| tiny | 39 M | ⚡⚡⚡ | ⭐⭐⭐ | Quick testing |
| base | 74 M | ⚡⚡ | ⭐⭐⭐⭐ | Daily use |
| small | 244 M | ⚡ | ⭐⭐⭐⭐⭐ | Default |
| medium | 769 M | 🐌 | ⭐⭐⭐⭐⭐ | High accuracy |
| large | 1550 M | 🐌🐌 | ⭐⭐⭐⭐⭐+ | Best quality |
Method A: Use Whisper CLI
/root/.openclaw/venv/stt-simple/bin/whisper <audio_file> --model small --language Chinese
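When the CLI is called from another program, it can help to build the argument list in one place. The sketch below assumes the binary path shown above; `whisper_command` is a hypothetical helper, and `--output_format` is the Whisper CLI option for choosing the result format. It only builds the command and does not run it.

```python
import shlex

# Path copied from this section.
WHISPER_BIN = "/root/.openclaw/venv/stt-simple/bin/whisper"


def whisper_command(audio_file, model="small", language=None, output_format=None):
    """Build the Whisper CLI argument list (hypothetical helper).

    language=None leaves out --language, so Whisper auto-detects the language.
    """
    cmd = [WHISPER_BIN, audio_file, "--model", model]
    if language is not None:
        cmd += ["--language", language]
    if output_format is not None:
        cmd += ["--output_format", output_format]
    return cmd


if __name__ == "__main__":
    # Show the shell-quoted command line instead of executing it.
    print(shlex.join(whisper_command("voice.ogg", language="Chinese")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the skill is installed.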
Method B: Use Python Script (Recommended for Multi-Agent)
# Without session isolation
/root/.openclaw/venv/stt-simple/bin/python \
/root/.openclaw/workspace/skills/stt-simple/scripts/stt_simple.py \
<audio_file> small zh
# With session isolation (multi-agent scenarios)
/root/.openclaw/venv/stt-simple/bin/python \
/root/.openclaw/workspace/skills/stt-simple/scripts/stt_simple.py \
<audio_file> small zh agent-main-whatsapp
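The two invocations above differ only in the optional fourth argument. A minimal sketch of that calling convention follows; the positional order is taken from the commands above, while `stt_script_command` is a hypothetical helper name. It only assembles the argument list.

```python
# Paths copied from this section.
PYTHON_BIN = "/root/.openclaw/venv/stt-simple/bin/python"
STT_SCRIPT = "/root/.openclaw/workspace/skills/stt-simple/scripts/stt_simple.py"


def stt_script_command(audio_file, model="small", language="zh", session_id=None):
    """Build the stt_simple.py argument list (hypothetical helper).

    Positional order as shown above: <audio_file> <model> <language> [session_id].
    """
    cmd = [PYTHON_BIN, STT_SCRIPT, audio_file, model, language]
    if session_id:
        # The session ID is optional; leaving it out means no session isolation.
        cmd.append(session_id)
    return cmd


if __name__ == "__main__":
    print(stt_script_command("voice.ogg", session_id="agent-main-whatsapp"))
```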
With a session ID, results are saved to:
/root/.openclaw/workspace/stt_output/<session_id>/<filename>_<timestamp>.txt

| Language | Code | Alias |
|---|---|---|
| Chinese | zh | Chinese |
| English | en | English |
| Japanese | ja | Japanese |
| Korean | ko | Korean |
| French | fr | French |
| German | de | German |
| Spanish | es | Spanish |
Auto-detect: omit the --language parameter.
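Since each language can be given either as a code or as an alias, callers may want to normalize the input first. The sketch below uses only the pairs from the table above; `normalize_language` is a hypothetical helper, and rejecting languages outside the table is a simplification of this sketch, not a documented limit of the skill.

```python
# Code/alias pairs copied from the language table above.
LANGUAGE_ALIASES = {
    "zh": "Chinese",
    "en": "English",
    "ja": "Japanese",
    "ko": "Korean",
    "fr": "French",
    "de": "German",
    "es": "Spanish",
}


def normalize_language(value):
    """Map a code or alias to its code; None or empty means auto-detect."""
    if value is None or not value.strip():
        return None
    wanted = value.strip().lower()
    for code, alias in LANGUAGE_ALIASES.items():
        if wanted in (code, alias.lower()):
            return code
    raise ValueError(f"language not in the table: {value!r}")


if __name__ == "__main__":
    for raw in ("Chinese", "EN", None):
        print(repr(raw), "->", repr(normalize_language(raw)))
```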
.txt - Plain text (default)
.json - Full results (with timestamps, confidence)
.srt - Subtitle format (for videos)
.vtt - WebVTT (for web)
/root/.openclaw/venv/stt-simple/bin/whisper --version
rm -rf /root/.openclaw/venv/stt-simple
/root/.openclaw/workspace/skills/stt-simple/scripts/install.sh
/root/.openclaw/venv/stt-simple/bin/python \
-c "import whisper; whisper.load_model('small')"
| File | Path | Purpose |
|---|---|---|
| Install script | scripts/install.sh | One-click install of venv, dependencies, and models |
| Python script | scripts/stt_simple.py | Simplified transcription API with JSON output |
/root/.openclaw/venv/stt-simple/bin/whisper \
/root/.openclaw/media/inbound/voice.ogg \
--model small --language Chinese
/root/.openclaw/venv/stt-simple/bin/whisper \
meeting.wav --model medium --language en
/root/.openclaw/venv/stt-simple/bin/python \
/root/.openclaw/workspace/skills/stt-simple/scripts/stt_simple.py \
audio.ogg small zh
# Jari (WhatsApp) - outputs to /root/.openclaw/workspace/stt_output/agent-jari-whatsapp/
/root/.openclaw/venv/stt-simple/bin/python \
/root/.openclaw/workspace/skills/stt-simple/scripts/stt_simple.py \
voice_a.ogg small zh agent-jari-whatsapp
# Other Agent (Telegram) - outputs to /root/.openclaw/workspace/stt_output/agent-telegram/
/root/.openclaw/venv/stt-simple/bin/python \
/root/.openclaw/workspace/skills/stt-simple/scripts/stt_simple.py \
voice_b.ogg small zh agent-telegram
When multiple agents use the STT feature at the same time:
| Agent / Scenario | Recommended session_id | Output Directory |
|---|---|---|
| Jari (WhatsApp) | agent-jari-whatsapp | stt_output/agent-jari-whatsapp/ |
| Eric (WhatsApp) | agent-eric-whatsapp | stt_output/agent-eric-whatsapp/ |
| Telegram Agent | agent-telegram | stt_output/agent-telegram/ |
| Temporary session | session-<uuid> | stt_output/session-<uuid>/ |
| Per-user | user-<user_id> | stt_output/user-<user_id>/ |
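The naming patterns in the table can be generated rather than typed by hand. This is a sketch under stated assumptions: the three helper names are hypothetical, and the lowercasing and hyphen clean-up in `_slug` is an added safety step that the skill itself does not document.

```python
import re
import uuid


def _slug(text):
    """Lowercase and replace runs of other characters with a hyphen (assumed clean-up)."""
    return re.sub(r"[^a-z0-9]+", "-", str(text).lower()).strip("-")


def agent_session_id(agent, channel=None):
    """agent-<name>[-<channel>], for example agent-jari-whatsapp."""
    parts = ["agent", agent] + ([channel] if channel else [])
    return "-".join(_slug(p) for p in parts)


def temp_session_id():
    """session-<uuid> for a temporary session."""
    return f"session-{uuid.uuid4()}"


def user_session_id(user_id):
    """user-<user_id> for a per-user directory."""
    return f"user-{_slug(user_id)}"


if __name__ == "__main__":
    print(agent_session_id("Jari", "WhatsApp"))
    print(temp_session_id())
    print(user_session_id(42))
```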
<audio_filename>_<unique_timestamp>.txt
For example:
voice_a_3f8b2c1d.txt
meeting_9a4e7f2b.txt
Each filename includes a unique timestamp suffix, so repeated transcriptions of the same audio never overwrite earlier results.
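To make the naming scheme concrete, here is a sketch of how such an output path could be assembled. How stt_simple.py actually derives its suffix is not shown in this document, so the 8 hex characters taken from a UUID are a stand-in, and `output_path` is a hypothetical helper. Writing to the root directory when no session ID is given is also an assumption.

```python
import uuid
from pathlib import Path

# Root directory copied from this section.
OUTPUT_ROOT = Path("/root/.openclaw/workspace/stt_output")


def output_path(audio_file, session_id=None, root=OUTPUT_ROOT):
    """Return <root>/[<session_id>/]<audio_filename>_<suffix>.txt (hypothetical helper)."""
    stem = Path(audio_file).stem
    # Assumption: a random 8-character hex string stands in for the script's unique suffix.
    suffix = uuid.uuid4().hex[:8]
    directory = Path(root) / session_id if session_id else Path(root)
    return directory / f"{stem}_{suffix}.txt"


if __name__ == "__main__":
    print(output_path("voice_a.ogg", "agent-jari-whatsapp"))
    print(output_path("meeting.wav"))
```

Because the suffix changes on every call, transcribing the same audio twice yields two different file names in the same session directory.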
Results are saved to /root/.openclaw/workspace/stt_output/
Current Session ID:
agent-jari-whatsapp
Output Directory:
/root/.openclaw/workspace/stt_output/agent-jari-whatsapp/
Quick Start:
# Transcribe the current WhatsApp voice message
/root/.openclaw/venv/stt-simple/bin/python \
/root/.openclaw/workspace/skills/stt-simple/scripts/stt_simple.py \
<audio_file> small zh agent-jari-whatsapp