Install
openclaw skills install voice-to-text
Convert voice messages and audio files to text using Vosk, an offline speech recognition toolkit. Use when a user sends a voice message, audio file, or asks to transcribe speech to text.
Install dependencies:
# macOS
brew install ffmpeg
pip install vosk
# Linux
apt-get install ffmpeg
pip install vosk
Download a Vosk model:
mkdir -p ~/.vosk/models && cd ~/.vosk/models
# Chinese (small, fast)
curl -LO https://alphacephei.com/vosk/models/vosk-model-small-cn-0.22.zip
unzip vosk-model-small-cn-0.22.zip
# English (small)
curl -LO https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip
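The download-and-unzip steps above can also be scripted. The following is a minimal Python sketch, not part of the skill itself: the helper names `model_url` and `ensure_model` are hypothetical, and it assumes each archive unpacks into a top-level directory named after the model.

```python
import urllib.request
import zipfile
from pathlib import Path

BASE_URL = "https://alphacephei.com/vosk/models"
DEFAULT_MODELS_DIR = Path.home() / ".vosk" / "models"


def model_url(name: str) -> str:
    """Build the download URL for a model archive, e.g. vosk-model-small-cn-0.22."""
    return f"{BASE_URL}/{name}.zip"


def ensure_model(name: str, models_dir: Path = DEFAULT_MODELS_DIR) -> Path:
    """Return the model directory, downloading and unzipping it only if missing."""
    models_dir = Path(models_dir)
    target = models_dir / name
    if target.is_dir():
        # Already unpacked: skip the network round trip.
        return target
    models_dir.mkdir(parents=True, exist_ok=True)
    archive = models_dir / f"{name}.zip"
    urllib.request.urlretrieve(model_url(name), archive)
    with zipfile.ZipFile(archive) as zf:
        # Assumption: the archive contains a single top-level folder called `name`.
        zf.extractall(models_dir)
    return target
```

Calling `ensure_model("vosk-model-small-en-us-0.15")` would then mirror the `curl` and `unzip` commands for the small English model.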
When the user provides a voice message or audio file path, run the transcription:
python3 ~/skills/voice-to-text/transcribe.py "<audio_file_path>"
To use a specific model, set the VOSK_MODEL_PATH environment variable to its unzipped directory (the model must already be downloaded):
VOSK_MODEL_PATH=~/.vosk/models/vosk-model-cn-0.22 python3 ~/skills/voice-to-text/transcribe.py "<audio_file_path>"
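The contents of `transcribe.py` are not shown here, so the following is only a sketch of how such a script might work, not the skill's actual implementation. It assumes ffmpeg decodes the input to 16 kHz mono 16-bit PCM, that the model directory comes from `VOSK_MODEL_PATH`, and that the small English model is the fallback. The function names and the fallback choice are hypothetical.

```python
import json
import os
import subprocess
from pathlib import Path

SAMPLE_RATE = 16000  # assumption: decode everything to 16 kHz mono PCM


def ffmpeg_command(audio_path, sample_rate=SAMPLE_RATE):
    """ffmpeg invocation that writes raw 16-bit mono PCM to stdout."""
    return ["ffmpeg", "-loglevel", "quiet", "-i", str(audio_path),
            "-ar", str(sample_rate), "-ac", "1", "-f", "s16le", "-"]


def resolve_model_path(env=None):
    """Read VOSK_MODEL_PATH, falling back to a hypothetical default model."""
    env = os.environ if env is None else env
    default = Path.home() / ".vosk" / "models" / "vosk-model-small-en-us-0.15"
    return Path(env.get("VOSK_MODEL_PATH", default)).expanduser()


def transcribe(audio_path):
    """Stream decoded audio into a Vosk recognizer and return the joined text."""
    # Imported lazily so the helpers above work without vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(str(resolve_model_path()))
    recognizer = KaldiRecognizer(model, SAMPLE_RATE)
    pieces = []
    with subprocess.Popen(ffmpeg_command(audio_path),
                          stdout=subprocess.PIPE) as proc:
        while True:
            chunk = proc.stdout.read(4000)
            if not chunk:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if recognizer.AcceptWaveform(chunk):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
    # Flush whatever audio is still buffered in the recognizer.
    pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

Piping audio through ffmpeg means the recognizer always receives the same raw format, regardless of whether the input is `.ogg`, `.mp3`, or `.wav`.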
| Model | Language | Size | Notes |
|---|---|---|---|
| vosk-model-small-cn-0.22 | Chinese | 42M | Fast, good accuracy |
| vosk-model-cn-0.22 | Chinese | 1.3G | High accuracy |
| vosk-model-small-en-us-0.15 | English | 40M | Fast, good accuracy |
| vosk-model-en-us-0.22 | English | 1.8G | High accuracy |
Download models from: https://alphacephei.com/vosk/models
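As an illustration of choosing between the models in the table, the sketch below maps a language and size preference to a model directory and builds an environment with `VOSK_MODEL_PATH` set. The `MODELS` keys (`"zh"`, `"en"`, `"small"`, `"large"`) and the `model_env` helper are hypothetical names introduced for this example. Only the model names and the `~/.vosk/models` location come from this document.

```python
import os
from pathlib import Path

# Model names taken from the table above; the lookup keys are illustrative.
MODELS = {
    ("zh", "small"): "vosk-model-small-cn-0.22",
    ("zh", "large"): "vosk-model-cn-0.22",
    ("en", "small"): "vosk-model-small-en-us-0.15",
    ("en", "large"): "vosk-model-en-us-0.22",
}


def model_env(language, size="small",
              models_dir=Path.home() / ".vosk" / "models", base_env=None):
    """Return a copy of the environment with VOSK_MODEL_PATH set for the model."""
    name = MODELS[(language, size)]
    env = dict(os.environ if base_env is None else base_env)
    env["VOSK_MODEL_PATH"] = str(Path(models_dir) / name)
    return env
```

The result can be passed as `env=model_env("zh", "large")` to `subprocess.run` when launching the transcription script, which has the same effect as prefixing the command with `VOSK_MODEL_PATH=...`.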