Install
openclaw skills install whisperx
WhisperX is a local speech-to-text skill for OpenClaw, built on OpenAI Whisper. It runs fully offline with no API key required, provides high-quality recognition with word-level timestamps and optional speaker diarization, and is up to 30x faster than standard OpenAI Whisper.
# Install ffmpeg (macOS)
brew install ffmpeg
# Install ffmpeg (Ubuntu/Debian)
sudo apt-get install ffmpeg
# Install WhisperX
pip install whisperx
# or run it without a permanent install, using uvx:
uvx whisperx
GPU users: ensure CUDA 12.8 is installed for faster inference.
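After installing, it can help to confirm that both tools are reachable before transcribing. The following is a minimal sketch, not part of the skill itself: `check_tool` is a hypothetical helper name, and the check only verifies that the commands are on `PATH`, not that they work or that CUDA is set up.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: succeeds if the named command is on PATH.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of the two commands this skill depends on.
for tool in ffmpeg whisperx; do
    if check_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

If either line reports "missing", rerun the matching install command above.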
# Basic transcription (auto-detect language)
whisperx path/to/audio.wav
# Specify model and language
whisperx --model small --language zh path/to/audio.wav
# CPU mode (low memory)
whisperx --model small --device cpu --compute_type int8 path/to/audio.wav
Requirements: whisperx, ffmpeg
Models: downloaded to ~/.cache/whisper/ on first run
Recommended models: base or small for CPU; large-v3 for GPU
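The CPU and GPU recommendations above can be wrapped in a small helper so the same command works on either kind of machine. This is a sketch under stated assumptions: `whisperx_flags` and `detect_device` are hypothetical names, a working `nvidia-smi` is taken as a rough sign that a CUDA GPU is usable, and the GPU branch picks large-v3 as this document suggests.

```shell
#!/bin/sh
# Print whisperx flags for a device ("cpu" or "cuda"), following the
# recommendations in this document: small + int8 on CPU, large-v3 on GPU.
whisperx_flags() {
    case "$1" in
        cpu)  echo "--model small --device cpu --compute_type int8" ;;
        cuda) echo "--model large-v3 --device cuda" ;;
        *)    echo "unknown device: $1" >&2; return 1 ;;
    esac
}

# Assumption: if nvidia-smi exists and runs cleanly, a CUDA GPU is available.
detect_device() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
        echo cuda
    else
        echo cpu
    fi
}

# Example use (left commented so sourcing this file runs nothing):
# whisperx $(whisperx_flags "$(detect_device)") path/to/audio.wav
```

The flag string is left unquoted in the example on purpose so the shell splits it into separate arguments.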