Install
openclaw skills install openclaw-whisper-voice
Local Whisper speech-to-text for audio files and inbound voice notes on the OpenClaw Gateway host. Use when setting up local transcription for WhatsApp, Telegram, or other audio attachments; when configuring tools.media.audio with a CLI fallback instead of a cloud API; or when you need a reusable shell entrypoint that makes Whisper + ffmpeg work reliably on Linux.
Use this skill to make local Whisper transcription dependable on the OpenClaw Gateway host.
Run:
{baseDir}/scripts/install_local_whisper.sh
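Once the installer has finished, it can be worth confirming that the whisper and ffmpeg launchers actually landed in ~/.local/bin. The following is a minimal sketch, not part of the skill: the function name check_launchers is hypothetical, and it only tests that each file exists and is executable.

```shell
# check_launchers [BIN_DIR]
# Hypothetical helper: verifies the two launchers exist and are executable.
# BIN_DIR defaults to ~/.local/bin, the launcher location this skill uses.
check_launchers() {
  local bin_dir="${1:-$HOME/.local/bin}"
  local missing=0
  local name
  for name in whisper ffmpeg; do
    if [ -x "$bin_dir/$name" ]; then
      echo "ok: $bin_dir/$name"
    else
      echo "missing: $bin_dir/$name"
      missing=1
    fi
  done
  # Exit status 0 only when both launchers were found.
  return "$missing"
}
```

Calling check_launchers with no argument inspects ~/.local/bin; a non-zero status suggests re-running the installer.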
The installer:
- ~/.local
- openai-whisper
- imageio-ffmpeg
- ~/.local/bin/whisper and ~/.local/bin/ffmpeg launchers
Use the wrapper instead of raw whisper when reliability matters:
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --model tiny --stdout-only
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --task translate --format srt
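For more than one file, the wrapper can be driven from a small loop. This is a sketch under assumptions: transcribe_dir is a hypothetical helper, the wrapper path is passed in explicitly because {baseDir} is a placeholder, and the wrapper is assumed to exit non-zero on failure. It uses only the --model and --stdout-only flags shown above.

```shell
# transcribe_dir WRAPPER DIR [MODEL]
# Hypothetical helper: runs the wrapper on every .ogg file in DIR and writes
# the transcript next to the audio as NAME.txt. MODEL defaults to "base".
transcribe_dir() {
  local wrapper="$1" dir="$2" model="${3:-base}"
  local audio failed=0
  for audio in "$dir"/*.ogg; do
    # If the glob matched nothing, the literal pattern is skipped here.
    [ -e "$audio" ] || continue
    if "$wrapper" "$audio" --model "$model" --stdout-only > "${audio%.ogg}.txt"; then
      echo "done: $audio"
    else
      # Assumes a non-zero exit means the transcript is unusable.
      echo "failed: $audio" >&2
      rm -f "${audio%.ogg}.txt"
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as transcribe_dir /path/to/transcribe.sh /path/to/voice-notes tiny would leave one .txt file per voice note.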
Patch OpenClaw config so inbound audio uses the wrapper:
{
  tools: {
    media: {
      audio: {
        enabled: true,
        maxBytes: 20971520,
        timeoutSeconds: 120,
        models: [
          {
            type: "cli",
            command: "{baseDir}/scripts/transcribe.sh",
            args: ["{{MediaPath}}", "--model", "base", "--stdout-only"],
            timeoutSeconds: 120
          }
        ]
      }
    }
  }
}
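To exercise the wrapper outside the Gateway under roughly the same limits, the config values can be approximated by hand. This is only a sketch: how OpenClaw itself enforces maxBytes and timeoutSeconds, or substitutes {{MediaPath}}, is not documented here, so the functions below (within_max_bytes and run_cli_entry, both hypothetical names) are an assumption-based stand-in. It relies on the timeout command from GNU coreutils.

```shell
# Values mirrored from the config above.
MAX_BYTES=20971520      # 20 * 1024 * 1024 bytes
TIMEOUT_SECONDS=120

# within_max_bytes FILE
# Succeeds if FILE is no larger than MAX_BYTES; returns 2 if it cannot be read.
within_max_bytes() {
  local size
  size="$(wc -c < "$1")" || return 2
  [ "$size" -le "$MAX_BYTES" ]
}

# run_cli_entry WRAPPER MEDIA_PATH
# Approximates the "cli" model entry: MEDIA_PATH stands in for {{MediaPath}},
# followed by the same args as in the config.
run_cli_entry() {
  local wrapper="$1" media_path="$2"
  if ! within_max_bytes "$media_path"; then
    echo "skipped: $media_path is unreadable or exceeds $MAX_BYTES bytes" >&2
    return 1
  fi
  timeout "$TIMEOUT_SECONDS" "$wrapper" "$media_path" --model base --stdout-only
}
```

If run_cli_entry prints anything other than transcript text, the same noise would likely reach the Gateway, which is the case --stdout-only is meant to prevent.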
- tiny: fastest, weakest accuracy
- base: best default for chat voice notes
- small or larger: better accuracy, heavier CPU and RAM use
- Use --stdout-only for tools.media.audio so stdout is only transcript text.
- Use --format txt|srt|vtt|json for standalone file transcription.
- Models are cached in ~/.cache/whisper.
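The speed side of the model trade-off can be measured on your own host before settling on a default. The sketch below is illustrative only: time_models is a hypothetical helper, timings are whole seconds from date, and the first run of each model may include a one-time model download.

```shell
# time_models WRAPPER AUDIO [MODEL...]
# Hypothetical helper: runs the wrapper once per model and prints the
# wall-clock seconds each run took. Defaults to tiny, base, and small.
time_models() {
  local wrapper="$1" audio="$2"
  shift 2
  [ "$#" -gt 0 ] || set -- tiny base small
  local model start end
  for model in "$@"; do
    start="$(date +%s)"
    # Transcript text is discarded; only the elapsed time is of interest.
    if ! "$wrapper" "$audio" --model "$model" --stdout-only > /dev/null; then
      echo "$model failed" >&2
      continue
    fi
    end="$(date +%s)"
    echo "$model $((end - start))s"
  done
}
```

Running it twice on a typical voice note gives a fairer comparison, since the second pass does not pay for any download.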