Install
openclaw skills install openclaw-self-hosted-whisper
ClawHub Security found sensitive or high-impact capabilities. Review the scan results before using.
Transcribe audio via the self-hosted Whisper ASR instance running on Kubernetes. Use this skill whenever the user wants to transcribe audio files, convert speech to text, generate subtitles, or translate audio. Triggers on audio transcription, speech-to-text, whisper, voice-to-text, subtitle generation, or audio translation requests.
Transcribe an audio file via the Whisper ASR webservice at http://whisper-asr.whisper-asr.svc.cluster.local:9000.
Uses the onerahmet/openai-whisper-asr-webservice API (/asr endpoint).
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
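
For orientation, the basic call above presumably comes down to one multipart POST against /asr. The following is a minimal sketch under that assumption, not the contents of transcribe.sh. The helper names asr_endpoint and asr_post and the WHISPER_URL variable are hypothetical. The audio_file form field and the task/output query parameters are assumed from the upstream webservice's documented API.

```shell
#!/usr/bin/env bash
# Sketch only: the real transcribe.sh is not shown in this document.
# Assumed from the upstream webservice docs: multipart field "audio_file",
# query parameters "task" and "output".

# WHISPER_URL is a hypothetical override; the default is the service URL from this page.
WHISPER_URL="${WHISPER_URL:-http://whisper-asr.whisper-asr.svc.cluster.local:9000}"

# asr_endpoint (hypothetical helper): print the /asr URL for a task and output format.
# Falls back to the documented defaults: task "transcribe", output "txt".
asr_endpoint() {
  local task="${1:-transcribe}" output="${2:-txt}"
  printf '%s/asr?task=%s&output=%s\n' "$WHISPER_URL" "$task" "$output"
}

# asr_post (hypothetical helper): upload one audio file and print the response body.
asr_post() {
  local file="$1"
  curl -sS --fail -X POST -F "audio_file=@${file}" "$(asr_endpoint transcribe txt)"
}
```

With the cluster service reachable, asr_post /path/to/audio.m4a would print the plain-text transcript to standard output.
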
Defaults:
Endpoint: http://whisper-asr.whisper-asr.svc.cluster.local:9000/asr
Task: transcribe
Output format: txt

{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --language en --out /tmp/transcript.txt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --language de
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --json --out /tmp/transcript.json
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --output srt --out /tmp/subtitles.srt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --output vtt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --translate
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --vad-filter --json
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --word-timestamps --json

--output formats: txt, json, vtt, srt, tsv
--translate produces an English transcript regardless of source language
--vad-filter enables voice activity detection to skip silent sections
--word-timestamps adds word-level timing (use with --json)

API documentation: http://whisper-asr.whisper-asr.svc.cluster.local:9000/docs
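
The flags above can be read as a mapping onto /asr query parameters. The sketch below shows one plausible mapping; it is not taken from transcribe.sh. The helper name asr_query is hypothetical. The parameter names task, language, output, vad_filter, and word_timestamps are assumed from the upstream webservice's API, as is the reading of --json as shorthand for --output json.

```shell
#!/usr/bin/env bash
# Sketch only: maps this skill's flags onto /asr query parameters.
# Parameter names are assumed from the upstream webservice docs,
# not read from transcribe.sh itself.
# --out is a local file option, not part of the request, so it is left out here.

# asr_query (hypothetical helper): print the query string for a set of flags.
asr_query() {
  local task="transcribe" output="txt" extra=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --translate)       task="translate" ;;
      --json)            output="json" ;;
      --vad-filter)      extra="${extra}&vad_filter=true" ;;
      --word-timestamps) extra="${extra}&word_timestamps=true" ;;
      --output|--language)
        # Both flags take a value; refuse to continue if it is missing.
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        if [ "$1" = "--output" ]; then
          output="$2"
        else
          extra="${extra}&language=$2"
        fi
        shift ;;
      *) echo "unknown flag: $1" >&2; return 2 ;;
    esac
    shift
  done
  printf 'task=%s&output=%s%s\n' "$task" "$output" "$extra"
}
```

The result can be appended to the endpoint URL, for example "http://whisper-asr.whisper-asr.svc.cluster.local:9000/asr?$(asr_query --output srt)". The live parameter list is best checked against the service's /docs page before relying on this mapping.
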