K8s Self-Hosted Whisper API

v0.1.0

Transcribe audio via the self-hosted Whisper ASR instance running on Kubernetes. Use this skill whenever the user wants to transcribe audio files, convert sp...

License: MIT-0

Install

openclaw skills install openclaw-self-hosted-whisper

Self-Hosted Whisper API (curl)

Transcribe an audio file via the Whisper ASR webservice at http://whisper-asr.whisper-asr.svc.cluster.local:9000.

Uses the onerahmet/openai-whisper-asr-webservice API (/asr endpoint).

Quick start

{baseDir}/scripts/transcribe.sh /path/to/audio.m4a

Defaults:

  • Endpoint: http://whisper-asr.whisper-asr.svc.cluster.local:9000/asr
  • Task: transcribe
  • Output: txt
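The defaults above map onto the webservice's `/asr` query parameters (`task`, `language`, `output`), with the audio sent as a multipart upload. A minimal sketch of how the request could be assembled with curl; the `build_asr_url` helper is illustrative only and not part of the shipped script:

```shell
# build_asr_url: assemble an /asr request URL from task/output/language.
# (Illustrative helper, not part of transcribe.sh.)
build_asr_url() {
  local base="http://whisper-asr.whisper-asr.svc.cluster.local:9000/asr"
  local task="${1:-transcribe}" output="${2:-txt}" lang="$3"
  local url="${base}?task=${task}&output=${output}"
  # language is optional; the server auto-detects when it is omitted
  [ -n "$lang" ] && url="${url}&language=${lang}"
  echo "$url"
}

# The transcription itself is then a multipart POST of the audio file:
# curl -sS -X POST "$(build_asr_url transcribe txt en)" \
#   -F "audio_file=@/path/to/audio.m4a" \
#   -o /tmp/transcript.txt
```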

Useful flags

{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --language en --out /tmp/transcript.txt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --language de
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --json --out /tmp/transcript.json
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --output srt --out /tmp/subtitles.srt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --output vtt
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --translate
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --vad-filter --json
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --word-timestamps --json

Notes

  • Supported --output formats: txt, json, vtt, srt, tsv
  • --translate produces an English transcript regardless of source language
  • --vad-filter enables voice activity detection to skip silent sections
  • --word-timestamps adds word-level timing (use with --json)
  • The model is configured on the server side (ASR_MODEL env var), not per request
  • Swagger docs available at http://whisper-asr.whisper-asr.svc.cluster.local:9000/docs
  • No authentication required
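With --json, the response is assumed to follow Whisper's usual output schema: a top-level text field plus a segments array with start/end times (word-level entries appear inside each segment when --word-timestamps is set). A sketch of pulling fields out with jq, using an inline sample payload rather than real server output:

```shell
# Sample payload in the assumed schema (not real server output).
cat > /tmp/transcript.json <<'EOF'
{"text":"hello world","segments":[{"start":0.0,"end":1.2,"text":"hello world"}]}
EOF

# Full transcript text:
jq -r '.text' /tmp/transcript.json

# One line per segment with its time span:
jq -r '.segments[] | "\(.start)-\(.end): \(.text)"' /tmp/transcript.json
```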

Version tags

latest → vk972zcb30eh6v71s0jhx8gbacs8223xf

Runtime requirements

🎙️ Clawdis
Bins: curl