Install
openclaw skills install hf-whisper-speech-to-text
Transcribe or translate audio files to text using a public Hugging Face Whisper Space over Gradio. Use when the user sends voice notes, audio attachments, meeting clips, podcasts, interviews, or any local audio file (.ogg, .mp3, .wav, .m4a, etc.) and wants a transcript, rough captions, or an English translation without relying on paid APIs first.
Use this skill to turn local audio files into text with a public Whisper-based endpoint.
Run:
python3 scripts/transcribe.py /path/to/file.ogg
Return the transcript as plain text. By default, the script also applies lightweight Chinese punctuation and sentence-breaking cleanup.
For machine-readable output:
python3 scripts/transcribe.py /path/to/file.ogg --json
To disable cleanup and keep the raw model text:
python3 scripts/transcribe.py /path/to/file.ogg --format raw
To force Chinese punctuation cleanup:
python3 scripts/transcribe.py /path/to/file.ogg --format zh
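The script's actual cleanup rules are not shown here, so the following is only a rough sketch of what "lightweight Chinese punctuation and sentence-breaking cleanup" could look like. The function names zh_cleanup and split_sentences, the punctuation table, and the CJK range are all assumptions made for illustration, not the script's real implementation.

```python
import re

# Basic CJK Unified Ideographs range (an illustrative choice, not exhaustive).
CJK = r"\u4e00-\u9fff"
# ASCII punctuation and its full-width Chinese counterpart (illustrative subset).
TO_FULL_WIDTH = {",": "，", "?": "？", "!": "！", ";": "；", ":": "："}


def zh_cleanup(text: str) -> str:
    """Hypothetical cleanup: normalize punctuation that follows Chinese text."""
    # Swap ASCII punctuation for its full-width form when it follows a CJK character.
    text = re.sub(rf"(?<=[{CJK}])[,?!;:]", lambda m: TO_FULL_WIDTH[m.group(0)], text)
    # Drop whitespace that trails full-width punctuation.
    text = re.sub(r"(?<=[，。？！；：])\s+", "", text)
    return text.strip()


def split_sentences(text: str) -> list:
    """Hypothetical sentence breaking: split after sentence-ending marks."""
    parts = re.split(r"(?<=[。？！])", text)
    return [p for p in parts if p]
```

Text with no Chinese characters passes through zh_cleanup unchanged, which is one plausible reason a cleanup step like this could be left on by default.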
For English translation instead of same-language transcription:
python3 scripts/transcribe.py /path/to/file.ogg --task translate
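When calling the helper from other Python code, the command lines above can be assembled programmatically. The sketch below is an assumption-laden wrapper: build_command and transcribe are hypothetical names, the relative script path assumes the skill directory is the working directory, and combining several flags in one invocation is assumed to work. The shape of the --json output is not documented here, so the parsed object is returned as-is.

```python
import json
import subprocess
import sys
from typing import List, Optional


def build_command(
    audio_path: str,
    task: Optional[str] = None,      # e.g. "translate"
    fmt: Optional[str] = None,       # e.g. "raw" or "zh"
    as_json: bool = False,
    script: str = "scripts/transcribe.py",
) -> List[str]:
    """Build the argument list for the transcription helper (hypothetical wrapper)."""
    cmd = [sys.executable, script, audio_path]
    if task:
        cmd += ["--task", task]
    if fmt:
        cmd += ["--format", fmt]
    if as_json:
        cmd.append("--json")
    return cmd


def transcribe(audio_path: str, **options):
    """Run the helper and return its output; parsed JSON when as_json=True."""
    cmd = build_command(audio_path, **options)
    # check=True raises CalledProcessError if the script exits with a non-zero status.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    if options.get("as_json"):
        return json.loads(result.stdout)
    return result.stdout.strip()
```

For example, transcribe("/path/to/file.ogg", task="translate") would run the same command as the translation line above and return its standard output as a string.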
scripts/transcribe.py on it.
The script:
Default endpoint:
https://hf-audio-whisper-large-v3-turbo.hf.space
Override it with:
python3 scripts/transcribe.py input.ogg --space https://your-space.hf.space
or set:
export HF_WHISPER_SPACE=https://your-space.hf.space
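One plausible way these three sources could be reconciled is sketched below. The precedence (the --space flag first, then HF_WHISPER_SPACE, then the default) is an assumption, as are the function name resolve_space and the trailing-slash trimming; the script itself may resolve the endpoint differently.

```python
import os
from typing import Mapping, Optional

DEFAULT_SPACE = "https://hf-audio-whisper-large-v3-turbo.hf.space"


def resolve_space(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the endpoint URL: CLI value, then environment, then default (assumed order)."""
    env = os.environ if env is None else env
    chosen = cli_value or env.get("HF_WHISPER_SPACE") or DEFAULT_SPACE
    # Trim a trailing slash so later path joins do not produce "//".
    return chosen.rstrip("/")
```

Passing env explicitly keeps the function easy to test without touching the real process environment.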
Prefer to return:
scripts/transcribe.py — public Whisper transcription helper