Speechmatics (batch transcription)
Transcribe an audio file via Speechmatics' async batch API. Submits a job, polls until complete, then writes the transcript.
Quick start
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
Defaults:
- Language: en
- Operating point: enhanced (better accuracy; use standard for faster/cheaper)
- Output: <input>.txt in the same directory
- Poll interval: 3s, timeout: 600s
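The submit, poll, write flow hinges on a loop that checks the job status at a fixed interval and gives up after a timeout. The following is a minimal sketch of such a loop, not the actual contents of transcribe.sh: the function name poll_job, its arguments, and its return codes are hypothetical, and the terminal status names (done, rejected, deleted, expired) are assumed from the Speechmatics job status field.

```shell
# Hypothetical helper: poll a status command until the job finishes.
# usage: poll_job <status_cmd> [interval_s] [timeout_s]
# <status_cmd> is any command or function that prints the current status.
poll_job() {
  local status_cmd="$1" interval="${2:-3}" timeout="${3:-600}"
  local elapsed=0 state
  while :; do
    # Ask for the current status; abort if the status command itself fails.
    state=$("$status_cmd") || return 1
    case "$state" in
      "done") return 0 ;;
      rejected|deleted|expired)
        echo "job ended with status: $state" >&2
        return 1 ;;
    esac
    # Any other status (e.g. running) means: keep waiting until the timeout.
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s" >&2
      return 2
    fi
    sleep "$interval"
    elapsed=$(( elapsed + interval ))
  done
}

# Possible usage, assuming curl and jq are installed and JOB_ID is known:
# job_status() {
#   curl -sS -H "Authorization: Bearer $SPEECHMATICS_API_KEY" \
#     "$SPEECHMATICS_BASE_URL/jobs/$JOB_ID" | jq -r '.job.status'
# }
# poll_job job_status 3 600
```

Passing the status check in as a command keeps the loop independent of the HTTP client, so it can be exercised with a stub.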
Useful flags
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --language da
{baseDir}/scripts/transcribe.sh /path/to/meeting.wav --operating-point standard
{baseDir}/scripts/transcribe.sh /path/to/call.mp3 --diarization speaker
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --format json --out /tmp/transcript.json
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --format srt --out /tmp/subs.srt
{baseDir}/scripts/transcribe.sh /path/to/long.wav --timeout 1800
Formats: txt (default, plain text), json (Speechmatics json-v2 with word timings), srt (subtitles).
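To illustrate how these flags could map onto a batch request, here is a hedged sketch. The helper names build_job_config and transcript_format are hypothetical and not taken from transcribe.sh; the JSON field names (type, transcription_config, language, operating_point, diarization) and the json-v2 format value follow the public Speechmatics batch API as far as I know, so check them against the current API reference.

```shell
# Hypothetical helper: build the job config JSON from flag values.
# usage: build_job_config <language> <operating_point> [diarization]
# Values are inserted verbatim, so they must not contain quotes.
build_job_config() {
  local lang="$1" op="$2" diar="${3:-}" extra=""
  if [ -n "$diar" ]; then
    extra=",\"diarization\":\"$diar\""
  fi
  printf '{"type":"transcription","transcription_config":{"language":"%s","operating_point":"%s"%s}}\n' \
    "$lang" "$op" "$extra"
}

# Hypothetical helper: map the --format value to the API's format name.
transcript_format() {
  case "$1" in
    txt|srt) echo "$1" ;;
    json) echo "json-v2" ;;
    *) echo "unknown format: $1" >&2; return 1 ;;
  esac
}

# Possible usage (requires curl and a valid key; not executed here):
# curl -sS -X POST "$SPEECHMATICS_BASE_URL/jobs" \
#   -H "Authorization: Bearer $SPEECHMATICS_API_KEY" \
#   -F data_file=@/path/to/call.mp3 \
#   -F config="$(build_job_config en enhanced speaker)"
```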
API key
The script reads the API key from (in order):
1. --api-key <key> flag
2. SPEECHMATICS_API_KEY environment variable (set by openclaw from the entry below)
3. skills.entries.speechmatics.apiKey in $OPENCLAW_CONFIG_PATH (default ~/.openclaw/openclaw.json)
Configure via openclaw.json:
{
  skills: {
    entries: {
      speechmatics: {
        apiKey: "SPEECHMATICS_KEY_HERE",
      },
    },
  },
}
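The lookup order for the API key can be sketched as a small shell function. This is an illustration under stated assumptions, not the script's actual code: resolve_api_key is a hypothetical name, and the sed expression is a deliberately naive extraction that only handles a config laid out one key per line as in the example above. A real implementation would want a proper parser for the config format.

```shell
# Hypothetical helper: resolve the API key by precedence.
# usage: resolve_api_key [value_of_--api-key_flag]
resolve_api_key() {
  local flag_value="${1:-}" config_path key
  # 1. An explicit flag value wins.
  if [ -n "$flag_value" ]; then
    printf '%s\n' "$flag_value"
    return 0
  fi
  # 2. Then the environment variable.
  if [ -n "${SPEECHMATICS_API_KEY:-}" ]; then
    printf '%s\n' "$SPEECHMATICS_API_KEY"
    return 0
  fi
  # 3. Finally the config file. Naive: reads the first apiKey value found
  #    between a line mentioning "speechmatics" and the next closing brace.
  config_path="${OPENCLAW_CONFIG_PATH:-$HOME/.openclaw/openclaw.json}"
  if [ -r "$config_path" ]; then
    key=$(sed -n '/speechmatics/,/}/ s/.*apiKey"\{0,1\}[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
      "$config_path" | head -n 1)
    if [ -n "$key" ]; then
      printf '%s\n' "$key"
      return 0
    fi
  fi
  echo "no Speechmatics API key found" >&2
  return 1
}
```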
Override the API base (e.g. EU region or a proxy) with --base-url or SPEECHMATICS_BASE_URL. Default: https://asr.api.speechmatics.com/v2.
Notes
- Supports most common audio formats (wav, mp3, m4a, ogg, flac, mp4, etc.) — Speechmatics transcodes server-side.
- File size limit: 2 GB per job.
- Batch jobs on enhanced typically finish in about one tenth of the audio duration (wall-clock), so an hour of audio takes roughly 6 minutes; standard is faster.
- Always confirm any destructive follow-up (e.g. replying based on a transcript) before acting.
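The processing-time note above can help when choosing a --timeout value for long recordings. The sketch below turns the rough 1:10 ratio into arithmetic. Both function names are hypothetical, and the 2x headroom factor is an assumption chosen for illustration, not a documented recommendation; the 600s floor mirrors the default timeout.

```shell
# Hypothetical helper: estimated processing time at a 1:10 ratio, rounded up.
# usage: estimate_wallclock_s <audio_seconds>
estimate_wallclock_s() {
  echo $(( ($1 + 9) / 10 ))
}

# Hypothetical helper: suggest a timeout with 2x headroom (an assumption),
# never going below the default of 600 seconds.
# usage: suggest_timeout_s <audio_seconds>
suggest_timeout_s() {
  local est padded
  est=$(estimate_wallclock_s "$1")
  padded=$(( est * 2 ))
  if [ "$padded" -lt 600 ]; then
    echo 600
  else
    echo "$padded"
  fi
}

# Example: a 2.5 hour recording (9000 seconds of audio)
# suggest_timeout_s 9000   # prints 1800
```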