Install
openclaw skills install jimmy-claw-mlx-whisper
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T...
Enables automatic transcription of voice notes in OpenClaw using Apple's MLX framework. No API key required. Works fully offline. ~60× faster than standard Whisper on M1/M2/M3/M4.
Voice notes (.ogg / WhatsApp .opus) are passed to mlx-whisper-transcribe.sh via {{MediaPath}}.
Install the package:
pip3 install mlx-whisper
Verify:
python3 -c "import mlx_whisper; print('OK')"
Find the Python bin path:
python3 -m site --user-base
# e.g. /Users/<you>/Library/Python/3.9
Copy bin/mlx-whisper-transcribe.sh from this skill to <user-base>/bin/mlx-whisper-transcribe.sh, then make it executable:
PYBIN=$(python3 -m site --user-base)/bin
cp {baseDir}/bin/mlx-whisper-transcribe.sh "$PYBIN/mlx-whisper-transcribe.sh"
chmod +x "$PYBIN/mlx-whisper-transcribe.sh"
Test it:
"$PYBIN/mlx-whisper-transcribe.sh" /path/to/audio.ogg
# First run downloads the model (~465MB). Subsequent runs reuse the cached copy and skip the download.
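The checks in the steps above can be bundled into one preflight helper. This is a minimal sketch, not part of the skill: the function name `preflight` and the keys of its result are hypothetical. It relies on `site.getuserbase()` returning the same directory that `python3 -m site --user-base` prints.

```python
import importlib.util
import os
import site


def preflight(user_base=None, script_name="mlx-whisper-transcribe.sh"):
    """Report whether mlx_whisper is importable and the wrapper script is in place.

    `preflight` and its result keys are illustrative names, not an OpenClaw API.
    """
    # site.getuserbase() matches the output of `python3 -m site --user-base`.
    base = user_base or site.getuserbase()
    script = os.path.join(base, "bin", script_name)
    return {
        # find_spec returns None when the package is not installed.
        "mlx_whisper_importable": importlib.util.find_spec("mlx_whisper") is not None,
        "script_path": script,
        "script_exists": os.path.isfile(script),
        # False until `chmod +x` has been applied to the script.
        "script_executable": os.access(script, os.X_OK),
    }
```

Calling `preflight()` with no arguments inspects the current user's Python user base. Any `False` value points at the step that still needs to be run.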
Add to ~/.openclaw/openclaw.json under tools.media.audio:
{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "type": "cli",
            "command": "<user-base>/bin/mlx-whisper-transcribe.sh",
            "args": ["{{MediaPath}}"],
            "timeoutSeconds": 60
          }
        ]
      }
    }
  }
}
Replace <user-base> with the output of python3 -m site --user-base.
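To avoid substituting `<user-base>` by hand, the same block can be generated with the path already filled in. The sketch below assumes that `openclaw.json` is plain JSON and that replacing any existing `models` list is acceptable. The helper names `audio_model_entry` and `merge_audio_config` are hypothetical. It only prints the result and does not touch the config file.

```python
import json
import site


def audio_model_entry(user_base, timeout_seconds=60):
    """Build the CLI model entry shown above with <user-base> resolved."""
    return {
        "type": "cli",
        "command": f"{user_base}/bin/mlx-whisper-transcribe.sh",
        "args": ["{{MediaPath}}"],  # placeholder is left for OpenClaw to fill in
        "timeoutSeconds": timeout_seconds,
    }


def merge_audio_config(config, entry):
    """Return a copy of config with tools.media.audio set; other keys are kept."""
    merged = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    audio = merged.setdefault("tools", {}).setdefault("media", {}).setdefault("audio", {})
    audio["enabled"] = True
    audio["models"] = [entry]  # assumption: overwrite any previous models list
    return merged


# Print the block for the current user; paste or merge it into openclaw.json manually.
print(json.dumps(merge_audio_config({}, audio_model_entry(site.getuserbase())), indent=2))
```

Passing an already loaded config as the first argument to `merge_audio_config` keeps unrelated settings intact.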
Restart the gateway to apply the change:
openclaw gateway restart
Or restart the OpenClaw app from the menu bar.
The wrapper uses whisper-small-mlx by default (465MB, good balance of speed and accuracy).
To change, edit bin/mlx-whisper-transcribe.sh and update path_or_hf_repo:
| Model | Size | Use case |
|---|---|---|
| mlx-community/whisper-tiny-mlx | 75MB | Fastest, basic accuracy |
| mlx-community/whisper-small-mlx | 465MB | Recommended |
| mlx-community/whisper-medium-mlx | 1.5GB | Higher accuracy |
| mlx-community/whisper-large-v3-mlx | 3GB | Best accuracy |
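The wrapper script itself is not reproduced here, so the following is only a sketch of how a model choice might reach mlx-whisper from Python. It assumes the `mlx_whisper.transcribe` function with its `path_or_hf_repo` keyword, and that the result is a dict with a `"text"` entry. The short names in `MODELS` and both helper functions are hypothetical.

```python
# Repo ids copied from the table above; the short keys are illustrative only.
MODELS = {
    "tiny": "mlx-community/whisper-tiny-mlx",
    "small": "mlx-community/whisper-small-mlx",
    "medium": "mlx-community/whisper-medium-mlx",
    "large-v3": "mlx-community/whisper-large-v3-mlx",
}


def resolve_model(name="small"):
    """Map a short name to a Hugging Face repo id, defaulting to the small model."""
    try:
        return MODELS[name]
    except KeyError:
        raise ValueError(f"unknown model {name!r}; choose from {sorted(MODELS)}") from None


def transcribe_file(audio_path, model="small"):
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module still loads where mlx-whisper is not installed.
    import mlx_whisper

    result = mlx_whisper.transcribe(audio_path, path_or_hf_repo=resolve_model(model))
    return result["text"].strip()
```

For example, `transcribe_file("/path/to/audio.ogg", model="medium")` would download the medium model on first use and then transcribe with it.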
Pass a language code as the second argument to skip auto-detection (faster):
mlx-whisper-transcribe.sh audio.ogg zh # Chinese
mlx-whisper-transcribe.sh audio.ogg en # English
In openclaw.json, add the language to args:
"args": ["{{MediaPath}}", "zh"]
| Audio length | Transcription time |
|---|---|
| 10 sec | ~1 sec |
| 1 min | ~7 sec |
| 30 min | ~3.5 min |
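The timings above work out to roughly 7 seconds of processing per minute of audio, which means a 30 minute file needs far longer than a 60 second timeout. A rough way to pick `timeoutSeconds` is sketched below. The function name, the 2× safety margin, and the 60 second floor are assumptions, and real speed will vary with hardware and model size.

```python
import math


def suggest_timeout(audio_seconds, sec_per_audio_minute=7, margin=2.0, floor=60):
    """Suggest a timeoutSeconds value from audio length.

    sec_per_audio_minute comes from the table above (~7 sec per minute of audio);
    margin and floor are illustrative safety choices.
    """
    estimate = audio_seconds * sec_per_audio_minute / 60.0
    return max(floor, math.ceil(estimate * margin))


print(suggest_timeout(30 * 60))  # 30-minute recording
```

With these defaults a 30 minute recording yields 420, while clips of a few minutes or less stay at the 60 second floor.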
mlx_whisper not found: Run pip3 install mlx-whisper again.
Increase timeoutSeconds for long audio files.
To skip language auto-detection, add "zh" or the target language code to args.
Downloaded models are cached under ~/.cache/huggingface.