{"skill":{"slug":"whisper-mlx-local","displayName":"Local Whisper","summary":"Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.","description":"---\nname: whisper-mlx-local\ndescription: \"Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.\"\nmetadata:\n  openclaw:\n    emoji: \"🎤\"\n    version: \"1.5.0\"\n    author: \"Community\"\n    repo: \"https://github.com/ImpKind/local-whisper\"\n    requires:\n      os: [\"darwin\"]\n      arch: [\"arm64\"]\n      bins: [\"python3\"]\n    install:\n      - id: \"deps\"\n        kind: \"manual\"\n        label: \"Install dependencies\"\n        instructions: \"pip3 install -r requirements.txt\"\n---\n\n# Local Whisper\n\n**Transcribe voice messages for free on Telegram and WhatsApp.** No API keys. No costs. Runs on your Mac.\n\n## The Problem\n\nVoice transcription APIs cost money:\n- OpenAI Whisper: **$0.006/minute**\n- Groq: **$0.001/minute**\n- AssemblyAI: **$0.01/minute**\n\nIf you transcribe a lot of Telegram voice messages, it adds up.\n\n## The Solution\n\nThis skill runs Whisper **locally on your Mac**. Same quality, **zero cost**.\n\n- ✅ Free forever\n- ✅ Private (audio never leaves your Mac)\n- ✅ Fast (~1 second per message)\n- ✅ Works offline\n\n## ⚠️ Important Notes\n\n- **First run downloads a ~1.5GB model**: be patient, this only happens once\n- **First transcription is slow**: the model loads into memory (~10-30 seconds); later transcriptions are fast\n- **Already using the OpenAI API for transcription?** Replace your existing `tools.media.audio` config with the one below\n\n## Quick Start\n\n### 1. Install dependencies\n```bash\npip3 install -r requirements.txt\n```\n\n### 2. Start the daemon\n```bash\npython3 scripts/daemon.py\n```\nThe first run downloads the Whisper model (~1.5GB). Wait for the \"Ready\" message.\n\n### 3. Add to OpenClaw config\n\nAdd this to your `~/.openclaw/openclaw.json`:\n\n```json\n{\n  \"tools\": {\n    \"media\": {\n      \"audio\": {\n        \"enabled\": true,\n        \"models\": [\n          {\n            \"type\": \"cli\",\n            \"command\": \"~/.openclaw/workspace/skills/local-whisper/scripts/transcribe.sh\",\n            \"args\": [\"{{MediaPath}}\"],\n            \"timeoutSeconds\": 60\n          }\n        ]\n      }\n    }\n  }\n}\n```\n\n### 4. Restart gateway\n```bash\nopenclaw gateway restart\n```\n\nNow voice messages from Telegram, WhatsApp, etc. will be transcribed locally for free!\n\n### Manual test\n```bash\n./scripts/transcribe.sh voice_message.ogg\n```\n\n## Use Case: Telegram Voice Messages\n\nInstead of paying for the OpenAI API to transcribe incoming voice messages, point OpenClaw to this local daemon. Free transcription forever.\n\n## Auto-Start on Login\n\n```bash\ncp com.local-whisper.plist ~/Library/LaunchAgents/\nlaunchctl load ~/Library/LaunchAgents/com.local-whisper.plist\n```\n\n## API\n\nThe daemon listens on `localhost:8787`:\n\n```bash\ncurl -X POST http://localhost:8787/transcribe -F \"file=@audio.ogg\"\n# {\"text\": \"Hello world\", \"language\": \"en\"}\n```\n\n## Translation\n\nAny language → English:\n\n```bash\n./scripts/transcribe.sh spanish_audio.ogg --translate\n```\n\n## Requirements\n\n- macOS with Apple Silicon (M1/M2/M3/M4)\n- Python 3.9+\n\n## License\n\nMIT\n","tags":{"latest":"1.5.0"},"stats":{"comments":0,"downloads":3581,"installsAllTime":2,"installsCurrent":2,"stars":9,"versions":5},"createdAt":1769897944411,"updatedAt":1779076568681},"latestVersion":{"version":"1.5.0","createdAt":1769908058400,"changelog":"Added OpenClaw configuration instructions","license":null},"metadata":{"setup":[],"os":null,"systems":null},"owner":{"handle":"impkind","userId":"s17a6jgq93ckexmb6qdvze637s84538n","displayName":"ImpKind","image":"https://avatars.githubusercontent.com/u/54905270?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1779922032346}}