Install
openclaw skills install telegram-voice-to-voice-macos
Telegram voice-to-voice for macOS Apple Silicon: transcribe inbound .ogg voice notes with yap (Speech.framework) and reply with Telegram voice notes via say+ffmpeg. Not compatible with Linux/Windows.
This is an OpenClaw skill.
yap CLI available in PATH (Speech.framework transcription).
ffmpeg available in PATH.
This skill is macOS-only (uses say + Speech.framework). The skill registry cannot enforce OS restrictions, so installing or running it on Linux/Windows will result in runtime failures.
Store a small per-user preference file in the workspace:
voice_state/telegram.json
"voice" (default): reply with a Telegram voice note
"text": reply with a single text message
If the file does not exist or the sender id is missing: assume "voice".
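The lookup rule above can be sketched as follows. The source does not specify the JSON layout, so this assumes a flat object mapping the sender id (as a string) to "voice" or "text". The function name get_reply_mode is illustrative, and treating an unreadable file like a missing one is also an assumption.

```python
import json
from pathlib import Path

DEFAULT_MODE = "voice"
VALID_MODES = {"voice", "text"}


def get_reply_mode(state_path, sender_id):
    """Return "voice" or "text" for a sender, defaulting to "voice"."""
    try:
        state = json.loads(Path(state_path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # Missing file (or, by assumption, invalid JSON): use the default.
        return DEFAULT_MODE
    if not isinstance(state, dict):
        return DEFAULT_MODE
    # Assumed layout: {"<sender id>": "voice" | "text"}
    mode = state.get(str(sender_id))
    if isinstance(mode, str) and mode in VALID_MODES:
        return mode
    return DEFAULT_MODE
```

For example, get_reply_mode("voice_state/telegram.json", 12345) yields "voice" until that sender has been switched to "text".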
If an inbound text message is exactly:
/audio off → set state to "text" and confirm with a short text reply.
/audio on → set state to "voice" and confirm with a short text reply.
Telegram voice notes often show up as <media:audio> in message text.
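One way to tell the two toggle commands apart from an inbound voice note is a small classifier plus a state writer. This is a sketch, not the skill's actual code: classify_inbound and set_reply_mode are hypothetical names, and the JSON layout (sender id mapped to a mode string) is an assumption. The commands are matched exactly, without trimming whitespace.

```python
import json
from pathlib import Path


def classify_inbound(text):
    """Return "set_text", "set_voice", "voice_note", or "other"."""
    if text == "/audio off":
        return "set_text"
    if text == "/audio on":
        return "set_voice"
    if "<media:audio>" in text:
        return "voice_note"
    return "other"


def set_reply_mode(state_path, sender_id, mode):
    """Persist the mode for one sender, creating the file if needed."""
    path = Path(state_path)
    try:
        state = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        state = {}
    if not isinstance(state, dict):
        state = {}
    # Assumed layout: {"<sender id>": "voice" | "text"}
    state[str(sender_id)] = mode
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
```

A handler would call set_reply_mode(path, sender_id, "text") on "set_text" and then send the short confirmation as a text reply.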
OpenClaw saves the attachment to disk (typically .ogg) under:
~/.openclaw/media/inbound/
Recommended approach:
*.ogg from ~/.openclaw/media/inbound/.
Default locale: macOS system locale.
Optional env:
YAP_LOCALE: override the transcription locale (e.g. it-IT, en-US).
Preferred:
yap transcribe --locale "${YAP_LOCALE:-<system>}" <path.ogg>
If YAP_LOCALE is not set, the helper script will use the macOS system locale (from defaults read -g AppleLocale).
If transcription fails or is empty: ask the user to repeat or send text.
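The locale fallback and the failure rule can be sketched in a few functions. Several details here are assumptions rather than documented behavior: that AppleLocale values such as it_IT should be rewritten as it-IT, that the most recently modified .ogg is the one to transcribe, and that yap prints the transcript to stdout. All function names are illustrative.

```python
import os
import subprocess
from pathlib import Path


def normalize_apple_locale(raw):
    """Turn an AppleLocale value such as "it_IT" into "it-IT"."""
    # Assumption: drop any "@..." suffix and use a hyphen separator.
    return raw.strip().split("@", 1)[0].replace("_", "-")


def resolve_locale(env=None):
    """Prefer YAP_LOCALE; otherwise ask macOS for the system locale."""
    env = os.environ if env is None else env
    override = env.get("YAP_LOCALE")
    if override:
        return override
    out = subprocess.run(
        ["defaults", "read", "-g", "AppleLocale"],
        capture_output=True, text=True, check=True,
    ).stdout
    return normalize_apple_locale(out)


def newest_ogg(inbound_dir="~/.openclaw/media/inbound"):
    """Return the most recently modified .ogg, or None if there is none."""
    candidates = list(Path(inbound_dir).expanduser().glob("*.ogg"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def transcribe(ogg_path, locale):
    """Run yap; return the transcript, or None if it fails or is empty."""
    try:
        result = subprocess.run(
            ["yap", "transcribe", "--locale", locale, str(ogg_path)],
            capture_output=True, text=True,
        )
    except FileNotFoundError:
        return None  # yap is not in PATH
    text = result.stdout.strip()
    return text if result.returncode == 0 and text else None
```

A None result from transcribe is the cue to ask the user to repeat the message or send text instead.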
Helper script:
scripts/transcribe_telegram_ogg.sh [path.ogg]
Voice default: SYSTEM (uses the current macOS system voice). You can override by passing a specific voice name to the helper script.
scripts/tts_telegram_voice.sh "<reply text>" [SYSTEM|VoiceName]
The script prints the generated .ogg path to stdout.
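Calling the TTS helper from code and capturing the path it prints might look like this. The wrapper name synthesize_voice_note is hypothetical, and taking the last non-empty stdout line as the path is an assumption in case the script prints anything else first. The run parameter exists only so the call can be exercised without macOS.

```python
import subprocess


def synthesize_voice_note(reply_text, voice="SYSTEM",
                          script="scripts/tts_telegram_voice.sh",
                          run=subprocess.run):
    """Run the TTS helper and return the .ogg path it prints."""
    result = run([script, reply_text, voice],
                 capture_output=True, text=True, check=True)
    lines = [ln.strip() for ln in result.stdout.splitlines() if ln.strip()]
    if not lines:
        raise RuntimeError("TTS helper printed no output path")
    # Assumption: the path is the last non-empty line of stdout.
    return lines[-1]
```

Passing the arguments as a list avoids shell quoting problems when the reply text contains quotes or other special characters.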
Send the .ogg back to Telegram as a voice note (not a generic audio file):
Use the message tool with asVoice: true and media: <path.ogg>
Use replyTo to thread the response
Notes:
Use SYSTEM to rely on the current macOS system voice (recommended).
Reply with a single text message:
Transcription: <...>
Reply: <...>
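The two reply modes can be brought together in one function that builds the arguments for the message tool. The asVoice, media, and replyTo fields come from the description above; the "message" field for the text body is a hypothetical name, and build_reply is an illustrative helper rather than part of the skill.

```python
def format_text_reply(transcription, reply):
    """Format the single text message used in "text" mode."""
    return f"Transcription: {transcription}\nReply: {reply}"


def build_reply(mode, transcription, reply, ogg_path=None, reply_to=None):
    """Build message-tool arguments for "voice" or "text" mode."""
    if mode == "voice":
        if ogg_path is None:
            raise ValueError("voice mode needs the generated .ogg path")
        # asVoice makes Telegram show a voice note, not a generic audio file.
        args = {"asVoice": True, "media": str(ogg_path)}
    else:
        # "message" is a hypothetical field name for the text body.
        args = {"message": format_text_reply(transcription, reply)}
    if reply_to is not None:
        args["replyTo"] = reply_to  # threads the response
    return args
```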