Telegram Voice To Voice Macos

v0.1.3

Telegram voice-to-voice for macOS Apple Silicon: transcribe inbound .ogg voice notes with yap (Speech.framework) and reply with Telegram voice notes via say+ffmpeg. Not compatible with Linux/Windows.
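
The reply half of this pipeline (text to speech with say, then re-encoding with ffmpeg) can be sketched as two command lines. This is a minimal illustration, not the skill's bundled script: the function name, file names, bitrate, and the choice of the libopus encoder are assumptions, and it presumes an ffmpeg build with libopus support. Telegram voice notes are OGG files with Opus audio, which is why the second step targets that codec.

```python
import subprocess
from pathlib import Path


def build_reply_commands(text: str, out_dir: Path, stem: str = "reply") -> list[list[str]]:
    """Return argv lists for the two steps: say -> AIFF, then ffmpeg -> OGG/Opus.

    `stem` and the 32k mono settings are illustrative choices, not values
    taken from the skill itself.
    """
    aiff = out_dir / f"{stem}.aiff"
    ogg = out_dir / f"{stem}.ogg"
    # The text is passed as a single argv element, so no shell quoting is needed.
    say_cmd = ["say", "-o", str(aiff), text]
    ffmpeg_cmd = [
        "ffmpeg", "-y",          # overwrite an existing output file
        "-i", str(aiff),
        "-c:a", "libopus",       # assumption: ffmpeg was built with libopus
        "-b:a", "32k",
        "-ac", "1",              # mono is enough for speech
        str(ogg),
    ]
    return [say_cmd, ffmpeg_cmd]


def run_reply_commands(commands: list[list[str]]) -> None:
    """Run each step in order and stop at the first failure (macOS only)."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

On a Mac, `run_reply_commands(build_reply_commands("Hello", Path.home() / ".openclaw/workspace/voice_out"))` would produce reply.ogg in the documented output directory, provided that directory already exists.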

by Fiberian (@fiberian1981)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description align with what the skill asks for: yap for Speech.framework transcription, say + ffmpeg for TTS and encoding, and the defaults command for reading the macOS locale. The included helper scripts implement transcription and TTS and operate on the documented ~/.openclaw media/workspace paths.
Instruction Scope
SKILL.md instructs the agent to read inbound .ogg files from ~/.openclaw/media/inbound and to write reply files under workspace paths. The helper scripts handle transcription and TTS but do not implement the per-user 'voice_state/telegram.json' preference logic described in SKILL.md; the agent is expected to manage that state itself. The instructions do not request secrets or contact unknown external endpoints. Sending replies is delegated to the agent's message tool, as expected.
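
Since the state handling is left to the agent, one way it might be done is sketched below. The schema is hypothetical: the listing only names the file voice_state/telegram.json, so the mapping of user ID to "voice" or "text", the default of "text", and the function names are all assumptions made for illustration.

```python
import json
from pathlib import Path

DEFAULT_MODE = "text"            # assumption: text replies unless the user opted in
VALID_MODES = {"voice", "text"}  # assumption: the only two reply modes


def load_state(path: Path) -> dict:
    """Read the state file; a missing or malformed file yields an empty mapping."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def get_mode(path: Path, user_id: str) -> str:
    """Return the stored reply mode for a user, falling back to the default."""
    mode = load_state(path).get(user_id)
    return mode if mode in VALID_MODES else DEFAULT_MODE


def set_mode(path: Path, user_id: str, mode: str) -> None:
    """Persist a user's reply mode, creating the state directory if needed."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    state = load_state(path)
    state[user_id] = mode
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a sibling temp file, then rename, so readers never see a partial file.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)
```

An agent could call `set_mode(state_path, user_id, "voice")` when a user asks for spoken replies and check `get_mode(state_path, user_id)` before deciding whether to run TTS.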
Install Mechanism
No install spec (instruction-only plus two small shell scripts). Nothing downloads or executes remote code; risk from install-time actions is low.
Credentials
The skill requires no credentials or sensitive environment variables. It accesses files under the user's home (~/.openclaw/*) and the macOS system locale, which are proportionate to the described functionality.
Persistence & Privilege
The skill is not always-enabled and does not request elevated privileges, but it does read and write files in the user's home directory (~/.openclaw/workspace and the voice_state paths). Autonomous invocation is allowed by default, which is normal for skills. Combined with file I/O, this is expected for the workflow but worth noting.
Assessment
This skill appears to do exactly what it says: transcribe .ogg voice notes locally (yap) and produce Telegram voice notes via say+ffmpeg. Before installing, confirm that you are on macOS Apple Silicon and that you trust the local 'yap' and 'ffmpeg' binaries you will provide. The skill reads inbound .ogg files from ~/.openclaw/media/inbound and creates TTS output in ~/.openclaw/workspace/voice_out. Per SKILL.md, it also expects a per-user state file, voice_state/telegram.json, in the workspace. The provided scripts do not manage that file, so the agent must handle toggling between voice and text. The skill itself requests no network endpoints or credentials.

Like a lobster shell, security has layers — review code before you run it.

latest: vk9769s77fttw22c6dc0a346frd810a50

Runtime requirements

OS: macOS
Bins: yap, ffmpeg, say, defaults
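
These requirements can be checked before the skill is used. The following is a rough preflight sketch rather than part of the skill: the function name and message wording are invented here, and the Apple Silicon check relies on `platform.machine()` reporting "arm64", which holds for a native Python build but not for one running under Rosetta.

```python
import platform
import shutil

REQUIRED_BINS = ("yap", "ffmpeg", "say", "defaults")


def missing_requirements(system=None, machine=None, which=shutil.which) -> list[str]:
    """Return human-readable problems; an empty list means the host looks usable.

    `system`, `machine`, and `which` can be overridden for testing; by default
    they are taken from the running interpreter and the current PATH.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    problems = []
    if system != "Darwin":
        problems.append(f"unsupported OS: {system} (macOS required)")
    elif machine != "arm64":
        # A Python running under Rosetta reports x86_64 even on Apple Silicon.
        problems.append(f"unsupported CPU: {machine} (Apple Silicon required)")
    for name in REQUIRED_BINS:
        if which(name) is None:
            problems.append(f"missing binary: {name}")
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

Running the file directly prints one line per problem and prints nothing when the OS, CPU, and all four binaries check out.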
