Voice messaging setup
Pass
Audited by ClawScan on May 1, 2026.
Overview
This appears to be a legitimate voice-message setup, but it installs speech software, runs a local transcription script, and uses Edge TTS for voice output.
This skill looks purpose-aligned for OpenClaw voice messages. Before installing, review the bash setup, the unpinned faster-whisper dependency, the persistent files under ~/.openclaw/workspace, and whether Edge TTS is acceptable for the kinds of conversations you plan to use it for.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Installing the skill may download and run third-party Python packages needed for transcription.
The skill installs faster-whisper from the package ecosystem without pinning a version. This is expected for the STT setup, but users inherit the supply-chain and version-change risk of that package and its dependencies.
~/.openclaw/workspace/voice-messages/bin/pip install faster-whisper
Review the package source if needed, and consider pinning known-good package versions before installing in sensitive environments.
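One way to act on the pinning advice is to verify, after installation, that the environment contains exactly the versions you reviewed. The sketch below is not part of the skill. The `check_pins` helper is a hypothetical name, and the pinned version string is a placeholder you would replace with a release you have vetted yourself. It uses only the standard library's `importlib.metadata`.

```python
from __future__ import annotations

from importlib import metadata


def check_pins(pins: dict[str, str]) -> list[str]:
    """Compare installed package versions against expected pins.

    Returns a list of human-readable problems; an empty list means
    every pinned package is installed at the expected version.
    """
    problems = []
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{name}: installed {installed}, expected {wanted}")
    return problems


# Placeholder pin: "X.Y.Z" is NOT a real recommendation. Substitute the
# version you reviewed before relying on this check.
EXPECTED = {"faster-whisper": "X.Y.Z"}

for problem in check_pins(EXPECTED):
    print(problem)
```

Run it with the interpreter from the skill's environment (~/.openclaw/workspace/voice-messages/bin/python) so that it inspects the packages pip installed there rather than the system ones.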
OpenClaw will execute a local Python helper when processing voice messages.
The configuration makes OpenClaw run a local Python transcription script against media files. The behavior is central to the skill and the script is shown, with file-size and timeout limits also documented.
"type": "cli", "command": "~/.openclaw/workspace/voice-messages/bin/python", "args": ["~/.openclaw/workspace/voice-messages/transcribe.py", "--audio", "{{MediaPath}}"
Install only if you are comfortable enabling local CLI-based audio transcription, and keep the helper script in the documented workspace path.
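For readers who want a sense of what a helper of this kind typically does, here is a minimal sketch. It is not the skill's actual transcribe.py, which this report does not reproduce. The size limit, the `"base"` model, and the function names are all assumptions chosen for illustration. Only the `--audio` argument comes from the configuration above, and the `WhisperModel` calls follow faster-whisper's public interface.

```python
import argparse
import os
import sys

# Hypothetical limit for illustration; the skill documents its own value.
MAX_BYTES = 20 * 1024 * 1024


def validate_audio(path: str, max_bytes: int = MAX_BYTES) -> str:
    """Return the resolved path if it is a regular file within the size limit."""
    resolved = os.path.realpath(os.path.expanduser(path))
    if not os.path.isfile(resolved):
        raise ValueError(f"not a regular file: {resolved}")
    size = os.path.getsize(resolved)
    if size > max_bytes:
        raise ValueError(f"file too large: {size} bytes (limit {max_bytes})")
    return resolved


def transcribe(path: str) -> str:
    """Transcribe one audio file locally and return the joined text."""
    # Imported lazily so validation works even without faster-whisper installed.
    from faster_whisper import WhisperModel

    # Model size and compute type are assumptions, not the skill's settings.
    model = WhisperModel("base", device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path)
    return " ".join(segment.text.strip() for segment in segments)


def main(argv=None) -> int:
    """Parse --audio, validate the file, and print the transcript."""
    parser = argparse.ArgumentParser(description="Local transcription sketch")
    parser.add_argument("--audio", required=True, help="path to the media file")
    args = parser.parse_args(argv)
    try:
        audio = validate_audio(args.audio)
    except ValueError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 2
    print(transcribe(audio))
    return 0
```

Calling `main()` from a script entry point would give the same `--audio <path>` interface the configuration invokes. Validating the path before loading the model keeps oversized or missing inputs from triggering an expensive model load.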
Voice replies may be processed through the configured Edge TTS provider rather than being entirely local.
The skill configures Edge as the TTS provider for inbound voice-message replies. This is disclosed and purpose-aligned, but provider-based TTS can create an external data boundary for generated reply text.
"messages": { "tts": { "auto": "inbound", "provider": "edge", "edge": { "voice": "en-US-JennyNeural"
Avoid enabling provider-based TTS for conversations where generated reply text should not leave your trusted environment.
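A quick way to check whether a given configuration routes replies through a TTS provider is to inspect the `messages.tts.provider` key. The sketch below assumes the configuration is JSON shaped like the fragment above. The fragment in this report is truncated, so the sample closes its braces for illustration only. The helper names and the idea of a `local_providers` allow-list are hypothetical, not part of OpenClaw.

```python
import json


def tts_provider(config: dict):
    """Return the configured TTS provider name, or None if none is set."""
    return config.get("messages", {}).get("tts", {}).get("provider")


def uses_external_tts(config: dict, local_providers=frozenset()) -> bool:
    """True if a TTS provider is configured and is not on the local allow-list.

    With the default empty allow-list, any configured provider is treated
    as a potential external data boundary.
    """
    provider = tts_provider(config)
    return provider is not None and provider not in local_providers


# Sample mirroring the fragment in this finding, with braces closed
# for illustration only.
sample = json.loads(
    '{"messages": {"tts": {"auto": "inbound", "provider": "edge",'
    ' "edge": {"voice": "en-US-JennyNeural"}}}}'
)

if uses_external_tts(sample):
    print(f"Reply text may be sent to TTS provider: {tts_provider(sample)}")
```

Treating every provider as external by default is the conservative choice: a provider has to be added to the allow-list explicitly before the check stops flagging it.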
