Voice messaging setup

Pass

Audited by ClawScan on May 1, 2026.

Overview

This appears to be a legitimate voice-message setup, but it installs speech software, runs a local transcription script, and uses Edge TTS for voice output.

This skill looks purpose-aligned for OpenClaw voice messages. Before installing, review the bash setup, the unpinned faster-whisper dependency, the persistent files under ~/.openclaw/workspace, and whether Edge TTS is acceptable for the kinds of conversations you plan to use it for.

Findings (3)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Note

ASI04: Agentic Supply Chain Vulnerabilities

What this means

Installing the skill may download and run third-party Python packages needed for transcription.

Why it was flagged

The skill installs faster-whisper with pip without pinning a version. This is expected for the STT setup, but users inherit the supply-chain and version-change risk of that package and its dependencies.

Skill content

~/.openclaw/workspace/voice-messages/bin/pip install faster-whisper

Recommendation

Review the package source if needed, and consider pinning known-good package versions before installing in sensitive environments.
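
One way to act on this recommendation is to record the versions you have reviewed and check the environment against them after installation. The following is a minimal sketch, not part of the skill: the function name `find_version_drift` and the `PINS` mapping are hypothetical, and "X.Y.Z" is a placeholder for a version you have vetted yourself. To inspect the skill's environment it would need to be run with the interpreter at ~/.openclaw/workspace/voice-messages/bin/python.

```python
from importlib.metadata import PackageNotFoundError, version


def find_version_drift(pins, lookup=version):
    """Return {package: (expected, installed)} for every pin that does not match.

    `installed` is None when the package is not installed at all.
    `lookup` defaults to importlib.metadata.version and can be swapped for testing.
    """
    drift = {}
    for name, expected in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            drift[name] = (expected, installed)
    return drift


if __name__ == "__main__":
    # Hypothetical pin: replace "X.Y.Z" with a version you have reviewed.
    PINS = {"faster-whisper": "X.Y.Z"}
    for name, (expected, installed) in find_version_drift(PINS).items():
        print(f"{name}: expected {expected}, found {installed}")
```

This only checks the top-level packages you list. Transitive dependencies would need their own entries, or a fully pinned requirements file.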

Note

ASI05: Unexpected Code Execution

What this means

OpenClaw will execute a local Python helper when processing voice messages.

Why it was flagged

The configuration makes OpenClaw run a local Python transcription script against media files. This behavior is central to the skill; the script is shown, and file-size and timeout limits are also documented.

Skill content

"type": "cli", "command": "~/.openclaw/workspace/voice-messages/bin/python", "args": ["~/.openclaw/workspace/voice-messages/transcribe.py", "--audio", "{{MediaPath}}"

Recommendation

Install only if you are comfortable enabling local CLI-based audio transcription, and keep the helper script in the documented workspace path.
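
To confirm that the helper script really lives in the documented workspace path, a small containment check can help. This is a sketch under assumptions: `is_inside_workspace` is a hypothetical helper, not something the skill or OpenClaw provides, and the workspace path is taken from the configuration quoted above. It resolves symlinks, so a transcribe.py that links to a file elsewhere is reported as outside.

```python
from pathlib import Path

# Workspace path as quoted in the skill configuration.
WORKSPACE = Path("~/.openclaw/workspace/voice-messages")


def is_inside_workspace(candidate, workspace=WORKSPACE):
    """Return True if `candidate` resolves to a location inside `workspace`.

    Both paths are expanded (~) and resolved, so ".." segments and symlinks
    cannot make an outside file look like an inside one.
    """
    root = Path(workspace).expanduser().resolve()
    target = Path(candidate).expanduser().resolve()
    return target == root or root in target.parents


if __name__ == "__main__":
    script = "~/.openclaw/workspace/voice-messages/transcribe.py"
    print(script, "inside workspace:", is_inside_workspace(script))
```

The check is meant for the script only. The interpreter under bin/ in a virtual environment is commonly a symlink to a system Python, so it would resolve outside the workspace even in a normal install.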

Note

ASI07: Insecure Inter-Agent Communication

What this means

Voice replies may be processed through the configured Edge TTS provider rather than being entirely local.

Why it was flagged

The skill configures Edge as the TTS provider for inbound voice-message replies. This is disclosed and purpose-aligned, but provider-based TTS can create an external data boundary for generated reply text.

Skill content

"messages": { "tts": { "auto": "inbound", "provider": "edge", "edge": { "voice": "en-US-JennyNeural"

Recommendation

Avoid enabling provider-based TTS for conversations where generated reply text should not leave your trusted environment.
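
A configuration check along these lines could flag a provider-based TTS setting before sensitive conversations take place. It is a sketch, not an OpenClaw feature: `external_tts_provider` and `LOCAL_PROVIDERS` are hypothetical names, and the example dictionary contains only the keys quoted in the finding above, since the rest of the skill's configuration is not shown here. Any provider not explicitly listed as local is treated as potentially external.

```python
# Assumption: fill this with provider names you have verified run entirely on-host.
LOCAL_PROVIDERS = frozenset()


def external_tts_provider(config, local_providers=LOCAL_PROVIDERS):
    """Return the configured TTS provider name if it is not listed as local.

    Returns None when no provider is configured or the provider is listed as local.
    """
    tts = config.get("messages", {}).get("tts", {})
    provider = tts.get("provider")
    if provider is None or provider in local_providers:
        return None
    return provider


if __name__ == "__main__":
    # Only the keys quoted in the finding; not the skill's full configuration.
    config = {
        "messages": {
            "tts": {
                "auto": "inbound",
                "provider": "edge",
                "edge": {"voice": "en-US-JennyNeural"},
            }
        }
    }
    provider = external_tts_provider(config)
    if provider is not None:
        print(f"TTS provider '{provider}' is not listed as local; reply text may leave this host.")
```

Defaulting to an empty allow-list is deliberate: an unknown provider is reported rather than silently trusted.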