whatsappVoiceOpenSkill
Warn
Audited by ClawScan on May 10, 2026.
Overview
The skill is mostly aligned with WhatsApp voice processing, but it uses an unsafe shell command for transcription and should be reviewed before running.
Before installing, patch the shell execution to use safe argument passing, run the daemon only when you want automatic WhatsApp voice processing, keep transcript logs private, pin dependencies, and add confirmations or sender restrictions before using custom handlers for real-world actions.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A malicious or malformed audio path could run commands with the user's local permissions in integrations that call transcribeVoiceNote on user-controlled paths.
The exported transcription function builds a shell command by interpolating audioFilePath. If an integration passes an untrusted path containing quotes or shell metacharacters, it could break out of the quoted argument.
execSync(`python "${transcribeScript}" "${audioFilePath}"`, { maxBuffer: 10 * 1024 * 1024, encoding: 'utf8' })
Replace execSync shell-string usage with execFile or spawn using an argument array, validate audio paths, and create temporary files in a controlled directory.
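A minimal sketch of the argument-array approach, assuming a Node.js CommonJS module. The names resolveAudioPath and transcribeVoiceNoteSafely, and the options object they take, are hypothetical and not the skill's actual API; only the execSync call quoted above comes from the reviewed source.

```javascript
// Sketch only: resolveAudioPath, transcribeVoiceNoteSafely, and the options
// object are hypothetical names, not the skill's actual API.
const { execFileSync } = require('node:child_process');
const path = require('node:path');

// Resolve the path and reject anything that lands outside the allowed directory.
function resolveAudioPath(audioFilePath, allowedDir) {
  const base = path.resolve(allowedDir);
  const resolved = path.resolve(base, audioFilePath);
  const relative = path.relative(base, resolved);
  const escapes =
    relative === '' ||
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error('Audio path is outside the allowed directory');
  }
  return resolved;
}

// Pass the script and audio path as separate argv entries. No shell is spawned,
// so quotes and metacharacters in the path are treated as literal characters.
function transcribeVoiceNoteSafely(audioFilePath, { interpreter = 'python', script, allowedDir }) {
  const safePath = resolveAudioPath(audioFilePath, allowedDir);
  return execFileSync(interpreter, [script, safePath], {
    maxBuffer: 10 * 1024 * 1024,
    encoding: 'utf8',
  });
}
```

Because execFileSync does not start a shell by default, the directory check is defense in depth rather than the primary fix: it limits which files can be handed to the transcription script at all.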
Private inbound voice messages in the watched directory will be processed automatically until the daemon is stopped.
When started by the user, the listener keeps polling for new WhatsApp voice files every five seconds and marks processed files in a log.
const interval = setInterval(checkForNewVoices, CONFIG.checkInterval);
Run the daemon only when intended, monitor its logs, and stop it when automatic processing is no longer needed.
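One way to make "stop it when no longer needed" explicit is to wrap the polling loop so it returns a stop function and also stops on termination signals. This is a sketch under assumptions: startVoicePolling is a hypothetical wrapper, and the 5000 ms default merely mirrors the five-second interval described above rather than the skill's actual CONFIG value.

```javascript
// Sketch only: startVoicePolling is a hypothetical wrapper. checkForNewVoices
// and the 5000 ms default stand in for the skill's own function and
// CONFIG.checkInterval.
function startVoicePolling(checkForNewVoices, intervalMs = 5000) {
  const interval = setInterval(checkForNewVoices, intervalMs);
  let stopped = false;

  // Calling stop() more than once is harmless.
  function stop() {
    if (stopped) return;
    stopped = true;
    clearInterval(interval);
    process.off('SIGINT', stop);
    process.off('SIGTERM', stop);
  }

  // Stop polling when the user interrupts or a supervisor terminates the process.
  process.once('SIGINT', stop);
  process.once('SIGTERM', stop);

  return stop;
}
```

Registering a SIGINT or SIGTERM listener replaces Node's default exit-on-signal behavior, so the process ends only once the interval is cleared and nothing else keeps the event loop alive.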
Voice transcripts may appear in logs or be consumed by whatever parent process is watching stdout.
The daemon emits processing results to stdout for a parent process; those results include the transcription and response data produced by processVoiceNote.
console.log(JSON.stringify({ type: 'voice-response', data: result }));
Ensure the parent process is trusted, restrict log access, and redact or disable transcript logging if messages may contain sensitive content.
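Redaction could look roughly like the following. The field names transcription and response are assumptions about the shape of the object returned by processVoiceNote, and redactVoiceResult and formatVoiceLogLine are hypothetical helpers, not part of the skill.

```javascript
// Sketch only: the field names `transcription` and `response` are assumptions
// about the result shape; both helper functions are hypothetical.
function redactVoiceResult(result, sensitiveFields = ['transcription', 'response']) {
  const redacted = { ...result };
  for (const field of sensitiveFields) {
    const value = redacted[field];
    if (typeof value === 'string') {
      // Keep only the length so log readers can still see that text was produced.
      redacted[field] = `[redacted ${value.length} chars]`;
    } else if (value !== undefined) {
      redacted[field] = '[redacted]';
    }
  }
  return redacted;
}

// Build the same envelope the daemon emits, with sensitive fields masked.
function formatVoiceLogLine(result) {
  return JSON.stringify({ type: 'voice-response', data: redactVoiceResult(result) });
}

// Usage: console.log(formatVoiceLogLine(result));
```

This only suits deployments where stdout is captured as a log. If the parent process needs the full transcript from stdout, redacting there would break it, and the unredacted payload would have to travel over a separate, access-controlled channel.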
Dependency behavior could change over time, and first-run model/package downloads add normal supply-chain exposure.
The Python dependencies are specified with lower-bound ranges rather than exact pinned versions, so future installs may resolve to different package versions.
openai-whisper>=20231117
soundfile>=0.12.1
numpy>=1.21.0
Install in a virtual environment, pin known-good package versions, and review dependency provenance before production use.
If users add powerful custom handlers, anyone who can submit accepted voice messages might trigger those actions unless extra checks are added.
The documentation promotes extending voice intents into handlers that may affect devices, databases, or other systems. The bundled handlers are limited, but custom handlers could become high-impact.
IoT voice control (drones, smart home, etc.)
Use sender allowlists, confirmations, and least-privilege API keys for any custom handler that controls devices, accounts, or data.
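A gate in front of custom handlers might be sketched as follows. Every name here (authorizeVoiceIntent, allowedSenders, highImpactIntents, and the example intent and sender values) is hypothetical. The skill's real handler interface is not shown in the reviewed artifacts, so treat this as an illustration of the check rather than a drop-in patch.

```javascript
// Sketch only: authorizeVoiceIntent and the policy fields are hypothetical
// names, not part of the skill.
function authorizeVoiceIntent({ sender, intent, confirmed = false }, policy) {
  // Reject senders that are not explicitly allowlisted.
  if (!policy.allowedSenders.has(sender)) {
    return { allowed: false, reason: 'sender-not-allowlisted' };
  }
  // High-impact intents additionally need an explicit confirmation step.
  if (policy.highImpactIntents.has(intent) && !confirmed) {
    return { allowed: false, reason: 'confirmation-required' };
  }
  return { allowed: true, reason: 'ok' };
}

// Example policy with placeholder values.
const examplePolicy = {
  allowedSenders: new Set(['+15550100']),
  highImpactIntents: new Set(['drone.takeoff']),
};
```

A custom handler would call authorizeVoiceIntent before acting and run only when allowed is true. How the confirmation itself is collected (for example, a follow-up message from the same sender) is left open here.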
