Voice Agent
Advisory
Audited by Static analysis on Apr 30, 2026.
Overview
No suspicious patterns detected.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A chosen audio file is handed to the local backend, and an output path can be created or overwritten by the synthesis command.
The client reads a provided audio file for transcription and writes synthesized audio to a provided output path. This is expected for the skill, but users should ensure only intended files and output locations are used.
with open(filename, 'rb') as f:
    data += f.read()
...
with open(output_file, 'wb') as f:
Use explicit, non-sensitive audio inputs and safe output paths; avoid pointing the output at existing important files.
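One way to follow this recommendation is to refuse to write when the output path already exists. The sketch below is illustrative only and is not taken from the packaged client; the function name `write_audio_safely` and its `audio_bytes` parameter are hypothetical. It relies on Python's exclusive-creation mode (`'xb'`), which raises `FileExistsError` instead of truncating an existing file the way `'wb'` does.

```python
from pathlib import Path


def write_audio_safely(output_file: str, audio_bytes: bytes) -> Path:
    """Write synthesized audio, refusing to overwrite an existing file.

    Hypothetical helper for illustration; not part of the audited skill.
    """
    path = Path(output_file)
    # Mode 'xb' creates the file exclusively and raises FileExistsError
    # if the path already exists, unlike 'wb', which silently truncates it.
    with open(path, "xb") as f:
        f.write(audio_bytes)
    return path
```

A caller that catches `FileExistsError` can then ask for a different output path rather than overwriting the existing file.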
The skill’s safety depends partly on the backend service running on localhost:8000, not just on the packaged client script.
The skill is client-only and relies on a separately managed backend and repository docs outside the included package. That dependency is disclosed, but the backend is part of the trust decision.
Requires a running backend API at `http://localhost:8000`. Backend setup instructions are in this repository:
- `README.md`
- `walkthrough.md`
- `DOCKER_README.md`
Install and run the backend only from a trusted source, review its setup instructions, and avoid running an unexpected service on localhost:8000.
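As a rough pre-flight check, a user could confirm whether anything is listening on the backend port before sending audio or text to it. The sketch below is an assumption-laden illustration, not part of the skill: the helper name `is_port_open` is hypothetical, and a successful connection only shows that some process accepts TCP connections on that port. It does not prove the process is the intended backend.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8000,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration. It cannot identify which
    service is listening, only that something is.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and resolution failures.
        return False
```

If this returns True when no backend was started deliberately, that is a signal to find out which process owns the port before using the skill.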
Text sent for speech generation may be handled beyond the local machine by AWS Polly through the backend.
The skill discloses that text-to-speech uses AWS Polly via the backend. That is purpose-aligned, but synthesis text may be processed by an external provider depending on backend configuration.
It uses **local Whisper** for Speech-to-Text transcription and **AWS Polly** for Text-to-Speech generation.
Do not synthesize highly sensitive text unless you are comfortable with the backend and AWS Polly handling it.
