Tts Responder
Security checks across static analysis, malware telemetry, and agentic risk
Overview
This skill is coherent for text-to-speech Telegram replies, but users should note that it runs local audio tools, uses optional Telegram bot credentials, and sends generated audio plus a short text caption to Telegram.
Before installing, confirm you trust the local Piper/ffmpeg setup, understand that Telegram voice replies send conversation content to Telegram, and configure BOT_TOKEN and CHAT_ID only for the intended bot and chat.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The skill can run local audio conversion commands and create audio files on the machine.
The skill invokes local command-line tools to synthesize and convert audio. This is central to the stated TTS purpose and not suspicious by itself, but it means the agent can run those local tools when the skill is used.
piper --model "$VOICE" --output_file "$OUTPUT_WAV" ... ffmpeg -y -i "$OUTPUT_WAV" ... "$OUTPUT_OGG"
Install Piper and ffmpeg only from trusted sources, and use the skill only for text you are comfortable converting to audio.
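One way to act on this recommendation is to confirm which binaries the skill would actually invoke. The sketch below is not part of the reviewed skill; `check_tools` is a hypothetical helper name, and it only reports where each tool resolves on PATH so the install location can be compared with the expected one.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper; not part of the reviewed skill.
# Prints where each named tool resolves on PATH and fails if any is missing.
check_tools() {
  local missing=0 tool path
  for tool in "$@"; do
    if path="$(command -v "$tool")"; then
      printf '%s -> %s\n' "$tool" "$path"
    else
      printf '%s: not found on PATH\n' "$tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_tools piper ffmpeg
```

A resolved path outside the expected install directory is a reason to stop and check the setup before enabling the skill.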
If BOT_TOKEN and CHAT_ID are set, the skill can send messages as that Telegram bot to the configured chat.
The script uses a Telegram bot token and chat ID if present. That is expected for sending Telegram audio, but the registry metadata lists no required env vars or primary credential.
if [[ -n "${BOT_TOKEN:-}" && -n "${CHAT_ID:-}" ]]; then
curl -s -X POST "https://api.telegram.org/bot${BOT_TOKEN}/sendVoice"
Use a bot token with only the permissions you need, keep it private, and confirm the CHAT_ID points to the intended chat.
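A guard along the following lines could make the "intended chat" check explicit before anything is sent. This is a minimal sketch under stated assumptions: `EXPECTED_CHAT_ID` and `telegram_target_ok` are hypothetical names introduced here, and the skill itself defines neither.

```shell
#!/usr/bin/env bash
# Hypothetical guard; EXPECTED_CHAT_ID is an assumed variable, not defined by the skill.
telegram_target_ok() {
  # Both credentials must be set, mirroring the condition in the excerpt above.
  [ -n "${BOT_TOKEN:-}" ] && [ -n "${CHAT_ID:-}" ] || return 1
  # Succeed only when the chat matches the one the operator intended.
  [ "${CHAT_ID}" = "${EXPECTED_CHAT_ID:-}" ]
}

# Usage: telegram_target_ok || echo "Telegram target not confirmed; skipping upload" >&2
```

Keeping the expected chat ID in a separate variable means a mistyped or stale CHAT_ID fails closed instead of delivering audio to the wrong chat.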
Response text may be transmitted to Telegram as audio and partially as a caption when voice mode is enabled.
The generated audio file and a short caption derived from the response text are uploaded to Telegram. This matches the skill description, but it is an external data flow users should understand.
-F "voice=@${OUTPUT_OGG}" \
-F "caption=${TEXT:0:100}..."
Avoid enabling voice replies for sensitive conversations unless you are comfortable sending that content through Telegram.
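To see exactly what the caption expression in the excerpt would transmit, the same Bash substring expansion can be run locally. The helper name `caption_preview` is hypothetical and used only for illustration.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): reproduces the caption expression
# from the excerpt so the transmitted text can be inspected locally.
caption_preview() {
  local TEXT="$1"
  # ${TEXT:0:100} keeps at most the first 100 characters; "..." is always appended.
  printf '%s\n' "${TEXT:0:100}..."
}

caption_preview "Hello there"
# prints: Hello there...
```

The first 100 characters of a response are sent as plain text in addition to the audio, so any sensitive content near the start of a reply reaches Telegram in readable form.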
The first use may fetch an external voice model dependency.
The skill relies on Piper voice models that are downloaded automatically on first use. This is normal for TTS tooling, but the artifact does not specify the model source or pin a model version.
Voice models are downloaded automatically on first use (about 50 MB).
Use trusted Piper model sources and consider pinning or preinstalling the intended voice model in controlled environments.
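Pinning can be approximated by recording a SHA-256 digest for a preinstalled model and checking it before use. The following is a sketch, not something the skill provides: `verify_model` is a hypothetical helper, and it assumes `sha256sum` from GNU coreutils is available (on macOS, `shasum -a 256` is the usual substitute).

```shell
#!/usr/bin/env bash
# Hypothetical helper: compares a model file's SHA-256 digest with a pinned value.
# Assumes sha256sum (GNU coreutils) is installed.
verify_model() {
  local file="$1" expected="$2" actual
  # A missing file can never match the pinned digest.
  [ -f "$file" ] || return 1
  actual="$(sha256sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Usage (path and digest are placeholders, not values from the skill):
#   verify_model "/path/to/voice.onnx" "<pinned sha256 digest>" || exit 1
```

The pinned digest should come from a source trusted independently of the download itself, otherwise the check only confirms that the file matches whatever was fetched.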
