Telegram - Conversa por Áudio (PICOCLAW)

Security checks across malware telemetry and agentic risk

Overview

This skill does what it claims for Telegram voice processing, but it needs review because it automatically processes private voice messages, sends them to outside speech services, and keeps transcripts locally with incomplete lifecycle controls.

Install only if you are comfortable with a background watcher processing Telegram voice files, uploading audio/text to Groq and Edge TTS, and storing transcripts locally. Before enabling it, define who has consented to voice processing, restrict the watched folder or chat scope, add a clear stop procedure, and decide how transcript JSON files and logs should be deleted or protected.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Rogue Agent: Self-Modification, Session Persistence
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (8)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill clearly instructs use of environment variables, filesystem access, background execution, and shell commands, but it does not declare any permissions or capability boundaries. This creates an authorization and review gap: operators may approve or run a skill without understanding that it can access secrets, read/write local media, and launch persistent processes.

Missing User Warnings

Medium

Confidence: 88%
Finding: The documentation describes automatic deletion of user audio and metadata after a 15-day retention period, but it does not warn operators or end users about irreversible data loss or clarify whether this may remove records needed for troubleshooting, consent, or audit purposes. In a Telegram voice-processing skill, these files are user-generated content and associated metadata, so silent retention-based deletion creates a real risk of unexpected loss and weakens transparency around data handling.
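One way to make retention-based deletion less surprising is to offer a dry run that lists what would be removed before anything is deleted. The sketch below assumes the 15-day period named in the finding and uses file modification time as the age signal. The function names and folder layout are hypothetical and are not taken from the skill's code.

```python
import time
from pathlib import Path
from typing import List, Optional

RETENTION_DAYS = 15      # retention period named in the finding
SECONDS_PER_DAY = 86400


def find_expired(folder: Path, now: Optional[float] = None,
                 days: int = RETENTION_DAYS) -> List[Path]:
    """Return files in `folder` whose modification time is older than `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def purge(folder: Path, dry_run: bool = True,
          now: Optional[float] = None) -> List[Path]:
    """Report expired files; delete them only when dry_run is False."""
    expired = find_expired(folder, now=now)
    for path in expired:
        if dry_run:
            print(f"would delete: {path}")
        else:
            path.unlink()
            print(f"deleted: {path}")
    return expired
```

Running `purge(folder)` with the default `dry_run=True` lets an operator review the list and keep any records needed for troubleshooting or audit before calling it again with `dry_run=False`.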

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill sends user audio to an external transcription provider (Groq Whisper) automatically, but the documentation does not warn users that voice content may leave the local system for third-party processing. In a messaging context, audio can contain sensitive personal or business information, so lack of notice and consent increases privacy and compliance risk.

Missing User Warnings

Low

Confidence: 84%
Finding: The skill's documentation mentions automatic deletion of audio files after 15 days, but it does not clearly tell end users how long their data is retained or that it will be removed automatically. This is primarily a transparency and data-governance issue that could cause unexpected loss of records or conflict with user expectations.

Missing User Warnings

Medium

Confidence: 91%
Finding: The watcher copies received audio into a persistent done directory and writes full transcriptions plus source paths and chat metadata into JSON files on disk. In a Telegram voice-processing skill, this creates a clear privacy and data-retention risk because sensitive user speech is stored locally without any visible consent, minimization, or retention controls in this component.
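A minimal sketch of data minimization for the transcript files, assuming the operator decides that only the transcript text and a timestamp need to be kept. The field names (`transcript`, `created_at`) and both helper functions are hypothetical, since the report does not show the watcher's actual JSON schema. The file is created with owner-only permissions so other local accounts cannot read it.

```python
import json
import os
from pathlib import Path

# Hypothetical allow-list; the watcher's real field names are not shown in the report.
ALLOWED_FIELDS = {"transcript", "created_at"}


def minimize(record: dict) -> dict:
    """Drop every field that is not explicitly allowed (e.g. source paths, chat metadata)."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


def write_private_json(path: Path, record: dict) -> None:
    """Write the minimized record to a file created with mode 0o600 (owner read/write)."""
    # The mode only applies when the file is created; an existing file keeps its permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(minimize(record), handle, ensure_ascii=False)
```

An allow-list is used rather than a deny-list so that any field added to the record later stays out of the stored file until someone deliberately permits it.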

Missing User Warnings

Medium

Confidence: 94%
Finding: The code sends audio content to an external transcription provider using a server-side API key, which means user voice data leaves the local system and is processed by a third party. In this skill context, that is security-relevant because voice messages can contain sensitive personal or business information, and there is no visible notice, policy check, or safeguard here before transfer.

Missing User Warnings

Medium

Confidence: 90%
Finding: The script reads the entire local audio file and sends its contents to Groq's external transcription service, which creates a real data-transfer and privacy boundary crossing. In a Telegram voice-processing skill, users may reasonably expect local handling unless the network transmission is disclosed, so sensitive voice content could be transmitted off-platform without informed consent or clear operator awareness.
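The external-transmission findings above could be addressed with a consent check that runs before any audio leaves the machine. The sketch below is an illustration under assumptions: `CONSENTED_CHATS`, `transcribe_with_consent`, and the `upload` callable are hypothetical names, and `upload` stands in for whatever function the skill uses to call the transcription provider, whose real API is not reproduced here.

```python
from typing import Callable, Optional, Set

# Hypothetical allow-list of chat identifiers whose participants agreed to
# third-party transcription. How consent is collected is left to the operator.
CONSENTED_CHATS: Set[str] = set()


def transcribe_with_consent(
    chat_id: str,
    audio_bytes: bytes,
    upload: Callable[[bytes], str],
    consented: Optional[Set[str]] = None,
) -> Optional[str]:
    """Call `upload` only for chats on the consent list; otherwise send nothing."""
    allowed = CONSENTED_CHATS if consented is None else consented
    if chat_id not in allowed:
        # No consent recorded: the audio stays local and no transcript is produced.
        return None
    return upload(audio_bytes)
```

Because the default list is empty, nothing is uploaded until an operator adds a chat explicitly, which makes the data transfer an opt-in decision rather than a side effect of installation.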

Session Persistence

Medium

Category: Rogue Agent
Content:
### 1. Start the Watcher (Semi-automatic)
The watcher runs in the background to process new audio files:
```bash
nohup python3 /root/.picoclaw/workspace/skills/telegram-native-audio/scripts/semi_auto_watcher.py > /root/.picoclaw/workspace/skills/telegram-native-audio/scripts/semi_auto_watcher.log 2>&1 &
```
### 2. Reply to a pending audio item
Confidence: 78%
Finding: The documented start command uses nohup to launch the watcher as a background process that keeps running after the shell session ends, with its output redirected to a log file.
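A background process is easier to review when it comes with an equally explicit way to stop it. The sketch below assumes the watcher could record its process ID in a file at startup so that an operator can later terminate it. The helper names and the PID file are hypothetical; the report does not indicate that the skill provides anything like this.

```python
import os
import signal
from pathlib import Path


def write_pid(pid_file: Path) -> None:
    """Record the current process ID; intended to be called once when the watcher starts."""
    pid_file.write_text(str(os.getpid()), encoding="utf-8")


def stop_watcher(pid_file: Path) -> bool:
    """Send SIGTERM to the recorded process. Return True if a signal was sent."""
    try:
        pid = int(pid_file.read_text(encoding="utf-8").strip())
    except (FileNotFoundError, ValueError):
        # No PID file, or its contents are not a number: nothing to stop.
        return False
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        # The recorded process is already gone; remove the stale file.
        pid_file.unlink(missing_ok=True)
        return False
    pid_file.unlink(missing_ok=True)
    return True
```

A PID file is only one option. Running the watcher under a service manager would give the same start and stop controls along with logging, at the cost of extra setup.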

VirusTotal

67/67 vendors flagged this skill as clean.
