Telegram - Conversa por Áudio (PICOCLAW)

Security checks across malware telemetry and agentic risk

Overview

This skill does what it claims for Telegram voice processing, but it needs review because it automatically processes private voice messages, sends them to outside speech services, and keeps transcripts locally with incomplete lifecycle controls.

Install only if you are comfortable with a background watcher processing Telegram voice files, uploading audio/text to Groq and Edge TTS, and storing transcripts locally. Before enabling it, define who has consented to voice processing, restrict the watched folder or chat scope, add a clear stop procedure, and decide how transcript JSON files and logs should be deleted or protected.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Rogue Agent: Self-Modification, Session Persistence
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (8)

Lp3

Medium
Category
MCP Least Privilege
Confidence
91% confidence
Finding
The skill clearly instructs use of environment variables, filesystem access, background execution, and shell commands, but it does not declare any permissions or capability boundaries. This creates an authorization and review gap: operators may approve or run a skill without understanding that it can access secrets, read/write local media, and launch persistent processes.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The documentation describes automatic deletion of user audio and metadata after a 15-day retention period, but it does not warn operators or end users about irreversible data loss or clarify whether this may remove records needed for troubleshooting, consent, or audit purposes. In a Telegram voice-processing skill, these files are user-generated content and associated metadata, so silent retention-based deletion creates a real risk of unexpected loss and weakens transparency around data handling.
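One way to make retention-based deletion visible is to log every file before it is removed and to default to a dry run. The sketch below is a minimal illustration, not the skill's actual cleanup code: the function name `purge_expired`, its parameters, and the use of file modification time as the age signal are all assumptions. Only the 15-day period comes from the finding above.

```python
import logging
import time
from pathlib import Path
from typing import List, Optional

RETENTION_DAYS = 15  # retention period stated in the finding; everything else is hypothetical


def purge_expired(folder: Path, retention_days: int = RETENTION_DAYS,
                  dry_run: bool = True, now: Optional[float] = None) -> List[Path]:
    """Return files older than the retention period; delete them unless dry_run."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds in a day
    expired = []
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            expired.append(path)
            # Log before acting so operators have a record of what was removed.
            logging.info("%s %s", "would delete" if dry_run else "deleting", path)
            if not dry_run:
                path.unlink()
    return expired
```

Running it once with `dry_run=True` lets an operator review what would be lost before any irreversible deletion takes place.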

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The skill automatically sends user audio to an external transcription provider (Groq Whisper), but the documentation does not warn users that voice content may leave the local system for third-party processing. In a messaging context, audio can contain sensitive personal or business information, so the lack of notice and consent increases privacy and compliance risk.

Missing User Warnings

Low
Confidence
84% confidence
Finding
The skill documents automatic deletion of audio files after 15 days, but it does not clearly inform users about retention duration or that their data will be removed automatically. This is primarily a transparency and data-governance issue that could cause unexpected loss of records or conflict with user expectations.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The watcher copies received audio into a persistent done directory and writes full transcriptions plus source paths and chat metadata into JSON files on disk. In a Telegram voice-processing skill, this creates a clear privacy and data-retention risk because sensitive user speech is stored locally without any visible consent, minimization, or retention controls in this component.
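A possible mitigation is to store less and to restrict who can read it. The following sketch assumes a POSIX system and a hypothetical record schema (`chat_id` and `text` only, with the source path left out). `write_transcript` is an illustrative name, not a function from the skill.

```python
import json
import os
from pathlib import Path


def write_transcript(path: Path, text: str, chat_id: int) -> None:
    """Write a transcript record with owner-only permissions (mode 0o600)."""
    # Hypothetical minimal schema: the source audio path is deliberately omitted.
    record = {"chat_id": chat_id, "text": text}
    # O_EXCL makes the call fail if the file exists, so records are never overwritten silently.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(record, handle, ensure_ascii=False)
```

Creating the file with the restrictive mode at open time avoids a window in which the transcript is briefly readable by other local users.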

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The code sends audio content to an external transcription provider using a server-side API key, which means user voice data leaves the local system and is processed by a third party. In this skill context, that is security-relevant because voice messages can contain sensitive personal or business information, and there is no visible notice, policy check, or safeguard here before transfer.
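The missing safeguard could take the form of a consent check that runs before any audio is handed to the external provider. This is a sketch under assumptions: `transcribe_if_consented`, the allowlist of chat IDs, and the injected `transcribe` callable are all hypothetical and stand in for whatever the skill actually uses to call the provider.

```python
from typing import Callable, Iterable, Optional


def transcribe_if_consented(chat_id: int, audio: bytes,
                            consented_chats: Iterable[int],
                            transcribe: Callable[[bytes], str]) -> Optional[str]:
    """Call the external transcriber only for chats on the consent allowlist."""
    if chat_id not in set(consented_chats):
        return None  # audio is not passed on, so it never leaves the local system
    return transcribe(audio)
```

Passing the transcriber in as a parameter keeps the gate independent of any particular provider client and makes the policy easy to test in isolation.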

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The script reads the entire local audio file and sends its contents to Groq's external transcription service, which creates a real data-transfer and privacy boundary crossing. In a Telegram voice-processing skill, users may reasonably expect local handling unless the network transmission is disclosed, so sensitive voice content could be transmitted off-platform without informed consent or clear operator awareness.

Session Persistence

Medium
Category
Rogue Agent
Content
### 1. Start the Watcher (Semi-automatic)
The watcher runs in the background to process new audio files:
```bash
nohup python3 /root/.picoclaw/workspace/skills/telegram-native-audio/scripts/semi_auto_watcher.py > /root/.picoclaw/workspace/skills/telegram-native-audio/scripts/semi_auto_watcher.log 2>&1 &
```

### 2. Respond to a pending audio item
Confidence
78% confidence
Finding
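The start command uses `nohup` with a trailing `&`, so the watcher keeps running in the background after the launching shell session ends.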
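A background watcher is easier to review when it ships with an explicit stop procedure. The sketch below shows one common pattern, a PID file plus a signal-driven stop flag. It is not taken from `semi_auto_watcher.py`: the names `run_watcher`, `install_stop_handlers`, `request_stop`, and the `poll_once` callback are hypothetical.

```python
import os
import signal
import threading
from pathlib import Path
from typing import Callable

stop_requested = threading.Event()


def request_stop(signum, frame) -> None:
    """Signal handler: ask the polling loop to exit after its current pass."""
    stop_requested.set()


def install_stop_handlers() -> None:
    """Route SIGTERM and SIGINT to request_stop (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, request_stop)
    signal.signal(signal.SIGINT, request_stop)


def run_watcher(pid_file: Path, poll_once: Callable[[], None],
                interval: float = 2.0) -> None:
    """Poll until a stop is requested; the PID file tells operators which process to stop."""
    pid_file.write_text(str(os.getpid()))
    try:
        while not stop_requested.is_set():
            poll_once()
            stop_requested.wait(interval)  # sleeps, but wakes early once a stop is requested
    finally:
        pid_file.unlink()  # remove the PID file so a stale PID is never targeted
```

With this shape, an operator could stop the watcher by sending SIGTERM to the PID recorded in the file, assuming `install_stop_handlers()` was called at startup.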

VirusTotal

67/67 vendors flagged this skill as clean.
