Feishu Voice Clone TTS Skill

Security checks across malware telemetry and agentic risk

Overview

This skill does what it claims: it turns text into speech with Volcengine and sends the audio to Feishu. Because text and audio leave the local machine, treat the external transmission and voice-cloning features with care.

Install only if you are comfortable sending the spoken text to Volcengine and the generated audio to Feishu. Keep VOLC_TTS_URL unset or restricted to the official trusted endpoint, use least-privileged Feishu credentials, protect any ~/.volcengine_key file, confirm the target chat before sending, and use cloned voices only with proper consent.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

Tainted flow: 'VOLC_TTS_URL' from os.environ.get (line 22, credential/environment) → requests.post (network output)

Critical

Category: Data Flow
Content:
    }
    print(f"Generating voice for: {text}")
    response = requests.post(VOLC_TTS_URL, headers=headers, json=payload, timeout=60)
    if response.ok:
        result = response.json()
Confidence: 97%
Finding: response = requests.post(VOLC_TTS_URL, headers=headers, json=payload, timeout=60)
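
One way to contain this flow is to validate the environment-supplied URL before any request is made. The sketch below is illustrative only and is not the skill's actual code: ALLOWED_TTS_HOSTS and resolve_tts_url are hypothetical names, and the host shown is a placeholder to be replaced with the endpoint Volcengine documents.

```python
from urllib.parse import urlparse
import os

# Placeholder allowlist (assumption): replace with the host documented by Volcengine.
ALLOWED_TTS_HOSTS = {"tts.example.com"}


def resolve_tts_url(default_url, env=os.environ):
    """Return the TTS endpoint, rejecting any override outside the allowlist."""
    # An unset or empty VOLC_TTS_URL falls back to the built-in default.
    url = env.get("VOLC_TTS_URL") or default_url
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError(f"TTS endpoint must use https: {url!r}")
    if parsed.hostname not in ALLOWED_TTS_HOSTS:
        raise ValueError(f"TTS endpoint host is not allowlisted: {parsed.hostname!r}")
    return url
```

Calling resolve_tts_url once at startup and passing its result to requests.post means a tampered environment variable fails loudly instead of silently redirecting the text and credentials.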

Context-Inappropriate Capability

Medium

Confidence: 80%
Finding: The skill reads a local fallback secret file from ~/.volcengine_key, introducing an additional secret source that may bypass expected secret-management controls. In agent environments, undeclared local secret-file access increases the chance of unreviewed credential use or accidental cross-context secret exposure.
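
If the fallback file is kept, a permission check before reading it reduces the exposure described here. This is a minimal sketch assuming a POSIX system, where the mode bits are meaningful; read_key_file is a hypothetical helper, not a function the skill provides.

```python
import stat
from pathlib import Path


def read_key_file(path=None):
    """Read the fallback key only if the file is private to its owner."""
    # Default location taken from the finding above.
    path = Path(path) if path is not None else Path.home() / ".volcengine_key"
    mode = stat.S_IMODE(path.stat().st_mode)
    # Any group or other permission bit means the secret is not private.
    if mode & 0o077:
        raise PermissionError(
            f"{path} is accessible to group/others (mode {mode:o}); run chmod 600"
        )
    return path.read_text(encoding="utf-8").strip()
```

Refusing to read a loosely permissioned file mirrors what tools such as ssh do for private keys, and makes the extra secret source visible rather than silent.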

Missing User Warnings

Medium

Confidence: 90%
Finding: The README instructs users to send text to Volcengine TTS and then transmit the resulting audio to Feishu, but it does not clearly disclose that message content and derived voice/audio leave the local environment and are processed by third-party services. Because the skill also supports cloned voices, the lack of warnings about consent, impersonation, privacy, and regulatory risks can lead users to misuse the skill or expose sensitive content unintentionally.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill requires sensitive credentials and sends user-provided text to Volcengine for TTS and then delivers the resulting audio to Feishu, but it does not disclose these third-party data flows or the side effect of sending messages into chats. This can lead users to unknowingly transmit sensitive content or trigger unintended outbound communications, which is a real security and privacy risk in an agent context.
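
A confirmation step that names both destinations would address the undisclosed side effect. The following is a sketch under stated assumptions: confirm_send and its parameters are hypothetical, and the ask callable defaults to input so that an agent runtime can substitute its own approval mechanism.

```python
def confirm_send(text, chat_id, ask=input):
    """Describe the outbound data flows and return True only on an explicit 'yes'."""
    notice = (
        "This action will:\n"
        f"  1. send {len(text)} characters of text to Volcengine for speech synthesis\n"
        f"  2. post the generated audio to Feishu chat {chat_id}\n"
        "Type 'yes' to continue: "
    )
    # Anything other than an explicit 'yes' is treated as a refusal.
    return ask(notice).strip().lower() == "yes"
```

Showing the chat identifier in the prompt also gives the user a chance to catch a wrong target before any message is sent.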

Missing User Warnings

Medium

Confidence: 94%
Finding: User-provided text is sent to an external TTS provider without any consent prompt, privacy notice, or content sensitivity check. If users assume local processing, confidential or regulated text could be exposed to a third party unexpectedly.
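
A lightweight content sensitivity check could run before the text leaves the machine. The patterns below are illustrative assumptions rather than a complete detector, and flag_sensitive is a hypothetical name; a real deployment would tune the rules to its own data.

```python
import re

# Illustrative patterns only (assumption); not an exhaustive detector.
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "long digit run": re.compile(r"\b\d{12,19}\b"),
    "credential keyword": re.compile(
        r"\b(password|secret|api[_-]?key|token)\b", re.IGNORECASE
    ),
}


def flag_sensitive(text):
    """Return the label of every pattern that matches the text."""
    return [
        label
        for label, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(text)
    ]
```

A non-empty result can be used to block the request or to require explicit confirmation before the text is sent to the TTS provider.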

VirusTotal

VirusTotal findings are pending for this skill version.
