language-helper

Security checks across malware telemetry and agentic risk

Overview

This is a language-learning text-to-speech skill whose external data flows are disclosed. It can optionally deliver voice messages to Feishu, a feature that users should configure carefully.

Install only if you are comfortable sending phrases to SenseAudio for speech generation. If you enable Feishu, use a least-privilege bot, verify FEISHU_CHAT_ID before use, and avoid sensitive text because generated audio can be uploaded and posted to that chat. Keep the .env file private and avoid --debug-log for confidential content.
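
As one way to act on the advice to keep the .env file private, the sketch below checks that a file is accessible only by its owner. It assumes a POSIX system, and the helper name env_file_is_private is hypothetical; it is not part of the skill.

```python
import os
import stat


def env_file_is_private(path: str) -> bool:
    """Return True if no group/other permission bits are set (POSIX only)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group or other bit means someone besides the owner may access the file.
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

If the check fails, running chmod 600 on the file restricts it to the owner.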

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (10)

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The skill introduces Feishu credentials and a message-sending capability that are not clearly justified by its declared purpose. This expands the attack surface to third-party chat delivery and creates a risk of sending user content to external recipients or workspaces without sufficiently prominent disclosure or narrowly scoped authorization.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: This file adds Feishu chat enumeration and message-sending capabilities that materially exceed the declared purpose of a language-learning skill. In a skill that is supposed to translate text, generate pronunciation audio, and explain grammar, the ability to discover chats and send audio into arbitrary bot-accessible conversations creates an unnecessary outbound communication channel that could be abused for spam, data exfiltration, or covert messaging.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The list_chats() function enumerates all chats joined by the bot, which is not justified by the skill's stated language-learning purpose. This expands the skill's visibility into organizational messaging metadata and, when combined with send_audio_message(), enables targeting of conversations for unauthorized outbound content or reconnaissance.
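
One possible mitigation is to drop chat discovery and allow sending only to the single chat configured in FEISHU_CHAT_ID. The sketch below illustrates that allowlist idea; resolve_target_chat is a hypothetical helper, not a function from the skill.

```python
from typing import Optional


def resolve_target_chat(requested_chat_id: Optional[str],
                        configured_chat_id: Optional[str]) -> str:
    """Return the only chat that sending is allowed to target.

    configured_chat_id is assumed to come from FEISHU_CHAT_ID.
    """
    if not configured_chat_id:
        raise ValueError("FEISHU_CHAT_ID is not configured; refusing to send")
    # Reject any caller-supplied target that differs from the configured chat.
    if requested_chat_id is not None and requested_chat_id != configured_chat_id:
        raise PermissionError("target chat is not the configured FEISHU_CHAT_ID")
    return configured_chat_id
```

With a check like this in front of send_audio_message(), enumerating chats is no longer needed to pick a destination.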

Context-Inappropriate Capability

Medium

Confidence: 75%
Finding: The concat command expands the skill beyond simple language-learning TTS into a general-purpose local media manipulation tool. In agent contexts, this broader capability can be abused to process arbitrary local files and overwrite outputs, increasing the blast radius beyond the declared purpose and user expectation.

Description-Behavior Mismatch

Medium

Confidence: 87%
Finding: The skill includes Feishu chat discovery, file upload, and outbound message-sending capabilities that materially exceed a narrowly scoped language-learning/TTS workflow. Extra messaging capabilities increase the attack surface and create opportunities for unintended data disclosure or misuse if the skill is invoked in broader contexts.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: Enumerating all Feishu chats the bot has joined exposes organizational metadata such as chat IDs, names, and communication topology that is not necessary for basic language learning. In a compromised or over-permissioned deployment, this can aid targeting, privacy violations, or misuse of the bot for unsolicited messaging.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill sends user-provided text to an external TTS service but does not present an explicit privacy warning at the point of use. This is risky because language-learning prompts may contain sensitive personal or business text, and users may reasonably assume local processing unless third-party transmission is clearly disclosed.

Missing User Warnings

Medium

Confidence: 94%
Finding: Failure debug logging persists full text variants, request payloads, and HTTP response bodies to disk. Since this skill handles user-provided text and authentication headers are involved, debug artifacts can retain sensitive user content or service responses longer than intended and expose them to other local processes or users.
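
A debug record can stay useful without storing raw user text or credentials. The sketch below shows one way to redact before writing to disk; the helper name, the header list, and the record fields are all assumptions for illustration, not the skill's actual logging code.

```python
import hashlib

# Header names treated as secrets (an assumed list; extend as needed).
SENSITIVE_HEADERS = {"authorization", "x-api-key", "cookie"}


def redact_for_debug(headers: dict, text: str) -> dict:
    """Build a debug record that omits credentials and raw user text."""
    safe_headers = {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {
        "headers": safe_headers,
        # Length and a short hash help correlate failures without storing the text.
        "text_length": len(text),
        "text_sha256_prefix": digest[:12],
    }
```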

Missing User Warnings

Medium

Confidence: 90%
Finding: The ffmpeg command uses -y, which silently overwrites the output path. If an untrusted caller can influence args.output, this can destroy existing files without confirmation, which is particularly risky in an agent or automation environment where file paths may be composed indirectly.
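
A safer default is to refuse to overwrite unless the caller opts in. The sketch below builds an ffmpeg argument list that uses -n (never overwrite) by default. The function name is hypothetical, and the use of ffmpeg's concat demuxer is an assumption, since the report does not show how the skill invokes ffmpeg.

```python
import os


def build_ffmpeg_concat_cmd(list_file: str, output: str,
                            allow_overwrite: bool = False) -> list:
    """Return an ffmpeg argument list that will not clobber existing files by default."""
    if os.path.exists(output) and not allow_overwrite:
        raise FileExistsError(f"refusing to overwrite existing file: {output}")
    # -n makes ffmpeg exit instead of overwriting; -y is used only on explicit opt-in.
    overwrite_flag = "-y" if allow_overwrite else "-n"
    return ["ffmpeg", overwrite_flag, "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", output]
```

Passing the result to subprocess.run as a list, rather than as a shell string, also avoids shell interpretation of file names.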

Missing User Warnings

Medium

Confidence: 84%
Finding: The code uploads synthesized audio derived from user text to Feishu and sends it externally without any explicit notice, confirmation, or policy check at the send point. Because language-learning content may contain private phrases, names, or sensitive business text, silent transmission to a third-party platform creates a real privacy and data-handling risk.
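
An explicit confirmation at the send point could look like the sketch below. confirm_send is a hypothetical helper; the ask parameter defaults to the built-in input so the prompt can be replaced in non-interactive or test settings.

```python
def confirm_send(chat_id: str, text: str, ask=input) -> bool:
    """Ask the user before uploading audio; anything other than y/yes means no."""
    prompt = (
        f"Upload audio generated from {len(text)} characters of text "
        f"to Feishu chat {chat_id}? [y/N] "
    )
    answer = ask(prompt)
    return answer.strip().lower() in {"y", "yes"}
```

The upload and send calls would then run only when confirm_send returns True.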

VirusTotal

66/66 vendors flagged this skill as clean.
