smart-voice-reply

Security checks across malware telemetry and agentic risk

Overview

The skill provides voice replies as advertised, but its setup can persistently change the agent’s future behavior and under-discloses local logging and outbound media handling.

Install only if you intentionally want DashScope-based voice replies. Review any proposed USER.md edits before applying them, and avoid enabling always-on voice replies unless that is truly desired. Keep DASHSCOPE_API_KEY in an environment variable or secret storage. Avoid synthesizing sensitive text, because request data and audio metadata may be stored locally and sent to DashScope.
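
The advice to keep DASHSCOPE_API_KEY out of prompts and documents can be illustrated with a minimal sketch. It assumes the key is supplied through the process environment. The helper names load_dashscope_key and redact are hypothetical and are not part of the skill or of any DashScope SDK.

```python
import os


def load_dashscope_key(env=None):
    """Return the API key from the environment, or raise if it is unset.

    Hypothetical helper: the skill's real loading logic is not shown in the report.
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; export it or use a secret store"
        )
    return key


def redact(secret, visible=4):
    """Mask a secret for log output, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging redact(key) rather than the key itself keeps a live secret out of local logs, which matters here because the report notes that request data may be stored locally.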

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (7)

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The installation guide instructs the agent to invoke another skill to create this skill and to alter the current agent's USER.md, which expands the skill's authority beyond voice reply into self-propagation and persistent configuration changes. This is dangerous because it can modify agent behavior without clear user consent, creating a supply-chain-style persistence mechanism inside the agent configuration.

Context-Inappropriate Capability

Low

Confidence: 87%
Finding: Requiring configuration of DASHSCOPE_API_KEY introduces credential handling and external service dependency that are not transparently justified in the narrow skill description. This increases risk of secret exposure or overbroad deployment, especially if users are pushed to add credentials during installation without understanding storage, scope, or necessity.

Vague Triggers

Medium

Confidence: 83%
Finding: The invocation text is broad enough to trigger on generic voice-related requests, which can cause the skill to run in situations the user did not clearly intend. In this skill, that matters because execution can synthesize media and send it to a user, creating unintended actions and data handling beyond a simple informational response.

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill instructs the agent to send generated voice media to a user and explicitly says not to return a user-facing success message, reducing transparency around a side-effecting outbound transmission. That is risky because users may not understand that content was synthesized and sent, and the hidden send behavior could be abused for unauthorized or surprising media delivery.

Missing User Warnings

Medium

Confidence: 96%
Finding: The guide directs modification of USER.md to load the skill automatically and force voice output on every reply, but does not warn that this persistently changes the current agent's behavior. Hidden or under-disclosed prompt/configuration changes are dangerous because they can silently alter future interactions, override user expectations, and make the skill effectively persistent beyond the immediate task.
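
One way to make a USER.md change visible before it takes effect is to show the user a diff and apply it only on explicit approval. The sketch below uses Python's standard difflib module. The function names preview_edit and resolve_edit and the approve callback are hypothetical and do not describe how the skill or the agent currently behaves.

```python
import difflib


def preview_edit(current_text, proposed_text, filename="USER.md"):
    """Return a unified diff of the proposed change, or '' if nothing changes."""
    diff = difflib.unified_diff(
        current_text.splitlines(keepends=True),
        proposed_text.splitlines(keepends=True),
        fromfile=filename + " (current)",
        tofile=filename + " (proposed)",
    )
    return "".join(diff)


def resolve_edit(current_text, proposed_text, approve):
    """Return the text that should be written back to the file.

    `approve` is a caller-supplied function that receives the diff and
    returns True only if the user explicitly accepts the change.
    """
    diff = preview_edit(current_text, proposed_text)
    if diff and approve(diff):
        return proposed_text
    return current_text
```

Because resolve_edit returns the original text unless approval is given, a persistent instruction such as forcing voice output on every reply cannot be added silently.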

Missing User Warnings

Medium

Confidence: 91%
Finding: Telling users to configure DASHSCOPE_API_KEY without any warning about credential sensitivity omits essential security context. This can lead to unsafe sharing, insecure storage in prompts or documents, or accidental exposure of a live API secret during setup.

Missing User Warnings

Medium

Confidence: 86%
Finding: The CLI transmits user-provided text and optional instructions to a remote third-party TTS service without any explicit disclosure at the point of use. This can expose sensitive or private content if users assume synthesis is local, especially in an agent-skill context where text may contain personal or confidential information.
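
A point-of-use disclosure could take the form of a small gate in front of the network call. The sketch below is illustrative only: the patterns are a rough heuristic rather than a complete sensitive-data detector, and send and confirm are hypothetical callbacks standing in for the skill's actual TTS request and the agent's user prompt.

```python
import re

# Rough, illustrative patterns only; real detection would need more care.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{13,19}\b"),  # long digit runs, e.g. card-like numbers
    re.compile(r"(?i)\b(password|passwd|api[_-]?key|secret)\b"),
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email-like strings
]


def find_sensitive(text):
    """Return the patterns that match the text (empty list if none)."""
    return [p.pattern for p in SENSITIVE_PATTERNS if p.search(text)]


def synthesize_with_disclosure(text, send, confirm):
    """Tell the user the text leaves the machine, then send only if confirmed.

    `send` performs the remote request; `confirm` shows the notice to the
    user and returns True or False. Both are supplied by the caller.
    """
    notice = f"This will send {len(text)} characters to a remote TTS service."
    if find_sensitive(text):
        notice += " The text appears to contain sensitive data."
    if not confirm(notice):
        return None
    return send(text)
```

Returning None when the user declines means no text is transmitted, which addresses the case where a user assumes synthesis is local.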

VirusTotal

64 of 64 vendors rated this skill as clean.
