speech-translation

Security checks across malware telemetry and agentic risk

Overview

This speech-translation skill does what it claims, but its optional notification features can run unrestricted local shell commands and pass sensitive speech-derived text to external processes or services.

Install only if you are comfortable reviewing and controlling the command hooks yourself. Avoid --transcript-command, --translation-command, --audio-command, VOICE_TRANSLATE_TEXT_COMMAND_TEMPLATE, and VOICE_TRANSLATE_AUDIO_COMMAND_TEMPLATE unless the exact command is trusted. Prefer local/mock or agent-file translation for sensitive audio, and use service translation only with a trusted endpoint because transcript text is sent to that service.
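One way to act on this advice is a small pre-flight check that reports which command hooks are active before the skill runs. The sketch below is illustrative only: the flag and environment-variable names come from this report, while `find_active_hooks` is a hypothetical helper, not part of the skill.

```python
from typing import List, Mapping, Sequence

# Names taken from the report above; everything else here is an assumption.
HOOK_ENV_VARS = (
    "VOICE_TRANSLATE_TEXT_COMMAND_TEMPLATE",
    "VOICE_TRANSLATE_AUDIO_COMMAND_TEMPLATE",
)
HOOK_FLAGS = ("--transcript-command", "--translation-command", "--audio-command")


def find_active_hooks(argv: Sequence[str], env: Mapping[str, str]) -> List[str]:
    """Return the hook env vars and CLI flags that are set (hypothetical helper)."""
    # Environment variables count as active only when set to a non-empty value.
    found = [name for name in HOOK_ENV_VARS if env.get(name)]
    for arg in argv:
        # Accept both "--flag value" and "--flag=value" spellings.
        flag = arg.split("=", 1)[0]
        if flag in HOOK_FLAGS:
            found.append(flag)
    return found
```

Calling `find_active_hooks(sys.argv[1:], os.environ)` before a run and refusing to continue on a non-empty result would make hook use an explicit decision.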

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (15)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
print(command)
        return

    subprocess.run(command, shell=True, check=True)


if __name__ == "__main__":
Confidence
96% confidence
Finding
subprocess.run(command, shell=True, check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
"No command template provided. Use --command-template or set VOICE_TRANSLATE_TEXT_COMMAND_TEMPLATE."
        )

    subprocess.run(args.command_template, input=message.encode("utf-8"), shell=True, check=True)


if __name__ == "__main__":
Confidence
97% confidence
Finding
subprocess.run(args.command_template, input=message.encode("utf-8"), shell=True, check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def _run_text_command(self, command: str | None, text: str) -> None:
        if not command:
            return
        subprocess.run(command, input=text.encode("utf-8"), shell=True, check=True)

    def _run_audio_command(self, command: str | None, audio_file: Path) -> None:
        if not command:
Confidence
97% confidence
Finding
subprocess.run(command, input=text.encode("utf-8"), shell=True, check=True)
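For comparison, the text-piping calls flagged above could run without a shell. This is a minimal sketch, not the skill's code: `build_argv` and `run_text_hook` are hypothetical names, and it assumes hook commands use POSIX-style quoting.

```python
import shlex
import subprocess
from typing import List, Optional


def build_argv(command: str) -> List[str]:
    """Split a hook command into an argument vector using POSIX quoting rules."""
    argv = shlex.split(command)
    if not argv:
        raise ValueError("hook command is empty")
    return argv


def run_text_hook(command: Optional[str], text: str) -> None:
    """Pipe text to the hook on stdin; no shell interprets the command string."""
    if not command:
        return
    # Passing a list leaves shell=False, so ';', '|', '$(...)' and similar
    # characters in the command are plain argument text, not shell syntax.
    subprocess.run(build_argv(command), input=text.encode("utf-8"), check=True)
```

The trade-off is that pipes and redirection inside the hook string stop working. An operator who needs them would have to point the hook at a reviewed wrapper script.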

subprocess module call

Medium
Category
Dangerous Code Execution
Content
resolved = command.format(audio_file=str(audio_file))
        else:
            resolved = f'{command} "{audio_file}"'
        subprocess.run(resolved, shell=True, check=True)

    def notify_transcript(self, text: str) -> None:
        self._run_text_command(self.transcript_command, text)
Confidence
99% confidence
Finding
subprocess.run(resolved, shell=True, check=True)
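The excerpt above interpolates the audio path into a string that the shell then parses, so a file name containing quotes or `;` could change the command. A hedged alternative is to split the template first and substitute the path into individual arguments. The names `build_audio_argv` and `run_audio_hook` are hypothetical; only the `{audio_file}` placeholder comes from the excerpt.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List

PLACEHOLDER = "{audio_file}"  # placeholder name as seen in the flagged excerpt


def build_audio_argv(command: str, audio_file: Path) -> List[str]:
    """Build an argument vector, inserting the path as data rather than shell text."""
    argv = shlex.split(command)
    if not argv:
        raise ValueError("hook command is empty")
    if any(PLACEHOLDER in token for token in argv):
        # Substitute after splitting, so the path can never add new arguments.
        return [token.replace(PLACEHOLDER, str(audio_file)) for token in argv]
    # No placeholder: append the path as one final argument.
    return argv + [str(audio_file)]


def run_audio_hook(command: str, audio_file: Path) -> None:
    """Run the hook without a shell (sketch; assumes POSIX-style quoting)."""
    subprocess.run(build_audio_argv(command, audio_file), check=True)
```

With this approach a path such as `out"; touch pwned; ".wav` reaches the hook as a single, odd-looking file name instead of being executed.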

Lp3

Medium
Category
MCP Least Privilege
Confidence
88% confidence
Finding
The skill advertises and instructs use of shell execution, file I/O, network access, and environment-dependent tooling, yet declares no permissions. That mismatch can cause the agent platform to run a capability-rich workflow without informed consent, sandboxing, or policy review, increasing the chance of unsafe execution paths being exposed.

Tp4

High
Category
MCP Tool Poisoning
Confidence
97% confidence
Finding
The description omits or downplays dangerous behaviors while the workflow supports arbitrary shell command templates for notifications and external HTTP translation services. This can mislead users or orchestrators into approving an audio-translation skill that actually enables command execution and data exfiltration of transcripts/audio to local shells or remote services, which is especially sensitive because voice data may contain private information.

Context-Inappropriate Capability

Medium
Confidence
87% confidence
Finding
This helper exposes arbitrary command execution through a configurable command template, which goes beyond narrowly sending an audio artifact and effectively turns the script into a generic command launcher. In an agent skill context, that increases abuse potential because automation may pass environment-controlled parameters without the operator realizing they enable arbitrary code execution.

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
This helper exposes an arbitrary command-execution hook for 'sending' text, which exceeds a narrowly scoped speech-translation utility and creates an unnecessary expansion of capability. In a skill that may process untrusted audio/transcript content and run unattended, such extensibility increases the blast radius and can be repurposed for executing unrelated system commands.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
The CLI exposes user-controlled hooks to run arbitrary commands after transcript, translation, and audio generation. In a skill meant for speech translation, this creates a built-in execution primitive that can be wired to untrusted audio-derived content or file paths, enabling command execution or unsafe automation outside the stated core purpose.

Context-Inappropriate Capability

Medium
Confidence
88% confidence
Finding
The configuration includes generic command-execution hooks (`transcript_command`, `translation_command`, and `audio_command`) that allow runtime behavior to be delegated to arbitrary shell commands. In an audio translation skill, this creates a command-injection and unsafe-execution surface if any untrusted input, config file, or user-controlled parameter can influence these values or their arguments.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
This notifier exposes arbitrary external command execution even though the stated skill purpose is transcription, translation, and speech synthesis. That broad capability expands the attack surface and allows the skill to act as a general command launcher or data exfiltration channel, especially for transcript and audio outputs. The mismatch between stated purpose and implemented power is security-relevant in an agent skill.

Missing User Warnings

Medium
Confidence
89% confidence
Finding
The documentation explicitly supports sending transcript text to an external HTTP translation service but does not warn that spoken content may contain sensitive data and will leave the local environment. In a speech-translation skill, transcripts often include private conversations, credentials, or personal information, so omission of privacy and trust-boundary guidance can cause unintended disclosure.

Missing User Warnings

Medium
Confidence
87% confidence
Finding
The notification hook documentation describes piping transcript and translation text to external commands and passing audio paths to helper processes without warning that sensitive content may be exposed to other local programs, logs, wrappers, or remote integrations those commands invoke. Because this skill processes speech-derived text and audio, these hooks extend the data exposure surface and can leak user content beyond the intended pipeline.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The code can send transcripts, translations, and audio-derived paths to external shell commands without any user-facing warning, confirmation, or transparency. In a speech translation workflow, these outputs may contain sensitive personal, business, or regulated information, so silent onward transmission materially increases privacy and security risk. The skill context makes undisclosed export of speech content particularly concerning.
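A confirmation step of the kind this finding says is missing might look like the sketch below. It is an assumption about one possible mitigation, not the skill's behavior; `confirm_export` and its parameters are hypothetical.

```python
from typing import Callable


def confirm_export(destination: str, text: str, ask: Callable[[str], str] = input) -> bool:
    """Disclose where speech-derived text is going and require an explicit yes."""
    prompt = (
        f"About to send {len(text)} characters of speech-derived text "
        f"to {destination!r}. Proceed? [y/N] "
    )
    # Anything other than an explicit "y" or "yes" is treated as a refusal.
    return ask(prompt).strip().lower() in {"y", "yes"}
```

The `ask` parameter defaults to `input`, so an unattended run would need a deliberate policy in its place, such as always refusing.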

Missing User Warnings

Medium
Confidence
95% confidence
Finding
This code can send full transcript text to an arbitrary external HTTP endpoint without any built-in disclosure, consent flow, or trust restriction. In a speech-translation skill, transcripts may contain sensitive spoken content, so silent transmission creates a real confidentiality risk, especially if the configured service is third-party or attacker-controlled.
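The trust restriction this finding refers to could be approximated with a host allowlist checked before any transcript is sent. The sketch below uses only the standard library. `is_trusted_endpoint` and the example host name are hypothetical and do not describe how the skill is configured.

```python
from typing import AbstractSet
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with hosts the operator actually trusts.
TRUSTED_HOSTS = frozenset({"translate.internal.example"})
LOOPBACK_HOSTS = frozenset({"localhost", "127.0.0.1", "::1"})


def is_trusted_endpoint(url: str, trusted_hosts: AbstractSet[str] = TRUSTED_HOSTS) -> bool:
    """Allow loopback over http(s); require https and an exact host match otherwise."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in LOOPBACK_HOSTS:
        return parts.scheme in {"http", "https"}
    # Exact match only: a trusted name used as a prefix of another domain fails.
    return parts.scheme == "https" and host in trusted_hosts
```

Matching on the parsed `hostname` rather than on the raw string avoids being fooled by URLs that carry a trusted name in the userinfo part or as a subdomain prefix of another domain.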

VirusTotal

All 64 of 64 vendors reported this skill as clean.
