Voice.ai: Creator Voiceover Forge

Security checks across malware telemetry and agentic risk

Overview

This is a coherent Voice.ai voiceover tool, but its generated ffmpeg helper scripts can turn specially crafted media file paths into shell commands if a user runs those scripts.

Install only if you are comfortable sending script text to Voice.ai and storing generated outputs that may include the original script. Avoid running the generated ffmpeg helper scripts on media files with untrusted or unusual filenames; install ffmpeg and use the direct CLI path where possible, or inspect/quote the generated scripts before running them.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep

Findings (7)

Missing User Warnings

Medium

Confidence: 92%
Finding: The documentation describes sending arbitrary `text` to remote Voice.ai TTS endpoints but does not clearly warn that script content will leave the local environment and be transmitted to a third-party service. In a voiceover pipeline, users may submit unpublished scripts, sensitive internal copy, or personal data, so the omission can lead to unintended data disclosure through normal use.

Missing User Warnings

Medium

Confidence: 87%
Finding: The code sends arbitrary user-supplied text to the external Voice.ai `/tts/speech` API, which is a real third-party data transfer. Even though this is expected for a TTS integration, the file contains no consent gate, sensitivity check, redaction, or disclosure mechanism, so users may unknowingly transmit confidential or regulated content off-platform.
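
One way to address the missing consent gate is to ask for confirmation before any text is transmitted. The following is a minimal sketch, not the skill's actual code: the function name `confirm_remote_tts` and its parameters are hypothetical, and the prompt wording is only an illustration.

```python
from typing import Callable


def confirm_remote_tts(
    text: str,
    ask: Callable[[str], str] = input,  # injectable so the gate can be tested
    service: str = "Voice.ai",
) -> bool:
    """Return True only if the user explicitly agrees to send text off-machine.

    Hypothetical helper: the real skill exposes no such function.
    """
    prompt = (
        f"{len(text)} characters of script text will be sent to {service} "
        "for speech synthesis. Continue? [y/N] "
    )
    answer = ask(prompt).strip().lower()
    # Default to "no": anything other than an explicit yes blocks the upload.
    return answer in {"y", "yes"}
```

A caller would check the return value before issuing the TTS request and abort otherwise. Defaulting to "no" means an empty answer never results in a transfer.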

Missing User Warnings

Medium

Confidence: 90%
Finding: The command reads arbitrary script content and sends it to the Voice.ai TTS service via renderSegments/client without any explicit notice or confirmation that the input will leave the local machine. This can lead to unintended disclosure of sensitive or proprietary text if a user assumes processing is local, especially in a CLI workflow handling drafts, internal documents, or confidential scripts.

Missing User Warnings

Medium

Confidence: 95%
Finding: The generated Bash and PowerShell helper scripts embed file paths directly into executable command text without robust shell-specific escaping. If an attacker can influence videoPath, audioPath, or outputPath, they may inject shell metacharacters or quoting breaks into the generated script, leading to arbitrary command execution when a user runs that script; this skill's context makes that realistic because it processes externally supplied media paths and then asks users to execute generated scripts.
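
The escaping this finding calls for can be sketched as follows. This is an illustration under stated assumptions, not the skill's generator: the function names are hypothetical and the ffmpeg argument list is only an example of a mux command. For Bash it relies on Python's `shlex.quote`; for PowerShell it uses a single-quoted literal with embedded quotes doubled, including the typographic quote characters that PowerShell, to the best of my knowledge, also treats as single quotes.

```python
import shlex

# Characters PowerShell is assumed to accept as single-quote delimiters:
# the ASCII apostrophe plus U+2018, U+2019, U+201A, U+201B.
_PS_SINGLE_QUOTES = "'\u2018\u2019\u201a\u201b"


def quote_bash(value: str) -> str:
    """Quote one argument for a POSIX shell such as Bash."""
    return shlex.quote(value)


def quote_powershell(value: str) -> str:
    """Wrap a value in a PowerShell single-quoted (literal) string."""
    escaped = "".join(ch * 2 if ch in _PS_SINGLE_QUOTES else ch for ch in value)
    return "'" + escaped + "'"


def build_bash_mux_command(video_path: str, audio_path: str, output_path: str) -> str:
    """Build an example ffmpeg command line with every argument quoted."""
    args = [
        "ffmpeg", "-i", video_path, "-i", audio_path,
        "-map", "0:v:0", "-map", "1:a:0", "-c:v", "copy", output_path,
    ]
    # Quote each argument separately instead of interpolating raw paths.
    return " ".join(quote_bash(arg) for arg in args)
```

With this approach a path such as `clip; rm -rf ~.mp4` stays a single literal argument rather than becoming a second command. Quoting does not stop a path that begins with `-` from being parsed as an ffmpeg option; prefixing relative paths with `./` is a common mitigation for that case.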

Missing User Warnings

Medium

Confidence: 93%
Finding: The manifest intentionally serializes each segment's full source_text into manifest.json, which can expose the complete input script to anyone with access to the output artifacts. In this skill's context, outputs are designed for sharing and publication workflows, so embedding raw script content in generated files increases the chance of accidental disclosure of proprietary, sensitive, or unreviewed text.
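
One possible mitigation is to make raw text opt-in and record only a digest by default, so a manifest can still be matched to its script without exposing it. The sketch below is an assumption-laden illustration: only the `source_text` key comes from the finding, while `manifest_entry` and the other field names are hypothetical.

```python
import hashlib
import json


def manifest_entry(index: int, source_text: str, audio_file: str,
                   include_text: bool = False) -> dict:
    """Build one manifest record; raw text is omitted unless requested."""
    entry = {
        "index": index,
        "audio_file": audio_file,
        # The digest lets reviewers verify which text produced the audio
        # without the text itself travelling with the output artifacts.
        "text_sha256": hashlib.sha256(source_text.encode("utf-8")).hexdigest(),
        "char_count": len(source_text),
    }
    if include_text:
        entry["source_text"] = source_text
    return entry


if __name__ == "__main__":
    print(json.dumps(manifest_entry(0, "Unreleased launch script.", "seg-000.mp3"), indent=2))
```

A digest of a short or guessable line can still be confirmed by someone who already suspects its content, so this reduces accidental disclosure rather than providing secrecy.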

Missing User Warnings

Medium

Confidence: 96%
Finding: The review HTML renders the full segment text inside the deliverable page, making the original script directly visible to anyone who opens or receives the review artifact. Although the text is HTML-escaped and this is not an XSS issue, it is still a confidentiality problem because review pages are likely to be shared with collaborators or clients and may expose more script content than intended.
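
If the review page only needs enough text to identify each segment, a truncated and escaped excerpt would limit what a shared artifact reveals. A minimal sketch, assuming a hypothetical helper `review_excerpt` and an arbitrary 80-character limit, neither of which exists in the skill:

```python
import html


def review_excerpt(text: str, max_chars: int = 80) -> str:
    """Return an HTML-escaped excerpt of at most max_chars characters plus an ellipsis."""
    collapsed = " ".join(text.split())  # normalize whitespace and newlines
    if len(collapsed) > max_chars:
        collapsed = collapsed[:max_chars].rstrip() + "…"
    # Escape after truncating so an entity is never cut in half.
    return html.escape(collapsed)
```

Escaping is kept because the excerpt still lands in HTML; the truncation is what addresses the confidentiality concern.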

Missing User Warnings

Medium

Confidence: 89%
Finding: The function sends segment.text to an external Voice.ai TTS service, which can expose sensitive or proprietary script contents to a third party if users are unaware of the off-box processing. In this skill's context, remote TTS is expected functionality, which reduces suspicion of malicious intent, but the absence of explicit disclosure or consent still creates a real privacy and data-handling risk.

VirusTotal

64/64 vendors flagged this skill as clean.
