Ffmpeg Video To Mp3

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real NemoVideo cloud skill, but it is much broader than a simple video-to-MP3 converter and needs careful review before use.

Review before installing. Use only if you are comfortable sending videos, URLs, prompts, and session data to NemoVideo, and treat generated claim links or tokens as sensitive credentials. Avoid sensitive or proprietary media unless you trust the service and understand its retention and access controls.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (8)

Description-Behavior Mismatch

High
Confidence: 96%
Finding
The skill is presented as a narrow video-to-MP3 converter, but its routing instructions send many user requests into broad NemoVideo editing, generation, upload, state, and export workflows. This capability mismatch can cause users and host agents to disclose files, prompts, and actions to a much more powerful remote service than the skill description implies, increasing the risk of unintended data exposure and misuse.

Description-Behavior Mismatch

High
Confidence: 97%
Finding
The API section exposes a full remote media-editing platform, including session management, SSE messaging, rendering, state queries, and generalized media handling, which is far broader than simple MP3 extraction. A host may grant this skill more trust than warranted based on its benign title, allowing covert use of remote editing features under the guise of conversion.

Description-Behavior Mismatch

Medium
Confidence: 88%
Finding
The skill advertises support for a short list of video containers, but the documented backend accepts images and standalone audio formats as well. This discrepancy widens the effective input surface and may lead agents to upload content types users did not expect to share with this skill.
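A host that wants to keep uploads within the advertised scope could filter files before anything is sent. The following is a minimal sketch, not part of the skill: the extension set is a hypothetical placeholder (the finding does not name the advertised containers), and `is_advertised_input` and `filter_uploads` are illustrative names.

```python
from pathlib import Path

# Hypothetical allowlist: the review does not list the advertised containers,
# so these extensions are placeholders to be replaced with the real list.
ADVERTISED_VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv", ".webm"}


def is_advertised_input(path: str) -> bool:
    """Return True only if the file extension is on the advertised allowlist."""
    return Path(path).suffix.lower() in ADVERTISED_VIDEO_EXTENSIONS


def filter_uploads(paths):
    """Split candidate uploads into (allowed, rejected) lists, preserving order."""
    allowed, rejected = [], []
    for p in paths:
        (allowed if is_advertised_input(p) else rejected).append(p)
    return allowed, rejected
```

An extension check is only a first gate. It does not inspect file contents, so a renamed image or audio file would still pass.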

Context-Inappropriate Capability

Medium
Confidence: 95%
Finding
The skill instructs the agent to provision anonymous tokens and maintain a persistent client identifier in local config, even though the described task is simple file conversion. This introduces unnecessary authentication handling and persistent tracking state, which expands privacy and abuse risk beyond the user’s likely expectations.

Context-Inappropriate Capability

Medium
Confidence: 90%
Finding
Allowing URL-based ingestion means the skill can fetch remote media instead of only processing user-supplied local files. That capability is unnecessary for a conversion skill and can be abused to pull third-party or sensitive content into the remote backend without the user clearly understanding what is being fetched.
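One way a host could narrow this risk is to gate every URL before it is forwarded to the backend. The sketch below is an illustration under stated assumptions, not the skill's behavior: `may_forward_url` and its `user_confirmed` flag are hypothetical, and the check does not resolve DNS names, so a public hostname pointing at an internal address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit


def may_forward_url(url: str, user_confirmed: bool) -> bool:
    """Allow a URL only if the user confirmed it and it looks publicly routable."""
    if not user_confirmed:
        return False
    try:
        parts = urlsplit(url)
    except ValueError:  # e.g. a malformed bracketed IPv6 host
        return False
    host = parts.hostname
    if parts.scheme != "https" or not host:
        return False
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: treat as a DNS name (not resolved in this sketch).
        return True
    # Reject loopback, private, link-local, and other non-global addresses.
    return ip.is_global
```

Requiring explicit confirmation addresses the consent gap named in the finding; the address checks only reduce the chance of pulling content from internal hosts.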

Intent-Code Divergence

High
Confidence: 98%
Finding
The workflow maps export requests to an export section, but the documented export operation produces MP4 video output rather than MP3 audio. This is a direct functional mismatch that can cause users to unknowingly trigger broader video rendering behavior and upload/edit pipelines inconsistent with the stated purpose.
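A caller expecting MP3 output could verify what the export step actually returned before handing it to the user. This is a minimal sketch: `export_matches_mp3` is a hypothetical helper, and the download URL and content type are assumed to come from whatever response the export operation produces, which the review does not specify.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def export_matches_mp3(download_url: str, content_type: str) -> bool:
    """Return True only if both the file suffix and the media type indicate MP3."""
    # Take the suffix from the URL path only, ignoring any query string.
    suffix = PurePosixPath(urlsplit(download_url).path).suffix.lower()
    # Drop parameters such as "; charset=..." from the Content-Type value.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # "audio/mpeg" is the registered media type for MP3 audio.
    return suffix == ".mp3" and media_type == "audio/mpeg"
```

If this check fails for a request that asked for audio extraction, that is consistent with the mismatch described in the finding, and the result should be treated as unexpected output rather than delivered silently.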

Intent-Code Divergence

High
Confidence: 98%
Finding
The API narrative claims the backend extracts audio to MP3, but the concrete export endpoint is defined to render MP4 output. This inconsistency indicates the skill may be masking a generic media-rendering service behind an audio-conversion label, which raises the chance of deceptive behavior and unintended processing of user media.

Missing User Warnings

Medium
Confidence: 94%
Finding
The setup instructions send tokens, messages, and uploaded media to a remote NemoVideo backend, but the skill description does not clearly warn users that their content is processed server-side. This undermines informed consent and may expose sensitive audio/video data to third-party infrastructure unexpectedly.

VirusTotal

64/64 vendors flagged this skill as clean.
