Local Video Subtitle Extraction and Translation Tool

Security checks across malware telemetry and agentic risk

Overview

This subtitle skill performs the advertised media transcription and translation workflow, but users should know that the selected audio and the resulting transcript text are sent to external AI services.

Install only if you are comfortable sending the selected media file to Groq and transcript chunks to the configured LLM provider. Avoid sensitive or regulated recordings unless those providers' privacy and retention terms are acceptable, keep API keys in environment variables, and run the Python dependencies in an isolated environment.
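The advice to keep API keys in environment variables can be sketched as follows. This is a minimal illustration, not the skill's own code: the helper names `load_api_key` and `mask` are hypothetical, and the variable name `GROQ_API_KEY` used in the usage note is an assumption about how the key might be exported.

```python
import os


def load_api_key(var_name: str, environ=os.environ) -> str:
    """Return the API key stored in the named environment variable.

    Raises RuntimeError when the variable is missing or blank, so the
    caller fails before any media is read or uploaded.
    """
    key = environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running")
    return key


def mask(key: str) -> str:
    """Shorten a key for log output so the full secret is never printed."""
    return key[:4] + "..." if len(key) > 8 else "***"
```

A caller would run something like `key = load_api_key("GROQ_API_KEY")` at startup and log only `mask(key)`. Reading the key at startup keeps it out of source files and command-line arguments, and failing early avoids extracting audio that can never be sent.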

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (8)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill clearly instructs the agent to read environment variables, invoke remote APIs, and write subtitle files, yet it declares no explicit permissions or trust boundaries. This creates a transparency and governance gap: an agent or user may unknowingly authorize network exfiltration of local audio and file writes without an upfront permission model.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 96%
Finding: The skill description understates important behavior by presenting the tool as local FFmpeg plus Python processing while the workflow sends audio and subtitle content to external services for transcription and translation. This mismatch is security-relevant because users may provide sensitive media under the false impression that processing is local-only, leading to unintended disclosure of private content.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The skill description claims a local FFmpeg/local-script workflow, but the code uploads audio and subtitle text to external Groq and LLM services. This mismatch can mislead users into providing sensitive media under the false assumption that processing stays local, creating a meaningful privacy and data-governance risk.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The skill depends on environment-provided API keys and remote services despite being presented as a local automation tool. That undisclosed dependency changes the trust boundary and can cause operators to expose credentials and data to third parties without informed consent.

Missing User Warnings

Medium

Confidence: 95%
Finding: The markdown directs the agent to transmit local audio and derived subtitle text to external APIs but provides no privacy warning, retention notice, or consent step. For audio/video content, this can expose sensitive speech, personal information, or confidential business material to third parties, making the omission materially dangerous in context.

Missing User Warnings

Medium

Confidence: 96%
Finding: The code sends the full local audio file to an external transcription service without any explicit user-facing notice or consent mechanism. In a subtitle-processing skill, audio may contain confidential conversations or regulated content, so undisclosed off-box transfer materially increases privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 96%
Finding: Extracted transcript/subtitle content is forwarded to a second external LLM service for translation without a clear disclosure to the user. This creates another undisclosed data-sharing path and may expose sensitive spoken content to an additional third party.
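The consent step that the Missing User Warnings findings say is absent could look roughly like the sketch below. It is an illustration under stated assumptions, not code from the skill: `build_notice` and `confirm_upload` are hypothetical helpers, and the destination string is whatever provider name the caller passes in.

```python
def build_notice(file_name: str, size_bytes: int, destination: str) -> str:
    """Compose a plain-language warning naming the file and the third party."""
    size_mb = size_bytes / (1024 * 1024)
    return (
        f"About to send {file_name} ({size_mb:.1f} MB) to {destination}.\n"
        "This content leaves the local machine. Type 'yes' to continue: "
    )


def confirm_upload(file_name: str, size_bytes: int, destination: str, ask=input) -> bool:
    """Return True only when the user explicitly answers 'yes'.

    `ask` defaults to the built-in input(); any other answer, including
    an empty one, counts as a refusal.
    """
    answer = ask(build_notice(file_name, size_bytes, destination))
    return answer.strip().lower() == "yes"
```

The same gate would be called once before the audio upload and again before transcript chunks go to the translation provider, so each data-sharing path gets its own disclosure. Treating every answer other than an explicit "yes" as a refusal keeps the default on the side of not transmitting.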

Ssd 1

Medium

Confidence: 91%
Finding: Untrusted transcript text derived from audio is inserted directly into the translator model as user content, so spoken or embedded prompt-injection content can instruct the model to ignore formatting rules, alter output, or leak surrounding context. In this skill, that can corrupt subtitle integrity and potentially trigger unexpected downstream behavior if consumers trust the translated SRT format.
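One way to limit the subtitle-integrity part of this risk is to check the translated SRT against the source before writing it out. The sketch below is a hedged example rather than the skill's implementation: `parse_srt` and `translation_preserves_structure` are hypothetical names, and the check assumes standard SRT cues (an index line, a timing line, then text). It catches output whose structure was altered; it does not detect injected content inside a cue's text.

```python
import re

# Standard SRT timing line, e.g. "00:00:01,000 --> 00:00:02,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def parse_srt(text: str) -> list[tuple[str, str, str]]:
    """Split SRT text into (index, timing, body) cues; raise ValueError if malformed."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if (
            len(lines) < 3
            or not lines[0].strip().isdigit()
            or not TIMING.match(lines[1].strip())
        ):
            raise ValueError(f"malformed cue: {block[:40]!r}")
        cues.append((lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def translation_preserves_structure(source: str, translated: str) -> bool:
    """True when the translated SRT keeps every cue index and timing of the source."""
    try:
        src, out = parse_srt(source), parse_srt(translated)
    except ValueError:
        return False
    return [(i, t) for i, t, _ in src] == [(i, t) for i, t, _ in out]
```

If the check fails, the caller can retry the chunk or keep the untranslated cue instead of trusting the model's output. Downstream consumers then receive a file whose indices and timings match the original transcript.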

VirusTotal

67/67 vendors rated this skill as clean.
