Caption Generator Hindi

Security checks across malware telemetry and agentic risk

Overview

This skill is a cloud-based Hindi video captioning workflow. Its behavior is broadly consistent with its description, but users should know that their media, links, and prompts are sent to NemoVideo for processing.

Install only if you are comfortable sending selected videos, media URLs, prompts, and render state to NemoVideo.ai for cloud processing. Avoid sensitive or confidential footage unless you trust that provider's privacy and retention practices, and give specific caption-only instructions if you do not want broader video edits.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (6)

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The skill is presented as a narrowly scoped Hindi caption generator, but the instructions broaden it into general-purpose video editing, routing requests for overlays, audio, aspect ratio changes, and other edits. This scope expansion increases the chance the skill will capture and process unrelated user requests and media in ways the user did not reasonably expect from the manifest description.

Description-Behavior Mismatch

Low

Confidence: 81%
Finding: The upload/export section allows handling of many media types beyond the stated video-input scope, creating a mismatch between what the skill advertises and what it can ingest or emit. Even if not directly exploitable as code execution, this broadens data exposure and can lead users to submit files they would not have shared with a caption-only video tool.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: Allowing arbitrary URL-based uploads lets the skill cause the backend to fetch remote content that is outside the expected user-upload workflow. This can enable misuse such as fetching third-party or sensitive URLs, create compliance/privacy issues, and expand the trust boundary without clear need for a captioning skill.
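One way to narrow this exposure is to validate a URL before it is ever handed to the backend. The following is a minimal sketch, not the skill's actual behavior: the function name `is_allowed_upload_url` and the allowlisted host `media.example.com` are hypothetical and exist only for illustration. It accepts only https URLs whose host is explicitly allowlisted and rejects raw IP literals, which covers loopback and link-local metadata addresses.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; a real deployment would pick its own hosts.
ALLOWED_HOSTS = frozenset({"media.example.com"})


def is_allowed_upload_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        # Malformed URL (for example, a broken IPv6 literal).
        return False
    if parts.scheme != "https" or not host:
        return False
    try:
        # Any host that parses as an IP address is rejected outright.
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass
    return host in allowed_hosts
```

A check like this does not resolve DNS, so it should be treated as a first filter rather than a complete defense against server-side request forgery.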

Vague Triggers

Medium

Confidence: 90%
Finding: The catch-all routing rule sends 'everything else' to the SSE backend, making the skill eligible to act on almost any prompt rather than only caption-related requests. Overbroad invocation increases the risk of unintended tool activation, data transmission to the remote service, and use outside the least-privilege scope implied by the skill metadata.
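For contrast, a default-deny router forwards only requests that look caption-related and declines the rest. The sketch below is an assumption-laden illustration: `route_request`, the term list, and the return labels are all hypothetical, and simple keyword matching is far cruder than what a production skill would need.

```python
# Hypothetical terms that mark a request as caption-related.
CAPTION_TERMS = ("caption", "subtitle", "transcribe", "transcript")


def route_request(prompt: str) -> str:
    """Forward caption-related prompts; decline everything else by default."""
    text = prompt.lower()
    if any(term in text for term in CAPTION_TERMS):
        return "caption_backend"
    # Default-deny: unmatched prompts are not sent to the remote service.
    return "decline"
```

The point of the design is the final branch: anything that is not positively matched stays local instead of being sent to the SSE backend.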

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill instructs the agent to automatically connect to an external backend and even obtain an anonymous token before handling any request, without first informing the user that content and metadata will be sent off-platform. This undermines informed consent and can expose user identifiers, prompts, and subsequent media-processing activity to a third-party service unexpectedly.
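A consent gate placed ahead of the connection step would address this. The sketch below assumes a hypothetical `Session` object and two caller-supplied callables, `ask_user` and `fetch_token`; none of these names come from the skill. No token is requested until the user has seen the disclosure and agreed.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical disclosure text shown before any off-platform request.
DISCLOSURE = (
    "Your prompts, media, and media URLs will be sent to NemoVideo "
    "for cloud processing. Continue?"
)


@dataclass
class Session:
    consented: bool = False
    token: Optional[str] = None


def connect(
    session: Session,
    ask_user: Callable[[str], bool],
    fetch_token: Callable[[], str],
) -> bool:
    """Obtain a backend token only after the user has explicitly agreed."""
    if not session.consented:
        session.consented = bool(ask_user(DISCLOSURE))
    if not session.consented:
        return False  # Nothing has left the platform.
    if session.token is None:
        session.token = fetch_token()
    return True
```

Because consent is stored on the session, the user is asked once rather than before every request.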

Missing User Warnings

Medium

Confidence: 94%
Finding: The upload and export workflow sends user media to remote processing infrastructure and notes that jobs may persist if the tab closes, yet it does not require an explicit warning or consent step. For a media-processing skill, this is significant because videos often contain sensitive personal, biometric, or copyrighted content, and users may not realize cloud rendering and job persistence are involved.

VirusTotal

66/66 vendors flagged this skill as clean.
