Tomoviee Video Background Music

Security checks across malware telemetry and agentic risk

Overview

This skill is a straightforward Tomoviee video-soundtrack API helper, with normal external media-processing and credential-handling risks but no evidence of hidden, destructive, or deceptive behavior.

Install only if you are comfortable sending video URLs and prompts to Tomoviee/Wondershare for processing. Use public or non-sensitive media where possible, avoid exposing generated Basic auth tokens in logs or shared shells, and treat the broad reference docs as background rather than permission for unrelated APIs.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (9)

Lp3

Medium
Category
MCP Least Privilege
Confidence
90% confidence
Finding
The skill documentation describes use of an external API client, video URLs, and polling workflow, which clearly implies network access, yet no permissions are declared. This creates a transparency and governance gap: users or hosting platforms may not realize the skill sends data to third-party services, including user-provided video URLs and prompts.

Description-Behavior Mismatch

Medium
Confidence
95% confidence
Finding
The reference file documents multiple capabilities—text-to-music, sound effects, and text-to-speech—despite the skill being declared as a video-scoring skill. This scope expansion increases the chance the agent may invoke broader audio-generation features than users or reviewers expect, weakening least-privilege boundaries and enabling misuse beyond the advertised purpose.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
Documenting text-to-speech in a video-scoring skill introduces an unrelated content-generation pathway that could be used to synthesize spoken content without the user's expectation or explicit consent. In an agentic setting, mismatched capability docs can lead to unauthorized feature use, data exfiltration to external TTS services, or policy bypass through undeclared modalities.

Context-Inappropriate Capability

Medium
Confidence
92% confidence
Finding
Sound-effect generation is unrelated to the declared video-soundtrack purpose and broadens the operational surface of the skill. This can cause the agent to perform undeclared actions or process user inputs in ways not covered by the skill's stated scope, undermining trust and permission boundaries.

Context-Inappropriate Capability

Low
Confidence
86% confidence
Finding
General text-to-music generation exceeds the narrow declared purpose of generating soundtracks tailored to video content. While less severe than TTS, it still signals capability drift and may allow the skill to generate standalone music outside the user-approved workflow.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The documentation instructs use of callback URLs and publicly accessible video URLs but provides no warning that user media, metadata, and task results are sent to an external service and potentially to third-party callback endpoints. This creates privacy and data-handling risk, especially if users provide sensitive or non-public video content.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The video soundtrack feature explicitly analyzes video scenes, pacing, and mood, which means user-provided media content is processed by an external system without any accompanying privacy or data-use notice. For a media-analysis workflow, omission of disclosure is risky because videos may contain personal, confidential, or copyrighted content.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The script prints a credential-derived Basic auth token directly to stdout, which can expose secrets through terminal scrollback, shell session recording, CI logs, or copied command output. In this context, the token is just base64-encoded app_key:app_secret, and base64 is an encoding rather than encryption, so disclosure effectively reveals the underlying credentials to anyone who can access the output.
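
To illustrate why a printed token is equivalent to a printed secret, here is a minimal Python sketch. It assumes, as the finding states, that the token is the base64 encoding of app_key:app_secret. The body of generate_access_token is a reconstruction under that assumption, recover_credentials is a hypothetical helper, and the demo values are placeholders rather than real credentials.

```python
import base64


def generate_access_token(app_key: str, app_secret: str) -> str:
    """Assumed reconstruction: base64 of 'app_key:app_secret'."""
    raw = f"{app_key}:{app_secret}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def recover_credentials(token: str) -> tuple[str, str]:
    """Hypothetical helper: reverse the encoding with no key or secret needed."""
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first colon only, since the secret may itself contain colons.
    app_key, _, app_secret = decoded.partition(":")
    return app_key, app_secret


if __name__ == "__main__":
    token = generate_access_token("demo-key", "demo-secret")
    # Anyone who can read the printed token can run this step.
    print(recover_credentials(token))
```

Because decoding needs nothing beyond the token itself, any log line or terminal capture that contains the token should be handled as if it contained the app_secret.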

Credential Access

High
Category
Privilege Escalation
Content
app_secret = sys.argv[2]

token = generate_access_token(app_key, app_secret)
print(f"Access Token: {token}")
print(f"\nUse in Authorization header as: Basic {token}")
Confidence
90% confidence
Finding
Access Token
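
One possible mitigation for the flagged print statements is sketched here in Python: read the credentials from the environment instead of command-line arguments, build the Authorization header in-process, and log only a redacted form. This is an illustration under assumptions, not the skill's actual code. The environment variable names TOMOVIEE_APP_KEY and TOMOVIEE_APP_SECRET and the helpers load_credentials, build_auth_header, and redact are all hypothetical.

```python
import base64
import os
from typing import Mapping


def load_credentials(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Read credentials from the environment (variable names are assumptions)."""
    return env["TOMOVIEE_APP_KEY"], env["TOMOVIEE_APP_SECRET"]


def build_auth_header(app_key: str, app_secret: str) -> dict[str, str]:
    """Build the Basic auth header in memory without printing the token."""
    raw = f"{app_key}:{app_secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def redact(value: str, keep: int = 4) -> str:
    """Keep the first `keep` characters and mask the rest for safe logging."""
    if len(value) <= keep:
        return "*" * len(value)
    return value[:keep] + "*" * (len(value) - keep)


if __name__ == "__main__":
    # Placeholder values stand in for a real environment in this demo.
    demo_env = {"TOMOVIEE_APP_KEY": "demo-key", "TOMOVIEE_APP_SECRET": "demo-secret"}
    key, secret = load_credentials(demo_env)
    headers = build_auth_header(key, secret)
    # Only the "Basic " prefix is shown; the token itself is masked.
    print("Authorization:", redact(headers["Authorization"], keep=6))
```

Passing the secret through the environment also keeps it out of shell history and process listings, which command-line arguments such as sys.argv[2] do not.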

VirusTotal

64/64 vendors reported this skill as clean.
