Music Craft

Security checks across malware telemetry and agentic risk

Overview

Music Craft is an openly disclosed music-generation workflow. It includes privacy-sensitive enrichment and setup steps, but there is no evidence of hidden, destructive, or deceptive behavior.

Install only if you are comfortable with a music workflow that may use web search/fetch, image analysis, memory lookup, local model installs, and optional cloud APIs. For private lyrics or sensitive prompts, use a local backend and ask the agent not to use web, browser, memory, or cloud providers unless you explicitly approve them.
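One way to picture the "unless you explicitly approve them" recommendation is a default-deny gate in front of the sensitive capabilities. The sketch below is illustrative only: the `CapabilityGate` class and the capability names are hypothetical and are not part of Music Craft or any agent runtime.

```python
from dataclasses import dataclass, field

# Hypothetical capability names, mirroring the list in the recommendation above.
SENSITIVE_CAPABILITIES = frozenset({"web", "browser", "memory", "cloud"})


@dataclass
class CapabilityGate:
    """Default-deny gate: a sensitive capability is usable only after explicit approval."""

    approved: set = field(default_factory=set)

    def approve(self, capability: str) -> None:
        # Record an explicit user approval for one sensitive capability.
        if capability not in SENSITIVE_CAPABILITIES:
            raise ValueError(f"unknown capability: {capability!r}")
        self.approved.add(capability)

    def is_allowed(self, capability: str) -> bool:
        # Anything outside the sensitive set (e.g. a local backend) passes through.
        if capability not in SENSITIVE_CAPABILITIES:
            return True
        return capability in self.approved
```

Under this assumption, a fresh gate permits a local backend but refuses web, browser, memory, and cloud access until `approve()` has been called for each one individually.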

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (10)

Description-Behavior Mismatch

Medium
Confidence
94% confidence
Finding
The skill's declared purpose is music generation, but it instructs the agent to perform broad system inspection, package-manager detection, local installs, web browsing, and memory access. That scope expansion violates least privilege and increases the chance of unnecessary access to user data, system state, and network resources beyond what is needed to generate music.

Description-Behavior Mismatch

High
Confidence
89% confidence
Finding
The skill says cover/style-transfer workflows are out of scope, but later includes ACE-Step cover, repaint, extraction, and audio-understanding instructions. This inconsistency can cause the agent to invoke higher-risk audio-processing capabilities the user was told would not be used, defeating user expectations and safety boundaries.

Intent-Code Divergence

High
Confidence
91% confidence
Finding
The document explicitly says audio analysis and cover workflows are not covered, then later documents those exact capabilities through ACE-Step. Contradictory instructions are dangerous because they undermine policy enforcement and may route the agent into handling user audio files and analysis tasks without proper consent or the expected safeguards.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
The skill directs the agent to inspect durable user memory for prior preferences even though that is not necessary for basic music generation. Accessing persistent memory without a clear need or explicit consent creates a privacy risk and can expose unrelated personal data to the workflow.

Description-Behavior Mismatch

Medium
Confidence
89% confidence
Finding
The guidance tells the skill to enrich requests from images or URLs, which expands behavior beyond a pure music-generation workflow into external content interpretation. That increases attack surface because URLs and linked content can introduce untrusted data handling, privacy leakage, or unintended downstream tool use that the skill description does not clearly scope or constrain.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The markdown instructs reading durable memory without an upfront privacy warning or explicit consent checkpoint. Even if used for convenience, this weakens transparency and can surprise users by pulling prior personal context into a new task.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The skill encourages web fetching and browser use on user-provided links without a prominent upfront disclosure that external network access will occur. That can leak metadata, retrieve untrusted content, and expand the attack surface through remote pages or browser automation beyond user expectations.

Missing User Warnings

Low
Confidence
79% confidence
Finding
The workflow instructs the agent to fetch third-party lyrics URLs and process their content without requiring a user-facing notice that external tools will access the URL and may retrieve the page contents. In an agent environment, this can unexpectedly expose user-provided links or associated content to external fetch or browser tooling, creating a privacy and data-handling risk even when the content is not highly sensitive.

Natural-Language Policy Violations

Medium
Confidence
89% confidence
Finding
The worked example hard-codes 'bright female vocal in English' for an ambiguous request ('Make me a song') without indicating that language or vocal characteristics should come from user preference. This can cause the agent to inject unrequested demographic and language defaults, leading to output misalignment and potentially inappropriate assumptions in downstream generation.
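A minimal sketch of the alternative this finding points toward: vocal and language attributes enter the prompt only when the user supplied them, and anything missing is returned as an open question instead of being filled with a default. The function `build_style_prompt` and its parameters are hypothetical and do not come from the skill itself.

```python
from typing import List, Optional, Tuple


def build_style_prompt(genre: str,
                       vocal: Optional[str] = None,
                       language: Optional[str] = None) -> Tuple[str, List[str]]:
    """Return (prompt, open_questions); unspecified vocal attributes are never guessed."""
    prompt = genre
    open_questions: List[str] = []

    if vocal:
        prompt += f", {vocal} vocal"
    else:
        # No default such as "bright female" is injected; ask the user instead.
        open_questions.append("vocal style")

    if language:
        prompt += f", {language} lyrics"
    else:
        open_questions.append("lyric language")

    return prompt, open_questions
```

With this shape, an ambiguous request yields a prompt containing only what the user actually said, plus a list of attributes the agent can ask about or leave to the backend.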

Natural-Language Policy Violations

Medium
Confidence
83% confidence
Finding
Several sub-genre examples prescribe specific vocal languages or locales such as Spanish, Portuguese, Jamaican, or mixed-language vocals as defaults. In a music-generation skill, these defaults can steer outputs toward unrequested cultural or linguistic assumptions. This is not a code-execution risk, but it is still a prompt-safety and user-intent integrity issue.

VirusTotal

64/64 vendors flagged this skill as clean.
