Strudel Music

Security checks across malware telemetry and agentic risk

Overview

The skill is mostly a disclosed music-rendering tool, but its Discord playback path directly loads local bot credentials even though the skill states that no Discord credentials are required.

Install only if you are comfortable with trusted JavaScript compositions running as code and with optional Discord playback using local bot credentials. Review or disable vc-play.mjs before using voice-channel features, use a dedicated low-privilege Discord bot token, avoid untrusted composition files, and restrict sample downloads by setting STRUDEL_ALLOWED_HOSTS and lowering STRUDEL_MAX_DOWNLOAD_MB.
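
To make the download restriction concrete, here is a minimal sketch of the kind of check those two variables imply. It is not the skill's own code: the function names, the comma-separated host format, the HTTPS-only rule, and the 10 MB fallback are all assumptions made for illustration.

```javascript
// Hypothetical helpers. Only the two variable names come from the report;
// the comma-separated format and the 10 MB fallback are assumptions.
function parseAllowedHosts(raw) {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((host) => host.trim().toLowerCase())
      .filter(Boolean)
  );
}

function isDownloadAllowed(url, contentLengthBytes, env = process.env) {
  const allowed = parseAllowedHosts(env.STRUDEL_ALLOWED_HOSTS);
  const maxMb = Number(env.STRUDEL_MAX_DOWNLOAD_MB ?? "10");

  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not a valid absolute URL
  }

  if (parsed.protocol !== "https:") return false; // assumption: HTTPS only
  if (!allowed.has(parsed.hostname)) return false; // empty list denies all
  if (!Number.isFinite(maxMb) || maxMb <= 0) return false;
  return contentLengthBytes <= maxMb * 1024 * 1024;
}
```

With an empty or unset host list this sketch denies every download, which is the conservative default for a skill described as local.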

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (19)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 94%
Finding: The skill clearly exposes shell-capable operations such as `npm install`, bash scripts, Node execution, Python/Demucs commands, ffmpeg, and background exec, yet no explicit permissions model is declared. That mismatch can cause the hosting platform or user to underestimate the skill's ability to execute commands, access files, and reach the network, increasing the chance of unsafe deployment or misuse.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 92%
Finding: The declared description focuses on audio deconstruction/composition, but the document also instructs the agent to download arbitrary sample packs from user-supplied URLs, stream to Discord voice channels, and perform additional operational behaviors. This expands the attack surface beyond what a user would reasonably expect, especially because arbitrary downloads and execution-adjacent workflows can introduce malicious content, data exfiltration paths, or abuse of authenticated platform integrations.

Intent-Code Divergence

Medium

Confidence: 81%
Finding: The security section claims the skill only hands audio to OpenClaw's voice subsystem, but other instructions show direct execution of `node scripts/vc-play.mjs`, implying the repository contains its own voice-channel streaming logic. This inconsistency can mislead reviewers about where authentication, network access, and message/voice actions actually occur, making it harder to assess trust boundaries and increasing the risk of unauthorized or poorly understood Discord interactions.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The lockfile shows Discord bot and voice-networking dependencies such as discord.js and @discordjs/voice even though the skill is described as audio deconstruction/composition. That mismatch expands the skill's effective capability to remote communication and voice-channel interaction, which increases attack surface and could enable unexpected data exfiltration, command-and-control, or unauthorized network access if the skill is installed or executed in a broader agent environment.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The package declares Discord bot/voice, crypto, and dotenv-related capabilities that materially expand the attack surface beyond the stated core purpose of offline audio deconstruction/composition. While these may support an optional 'concert' or remote playback feature, their presence enables networked interaction, secret handling, and voice-channel connectivity that could be abused if the skill is installed or run in a broader agent environment.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The documentation claims the pipeline is entirely local and offline, but elsewhere states that samples are automatically downloaded from GitHub. This is a real security-relevant discrepancy because operators may make deployment and trust decisions based on the offline claim, while the implementation still performs network access and imports third-party content.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The document presents the system as an offline rendering pipeline, but it also includes Discord voice-channel streaming, which is a networked capability using an authenticated gateway session. This mismatch can mislead reviewers into underestimating data egress, account-scope, and abuse potential of the skill.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: This script loads Discord bot credentials from local environment files and uses them to connect to a Discord voice channel, which is unrelated to the declared Strudel audio deconstruction/composition functionality. In a skill package, adding out-of-scope networked bot control expands the trust boundary and can enable unauthorized use of existing bot credentials and external communications.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The file's stated purpose is to play audio into a Discord voice channel, which materially differs from the skill description of offline audio deconstruction, sample extraction, composition, and rendering. This mismatch is dangerous because hidden or undisclosed capabilities can cause users to grant access to a skill under false assumptions, including network access and use of sensitive Discord credentials.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The composition hardcodes a remote asset host (`https://ronan.dandelion.cult:8080/...`) for required audio slices, creating an external runtime dependency outside the local skill boundary. This can enable tracking, content substitution, or delivery of unexpected/malicious media if the remote host is compromised or changed, and it also weakens reproducibility and offline safety expectations for a local audio composition skill.
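
A reviewer can surface hard-coded hosts like this before running a composition. The sketch below is a hypothetical review helper, not part of the skill: the function name and the simple URL pattern are assumptions, and it only finds literal URLs in the source text, not ones assembled at runtime.

```javascript
// Hypothetical review helper: lists the hosts of literal http(s) URLs that
// appear in a composition's source text. Purely textual; it runs nothing.
function findRemoteHosts(source) {
  const hosts = new Set();
  const urlPattern = /https?:\/\/[^\s'"`)<>]+/g;
  for (const match of source.matchAll(urlPattern)) {
    try {
      hosts.add(new URL(match[0]).host); // host includes the port, if any
    } catch {
      // Ignore strings that look like URLs but do not parse.
    }
  }
  return [...hosts].sort();
}
```

Any host it reports that is not on the operator's allow-list is a reason to inspect the composition before rendering it.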

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The file explicitly frames itself as a faithful clone of a named artist's work and emphasizes reproducing the original's 'statistical DNA' rather than performing the skill's stated deconstruction/composition workflow. In this context, the risk is not code execution but intellectual-property and policy exposure: the skill appears designed to imitate a specific artist's style and potentially misrepresent derivative output as original or tool-generated composition.

Intent-Code Divergence

Low

Confidence: 83%
Finding: The comments describe the file as the 'closest we can get to playing back the original' and as a faithful clone, while the code only encodes an approximate pattern. This overstatement can mislead users about provenance, fidelity, and permissibility, increasing legal and trust risk by presenting imitation as near-reproduction without substantiation.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The renderer reads an external pattern file and executes it with `new Function(...)`, which is arbitrary JavaScript execution in the Node.js process. Because this runtime also exposes globals and has access to filesystem-capable Node primitives in scope, a crafted pattern can run OS commands, read or modify files, or exfiltrate data instead of merely describing music.
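
The behaviour this finding describes can be reproduced in a few lines. The snippet below is an illustrative sketch rather than the renderer's code: `patternSource` is a made-up stand-in for a pattern file, and it only reads harmless process details to show that the evaluated text has the same reach as the host process.

```javascript
// Stand-in for the contents of a pattern file (hypothetical). A real
// composition would normally return a pattern, but nothing forces it to.
const patternSource = `
  return {
    pid: process.pid,
    cwd: process.cwd(),
    hasFetch: typeof fetch === "function",
  };
`;

// The evaluation primitive named in the finding. The body runs as ordinary
// JavaScript in the current Node.js process, with access to its globals.
const evaluate = new Function(patternSource);
const leaked = evaluate();

console.log(leaked);
```

Nothing in the string is music notation, yet it evaluates without complaint, which is why a pattern file has to be treated as a program rather than as data.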

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The renderer evaluates attacker-controlled composition files with new Function() in the current Node.js process, which gives the input arbitrary JavaScript execution rather than a constrained music DSL. The code itself admits this is not a sandbox and that evaluated patterns can still reach filesystem and network capabilities, so an untrusted composition can execute commands, read/write files, or exfiltrate data with the privileges of the renderer process.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The hardening comments and partial scrubbing can create a false sense of safety even though the code explicitly states patterns can still access fs and network. This is dangerous because maintainers or users may treat the renderer as reasonably sandboxed when it is still effectively arbitrary code execution with broad ambient privileges.
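
To illustrate why partial scrubbing gives little protection, the sketch below assumes one common form of it: shadowing dangerous names by passing `undefined` for them. The report does not say how the renderer actually scrubs its scope, so `evaluateScrubbed` is a hypothetical stand-in.

```javascript
// Hypothetical "hardened" evaluator: hides a few names from the evaluated
// code by declaring them as parameters and passing undefined for each.
function evaluateScrubbed(source) {
  const run = new Function("process", "require", "fetch", source);
  return run(undefined, undefined, undefined);
}

// The direct reference is shadowed, so this yields "undefined".
const direct = evaluateScrubbed("return typeof process;");

// The global object is still reachable, and so is everything on it.
const viaGlobal = evaluateScrubbed("return typeof globalThis.process;");

// The Function constructor is reachable too, and functions it creates are
// global-scoped, so they do not see the shadowing parameters.
const viaConstructor = evaluateScrubbed(
  "return (function () {}).constructor('return typeof process')();"
);

console.log({ direct, viaGlobal, viaConstructor });
```

Shadowing individual names narrows accidental access but not deliberate access, which matches the finding's point that the renderer should not be treated as a sandbox.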

Missing User Warnings

Medium

Confidence: 94%
Finding: The onboarding guide explicitly encourages users to decompose arbitrary audio into stems, extract samples, and reuse them in compositions, but it provides no caution about copyright, consent, licensing, or privacy-sensitive recordings. In a skill designed for automated audio processing, this omission can facilitate unauthorized derivative use of copyrighted works or processing of private voice/audio content, increasing legal and privacy risk.

Missing User Warnings

Low

Confidence: 90%
Finding: Automatically downloading samples from GitHub without a clear user-facing warning creates an undisclosed outbound network action and introduces supply-chain risk from remote content. In an agent skill advertised as local/offline, silent fetching is especially problematic because it violates user expectations and may bypass network restrictions or policy review.

Missing User Warnings

High

Confidence: 98%
Finding: The code executes an untrusted input file directly but the CLI usage and behavior do not clearly warn users that opening a pattern file results in code execution. In this skill context, users may reasonably expect a music pattern renderer, not a general JavaScript runner, which increases the chance they will process third-party compositions and trigger malicious code.
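
One way a CLI could close this gap is to refuse to evaluate a pattern file until the user explicitly acknowledges that it runs as code. The sketch below is a suggestion only: the `--trust-pattern` flag and the `checkExecutionConsent` function are hypothetical and do not exist in the skill.

```javascript
// Hypothetical consent gate. The flag name --trust-pattern is an assumption
// made for this example; the skill's CLI does not define it.
function checkExecutionConsent(argv, patternPath) {
  const trusted = argv.includes("--trust-pattern");
  const warning =
    `WARNING: ${patternPath} will be executed as JavaScript ` +
    "with this process's file and network access.";
  return trusted
    ? { proceed: true, message: warning }
    : {
        proceed: false,
        message: `${warning} Re-run with --trust-pattern to continue.`,
      };
}
```

A caller would print `message` to stderr in both cases and only reach the evaluation step when `proceed` is true, so the warning is shown even to users who have opted in.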

Missing User Warnings

Medium

Confidence: 90%
Finding: The script executes untrusted composition code in-process but only logs normal rendering progress, without a prominent warning that opening a composition file can run arbitrary JavaScript. That weak user signaling increases the likelihood that operators will feed it untrusted files, turning a dangerous design into a practical exploitation path.

VirusTotal

64/64 vendors flagged this skill as clean.
