Digital Singer

Security checks across malware telemetry and agentic risk

Overview

The skill is mostly an avatar karaoke app, but it handles voice audio and API keys in ways that need careful review before use.

Review before installing. Use only on a trusted local machine, replace and rotate any bundled provider keys, avoid valuable API keys until credential storage and local API protections are fixed, and assume microphone audio plus chat text may be sent to NuwaAI and DashScope/Qwen.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (12)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 87%
Finding: The skill documentation describes capabilities that require network access and shell execution, yet no explicit permissions are declared. That creates a transparency and consent gap: a user or platform may approve the skill without understanding it invokes external APIs and local tooling like FFmpeg/subprocesses. In this context, hidden network and shell capability is more dangerous because the skill also asks for API keys and handles audio/microphone-related workflows.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: A strong description-behavior mismatch is a real security issue because it prevents informed consent and masks the actual trust boundary of the skill. The finding indicates undeclared external LLM calls, a hardcoded API key, different audio handling than advertised, and missing implementation of the claimed NuwaAI control path; together these suggest the user may expose credentials, microphone data, or content to services and code paths they were not told about. The mismatch makes the skill substantially more dangerous because it handles sensitive inputs while presenting itself as a narrowly scoped avatar-singing tool.

Description-Behavior Mismatch

Medium

Confidence: 87%
Finding: The skill metadata claims NuwaAI humanctrl integration, but the implementation actually sends chats to Alibaba DashScope/Qwen and performs only local audio playback. This mismatch can mislead users and platform reviewers about what external services receive data and what capabilities are actually being exercised, undermining informed consent and trust.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The skill exposes unauthenticated endpoints to read, write, and use NuwaAI credentials, which is significantly broader than a singing experience and creates a credential-management surface reachable by any origin due to permissive CORS. An attacker who can reach the service can overwrite configuration, trigger token issuance, or abuse the stored API key, leading to account misuse and unauthorized access to the connected avatar service.
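One way to narrow this surface is to require both an allowlisted origin and a shared local token before any credential endpoint runs. The following is a minimal sketch under stated assumptions, not the skill's actual code: the function name, the `ALLOWED_ORIGINS` values, and the port are hypothetical.

```python
import hmac

# Hypothetical allowlist: only the locally served UI may call the config API.
# The host and port are assumptions for illustration.
ALLOWED_ORIGINS = {"http://127.0.0.1:8000", "http://localhost:8000"}


def is_request_allowed(origin, presented_token, expected_token):
    """Return True only for an allowlisted origin carrying the right token."""
    if origin not in ALLOWED_ORIGINS:
        return False
    if not presented_token or not expected_token:
        return False
    # Constant-time comparison avoids leaking token contents via timing.
    return hmac.compare_digest(presented_token.encode(), expected_token.encode())
```

The origin check only constrains browsers, since non-browser clients can send any Origin header; the token is what actually authenticates the caller. Binding the server to the loopback interface would further limit who can reach it.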

Context-Inappropriate Capability

Medium

Confidence: 79%
Finding: The /api/song/pcm endpoint allows clients to trigger server-side ffmpeg execution on arbitrary local filenames from JSON input. Although this appears intended for audio processing, it expands the skill into a general local file processing and subprocess-execution capability, which can be abused for denial of service or to process unintended files if sensitive media exists on disk.
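A common mitigation is to resolve the requested name against a single songs directory and reject anything that escapes it before ffmpeg is invoked. The sketch below assumes a dedicated songs directory, a small set of audio extensions, and a 16 kHz mono PCM output; none of these details are confirmed by the scan, and the function names are hypothetical.

```python
from pathlib import Path

# Assumed set of audio formats the endpoint is meant to serve.
ALLOWED_SUFFIXES = {".mp3", ".wav", ".flac"}


def resolve_song_path(songs_dir, requested_name):
    """Map a client-supplied name to a file inside songs_dir, or raise ValueError."""
    base = Path(songs_dir).resolve()
    candidate = (base / requested_name).resolve()
    try:
        # relative_to raises ValueError when candidate is outside base,
        # which covers "../" traversal, absolute paths, and symlink escapes.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError("path escapes the songs directory")
    if candidate.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError("unsupported file type")
    if not candidate.is_file():
        raise ValueError("no such song")
    return candidate


def build_ffmpeg_command(source):
    """Build an argument list (no shell) that decodes to raw 16-bit mono PCM."""
    return ["ffmpeg", "-nostdin", "-i", str(source),
            "-f", "s16le", "-ac", "1", "-ar", "16000", "pipe:1"]
```

Passing the resulting list to `subprocess.run` without `shell=True`, and with a timeout, keeps the filename from being interpreted by a shell and bounds the cost of a single request.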

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill asks users for API keys and relies on microphone/ASR, but it does not clearly warn users that credentials and audio may be transmitted to external services. This is a genuine privacy and security issue because users may unknowingly expose secrets, voice data, and account-linked identifiers to third parties. The context increases sensitivity since the skill collects multiple high-value inputs at once: API credentials, avatar identifiers, and live audio.

Missing User Warnings

High

Confidence: 99%
Finding: A live API key is hardcoded directly in source code, making credential theft likely through source exposure, logs, repository leaks, or artifact sharing. An attacker who obtains the key can abuse the external API, incur costs, access service-linked data, or cause account suspension.
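The usual fix is to remove the literal from source, rotate the exposed key, and read the replacement from the environment at startup. A minimal sketch follows; the variable name `PROVIDER_API_KEY` and the function name are hypothetical, since the scan does not say how the skill names its settings.

```python
import os


def load_api_key(env_var="PROVIDER_API_KEY", environ=None):
    """Read the provider key from the environment and fail loudly if absent."""
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        # No fallback to a bundled default: a missing key should stop startup.
        raise RuntimeError(f"{env_var} is not set")
    return key
```

Failing on a missing key, rather than falling back to a bundled default, is the design choice that prevents a shared demo credential from being used silently.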

Missing User Warnings

Medium

Confidence: 95%
Finding: The code forwards the full conversation history to an external API provider without any visible notice, consent flow, or data-minimization controls. Users may reveal personal or sensitive content during karaoke/chat interactions, and undisclosed third-party transmission creates privacy and compliance risk.
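One possible data-minimization control is to cap how much history leaves the machine on each request. The sketch below assumes messages are dictionaries with a `role` key, as in common chat-completion formats; that shape, the function name, and the default limit are assumptions rather than details from the scan.

```python
def minimize_history(messages, max_turns=6):
    """Keep system messages plus only the most recent max_turns other messages."""
    system = [m for m in messages if m.get("role") == "system"]
    others = [m for m in messages if m.get("role") != "system"]
    # Guard the zero case explicitly: others[-0:] would return the whole list.
    recent = others[-max_turns:] if max_turns > 0 else []
    return system + recent
```

Trimming reduces exposure but does not replace a visible notice and consent step before any text is sent to the provider.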

Missing User Warnings

Medium

Confidence: 93%
Finding: The page captures microphone audio with getUserMedia() and immediately forwards processed PCM data to a remote NuwaAI WebSocket service, but the UI only says the microphone is enabled and does not clearly disclose that live voice data is transmitted off-device. In a singing/duet skill this behavior is expected functionally, but lack of explicit transmission notice and consent increases privacy risk because users may not realize their speech and background audio are being sent to a third party.

Missing User Warnings

Medium

Confidence: 88%
Finding: The setup flow collects a user API key and posts it to /api/config for storage without clearly explaining where it will be stored, how long it will persist, or who can access it. API keys are sensitive credentials, so opaque handling can lead to accidental exposure, reuse across sessions, or insecure backend storage practices.

Missing User Warnings

Medium

Confidence: 99%
Finding: A live demo API key is hardcoded directly in source and then used to request tokens from the NuwaAI service. Anyone with code access or access to the running endpoint can extract or abuse this credential, potentially consuming paid resources, impersonating the demo account, or pivoting into associated assets.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill persists API credentials to a local JSON file without encryption, access control, or user warning. If the host is shared, backed up insecurely, or the working directory is exposed, the stored secret can be recovered and reused to access the external avatar service.
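If the key must stay on disk, restricting the file to its owner is a low-cost first step. The following sketch assumes a POSIX host and a hypothetical `save_config` helper; it does not encrypt the file, and an OS keychain or secret store would be a stronger option.

```python
import json
import os


def save_config(path, config):
    """Write config as JSON to a file readable and writable only by its owner."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # The 0o600 mode applies only when the file is newly created.
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        # Tighten permissions in case the file already existed with a looser mode.
        os.chmod(path, 0o600)
        json.dump(config, fh)
```

On Windows, `os.chmod` controls only the read-only flag, so this sketch gives no real access restriction there.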

VirusTotal

67/67 vendors flagged this skill as clean.
