Jiuma Free Voice Cloning

Security checks across malware telemetry and agentic risk

Overview

This skill does what it says: it sends selected text and optional voice samples to Jiuma for TTS or voice cloning, with disclosed but sensitive local API-key storage.

Install only if you are comfortable sending chosen text and any reference voice audio to Jiuma. Use only voices you own or have permission to clone. If you log in, treat the saved Jiuma API key as a secret, avoid shared workspaces, and delete or rotate it when no longer needed.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (13)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 88%
Finding: The skill documentation describes behavior that reads and writes local files and makes network requests, yet no permissions are declared. This creates a transparency and policy-enforcement gap: users or the hosting platform may authorize the skill assuming lower privilege than it actually needs, increasing the chance of unintended data access or exfiltration.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The skill is presented as a TTS/voice-cloning tool, but it also initiates account login, polls login state with an access token, and stores a secret/API key locally. That is a significant expansion of trust and capability beyond the declared purpose, and it can expose users to credential harvesting, silent account linkage, or insecure secret persistence if they do not realize authentication side effects are part of using the skill.

Description-Behavior Mismatch

High

Confidence: 91%
Finding: The documentation describes a separate authorization/login utility that acquires and stores API keys, which is broader and more sensitive than the declared voice-cloning/TTS purpose. This mismatch increases the risk that the skill performs credential collection or account-linking behavior users and reviewers would not expect, enabling secret handling outside the manifest's stated scope.

Context-Inappropriate Capability

Medium

Confidence: 86%
Finding: Persistently saving API keys to the local filesystem is a sensitive capability and is not inherently required by a voice-cloning skill alone. If implemented as documented, this can expose credentials to other local processes, backups, shared environments, or accidental disclosure, especially when storage location and protections are unspecified.

Description-Behavior Mismatch

Medium

Confidence: 85%
Finding: The file implements login-token handling and API-key acquisition/persistence, which expands the skill from simple TTS/voice cloning into credential management. In an agent-skill context, collecting and exchanging login artifacts is security-relevant because it can grant ongoing access to a third-party account and is broader than the stated user-facing capability.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The code saves the returned secret_key locally via save_jiuma_api_key() without showing how it is protected, rotated, or scoped. Persisting credentials creates a durable secret on disk or in local config, increasing the risk of theft, reuse, or unintended access beyond a single session.

Missing User Warnings

Medium

Confidence: 89%
Finding: The documentation tells users that an API key will be automatically saved locally but provides no warning that the key is a sensitive credential or guidance on secure handling. This can lead to unsafe deployment practices, such as storing secrets in plaintext, committing them to repos, or leaving them accessible on multi-user systems.

Vague Triggers

Medium

Confidence: 74%
Finding: The activation description is broad enough to trigger on many generic speech or voice-related requests, which can cause the skill to be invoked in contexts where users did not intend to send text or audio to a third-party service. In a voice-cloning skill, overbroad activation is more dangerous because it may lead to accidental upload of sensitive audio samples or unintended synthesis requests.

Missing User Warnings

High

Confidence: 96%
Finding: The skill supports uploading custom reference audio to a third-party API but does not prominently warn about privacy, consent, or ownership requirements. Because voice samples are biometric and may contain personal or sensitive information, lack of explicit disclosure increases the risk of unauthorized cloning, privacy violations, and misuse of another person's voice.

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill uploads a user-provided reference audio file to a third-party service for processing, but the code does not present any explicit user-facing consent notice or warning that the local file will be transmitted off-device. Because voice samples are sensitive biometric/personal data, silent transmission increases privacy and compliance risk, especially if users assume processing is local.
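One way to close this gap is to gate the upload behind an explicit confirmation. The sketch below is illustrative only: `confirm_upload`, its parameters, and the prompt wording are hypothetical and are not taken from the skill's code.

```python
import os
from typing import Callable


def confirm_upload(audio_path: str, destination: str,
                   ask: Callable[[str], str] = input) -> bool:
    """Return True only if the user explicitly agrees to send the file off-device.

    Hypothetical helper: the name, parameters, and prompt text are assumptions.
    """
    prompt = (
        f"'{os.path.basename(audio_path)}' will be uploaded to {destination} "
        "for voice cloning. Voice samples are personal data. Continue? [y/N] "
    )
    answer = ask(prompt).strip().lower()
    # Anything other than an explicit yes is treated as a refusal.
    return answer in ("y", "yes")
```

A caller would skip the upload entirely whenever `confirm_upload(...)` returns False, so an empty or ambiguous answer never results in transmission.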

Missing User Warnings

Medium

Confidence: 93%
Finding: After check_login_status() succeeds, the skill immediately persists the secret_key and only then informs the user that the API key has been saved. In an agent environment, silently storing credentials without explicit confirmation or opt-in reduces user control and can surprise users about long-term access retention.
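An opt-in step before persistence could look roughly like the following. This is a minimal sketch under stated assumptions: `persist_key_if_approved` and the `approve` callback are hypothetical, and `save` stands in for a function such as `save_jiuma_api_key()`, assuming it accepts the key as its single argument (the report does not show its signature).

```python
from typing import Callable


def persist_key_if_approved(secret_key: str,
                            approve: Callable[[], bool],
                            save: Callable[[str], None]) -> bool:
    """Save the key only after the user opts in; return whether it was saved.

    Hypothetical helper: `approve` asks the user, `save` performs the write.
    """
    if not approve():
        # Without consent the key is not written anywhere by this function.
        return False
    save(secret_key)
    return True
```

With this shape the confirmation happens before the write, so the user learns about long-term storage while still able to decline it.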

Missing User Warnings

Medium

Confidence: 94%
Finding: The code stores the Jiuma API key in plaintext on disk under a predictable path with no permission hardening, encryption, or user disclosure. On multi-user systems, shared workspaces, backups, or logs, this can expose a reusable secret that allows unauthorized use of the external API and possible billing or account abuse.
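The permission-hardening part of this finding can be illustrated with a short sketch for POSIX systems. The function name `write_secret_file` and the choice of path are hypothetical; the example covers only file permissions and does not encrypt the key.

```python
import os
import stat


def write_secret_file(path: str, secret: str) -> None:
    """Write a secret so that only the owning user can read or write it (POSIX).

    Hypothetical helper, not the skill's implementation.
    """
    # Create the file with mode 0600 so a new file is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # A pre-existing file keeps its old mode, so tighten it before writing.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
        handle.write(secret)
```

On shared machines this limits exposure to other local accounts, though backups and processes running as the same user can still read the file.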

Missing User Warnings

Medium

Confidence: 82%
Finding: This helper sends arbitrary data and uploaded files to a remote API via requests.post, and in a voice-cloning skill those files may contain sensitive biometric voice samples or user content. Even if remote transmission is functionally required, the lack of explicit disclosure/consent and data-minimization controls creates a real privacy and security risk because users may not realize their audio and metadata leave the local environment.
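A simple data-minimization control is to filter the outgoing payload against an allowlist before it reaches the HTTP call. The sketch below is an assumption-laden illustration: `minimize_payload` and the field names in the usage note are hypothetical, since the report does not list which fields the remote API requires.

```python
from typing import Any, Dict, Iterable


def minimize_payload(data: Dict[str, Any],
                     allowed_fields: Iterable[str]) -> Dict[str, Any]:
    """Keep only the fields the remote endpoint is known to need.

    Hypothetical helper: the allowlist must come from the API's documentation.
    """
    allowed = set(allowed_fields)
    return {key: value for key, value in data.items() if key in allowed}
```

For example, `minimize_payload({"text": "hello", "hostname": "laptop"}, ["text"])` keeps only the `text` entry; the filtered dict could then be passed as the `data=` argument of `requests.post`.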

VirusTotal

66/66 vendors flagged this skill as clean.
