senseaudio-conversation-rehearsal

Security checks across malware telemetry and agentic risk

Overview

This rehearsal skill is not proven malicious, but it asks for sensitive voice, transcript, messaging, and browser-session credential access with weak consent boundaries.

Install only if you trust the publisher and are comfortable with external ASR/TTS, Feishu delivery, voice cloning, local transcript/audio persistence, and credential use. Prefer a local-only or explicitly confirmed mode, avoid Chrome-session token extraction, use scoped API keys through a secret manager, and do not rehearse confidential HR, legal, or business matters unless the data handling and retention are acceptable.
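The "local-only or explicitly confirmed mode" recommended above could be enforced with a small gate that runs before any outbound step. The following is a minimal sketch, not the skill's actual code: the action names, the `local_only` flag, and the `confirm` callback are all hypothetical.

```python
from typing import Callable

# Hypothetical action names; the scanned skill does not define these.
EXTERNAL_ACTIONS = {"asr_upload", "tts_request", "feishu_send", "voice_clone"}


def require_consent(action: str, local_only: bool, confirm: Callable[[str], bool]) -> bool:
    """Return True only if the named action is allowed to proceed."""
    if action not in EXTERNAL_ACTIONS:
        return True  # purely local work needs no prompt
    if local_only:
        return False  # local-only mode blocks every external action
    # Otherwise ask the user each time; the callback decides how to prompt.
    return confirm(f"Allow '{action}' to send rehearsal data off this machine?")
```

A caller would pass something like `lambda msg: input(msg + " [y/N] ").strip().lower() == "y"` as `confirm`, so that a missing or empty answer counts as a refusal.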

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (26)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 95%
Finding: The skill advertises significant capabilities including environment access, file reads/writes, network access, and shell execution without declaring permissions or presenting a least-privilege boundary. That makes it easier for operators and users to underestimate what the skill can access, and it increases the chance of unintended credential, file, or command execution exposure.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 98%
Finding: The documented behavior materially exceeds the stated purpose of a rehearsal tool by including voice-clone creation, browser-session/token access, credential substitution, and Feishu message delivery. This mismatch is dangerous because users may consent to a coaching workflow without realizing the skill can access local credentials, automate external services, and transmit content off-platform.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The skill describes resolving login tokens into real API keys from a local credentials file and obtaining platform tokens from a logged-in Chrome session. Accessing local credential stores and browser session state is highly sensitive and not necessary for ordinary conversation rehearsal, creating a clear path to credential misuse or unauthorized API access.

Context-Inappropriate Capability

High

Confidence: 94%
Finding: The skill supports creating cloned voices through workspace automation and even falls back to a logged-in browser session, which expands a rehearsal tool into identity/voice provisioning. Voice cloning is a sensitive capability with impersonation and consent risks, especially when the workflow encourages automation rather than strict manual verification.

Context-Inappropriate Capability

Medium

Confidence: 78%
Finding: The script includes an optional capability to send generated rehearsal audio and related session data to Feishu, which expands data exposure beyond the stated core purpose of local rehearsal and debriefing. In a conversation-rehearsal context, the content may contain sensitive workplace discussions, so an unrelated outbound-sharing feature materially increases privacy and exfiltration risk.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The script dynamically locates and loads a helper module from another skill, creating a cross-skill trust boundary bypass. If an attacker can place or alter the referenced helper file anywhere in the searched parent tree, this script will execute that code, enabling arbitrary code execution or unauthorized message-sending behavior under this skill's context.
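One way to narrow this kind of cross-skill loading is to accept a helper only from a fixed directory and only when its contents match a pinned digest. The sketch below illustrates that idea under stated assumptions: `load_pinned_helper`, `allowed_dir`, and `expected_sha256` are hypothetical names, and nothing here reflects how the scanned script actually resolves its helper.

```python
import hashlib
from pathlib import Path
from types import ModuleType


def load_pinned_helper(path: Path, allowed_dir: Path, expected_sha256: str) -> ModuleType:
    """Load a helper module only from allowed_dir and only if its digest matches."""
    resolved = path.resolve()
    # Reject anything that resolves outside the allowed directory (Python 3.9+).
    if not resolved.is_relative_to(allowed_dir.resolve()):
        raise PermissionError(f"{resolved} is outside {allowed_dir}")
    source = resolved.read_bytes()
    if hashlib.sha256(source).hexdigest() != expected_sha256:
        raise ValueError("helper digest does not match the pinned value")
    # Execute the exact bytes that were hashed, so the file cannot be
    # swapped between the check and the import.
    module = ModuleType("pinned_helper")
    module.__file__ = str(resolved)
    exec(compile(source, str(resolved), "exec"), module.__dict__)
    return module
```

Pinning a digest means the helper cannot be updated silently; whether that trade-off is acceptable depends on how the two skills are distributed.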

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: This script reads rehearsal history, selects counterpart turns, and prepares them for delivery to Feishu, which exports sensitive rehearsal content and metadata outside the local workflow. In a conversation-rehearsal context, transcripts and synthesized counterpart audio may contain confidential performance, workplace, or stakeholder information, so external transmission materially increases privacy and data-handling risk.

Context-Inappropriate Capability

Medium

Confidence: 82%
Finding: This script exposes capabilities to create, enumerate, and reselect cloned voices and voice slots, which materially expands beyond a narrow conversation-rehearsal function and enables direct voice-cloning account operations. In this skill context, that is more dangerous because the feature handles biometric voice data and authenticated clone-management actions, increasing the risk of unauthorized cloning or misuse if invoked without strong consent and scope controls.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: This script automates Google Chrome, inspects tabs for senseaudio.cn, and extracts authentication tokens from localStorage/sessionStorage. That is credential harvesting behavior, and it is not justified by a conversation-rehearsal skill; the mismatch in context makes it especially suspicious and dangerous.

Intent-Code Divergence

Medium

Confidence: 91%
Finding: The CLI description understates the tool's behavior by presenting it as resolving a token from env or Chrome, while the implementation actually opens or inspects a browser tab and extracts credentials from web storage. This deceptive or incomplete disclosure prevents informed consent and increases the likelihood of unauthorized credential access.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill says rehearsal mode should default to sending counterpart turns to Feishu audio without requiring the user to repeat the request, but it does not clearly warn that rehearsal content may be transmitted to an external messaging platform. In a high-pressure conversation rehearsal context, the content may include sensitive personnel, performance, or business information, so silent external transmission raises privacy and data-leakage risk.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill documents replacing an injected token with a real API key from a local file but does not clearly warn the user that local credential material will be accessed. This undermines informed consent and increases the chance that users unknowingly authorize sensitive secret retrieval in a context that appears to be only about rehearsal audio.

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill describes resolving a platform token from a logged-in Chrome tab when Apple Events JavaScript is enabled, without a strong privacy warning or boundary explanation. Browser-session access can expose unrelated authenticated state and is highly sensitive, so even documented use without clear warning is risky.

Vague Triggers

Medium

Confidence: 91%
Finding: The default prompt uses very broad activation language ('Use $senseaudio-conversation-rehearsal to design and run...') without clear gating conditions, which increases the chance the skill is invoked in contexts the user did not explicitly authorize. In this skill, that broad trigger is compounded by automatic transition into voice behavior and message delivery, making unintended activation more operationally risky than a purely text-only assistant.

Missing User Warnings

Medium

Confidence: 86%
Finding: The design explicitly has the user answer by voice and then transcribes that speech, but the document does not state any user-facing warning, consent flow, retention policy, or handling constraints for captured audio and transcripts. In a high-pressure conversation rehearsal context, users may disclose sensitive workplace, HR, performance, or personal information, so silent collection or unclear privacy handling creates a meaningful privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 91%
Finding: The script persists service summaries and clone-related artifacts to disk, including clone metadata, session outputs, notes, and potentially identifiers tied to voice-cloning operations. In this skill context, those files may contain sensitive conversational rehearsal data, voice-clone metadata, or operational traces that create privacy and misuse risks if stored in shared or insecure locations.

Missing User Warnings

Medium

Confidence: 89%
Finding: This code is designed to use platform tokens/API keys and can route user audio, browser-session-derived access, and clone operations to external services, but there is no user-facing warning or consent enforcement in this file. In a voice rehearsal and cloning workflow, silent transmission of audio and browser-session-backed actions materially increases privacy and account-misuse risk.

Missing User Warnings

Medium

Confidence: 89%
Finding: The script persists transcripts, debriefs, history, and generated audio paths to disk without any warning, consent, retention control, or access restrictions. Because rehearsal sessions are likely to contain sensitive professional communications, storing these artifacts by default can create unintended disclosure risk to other local users, backup systems, or downstream tools.
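The missing access restrictions could be addressed at write time by creating session files readable only by their owner. This is a minimal sketch for POSIX systems; `save_private_json` and the file layout are hypothetical and are not taken from the scanned script.

```python
import json
import os
from pathlib import Path


def save_private_json(path: Path, payload: dict) -> None:
    """Write payload as JSON to a file that only the owner can read or write."""
    # Owner-only directory; the mode applies to the final directory created.
    path.parent.mkdir(parents=True, exist_ok=True, mode=0o700)
    # Create the file with 0o600 so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, ensure_ascii=False)
    # os.open only applies the mode on creation, so tighten existing files too.
    os.chmod(path, 0o600)
```

File permissions limit exposure to other local users only; they do not address the retention or backup concerns raised in the finding.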

Missing User Warnings

Medium

Confidence: 93%
Finding: The script sends user audio to ASR and sends generated counterpart text to TTS services using an external API key, but it provides no user-facing notice that conversation content leaves the local environment. In this skill's context, the rehearsed material may include confidential employment, stakeholder, or performance-review details, making silent third-party transmission particularly sensitive.

Missing User Warnings

Medium

Confidence: 87%
Finding: The code uploads audio and sends related chat metadata to Feishu without any explicit in-band warning or confirmation in this file. Because the skill handles high-pressure conversation rehearsal, the content is likely sensitive; silent network transmission to a third-party messaging platform raises privacy, compliance, and unintended disclosure risks.

Missing User Warnings

Medium

Confidence: 91%
Finding: The script uploads user audio and receives transcripts from a remote third-party ASR endpoint, but provides no user-facing disclosure, consent checkpoint, or data-handling warning in the script. In this skill context, the audio may contain sensitive workplace, personnel, or strategy discussions, which makes silent transmission to an external service materially privacy-relevant.

Missing User Warnings

Medium

Confidence: 93%
Finding: The create_clone path uploads a local audio file to a remote voice-cloning API with no user-facing disclosure, confirmation, or indication that biometric voice data is being transmitted off device. In the context of a rehearsal skill using authorized cloned voices, silent upload is particularly risky because users may not understand they are creating a reusable voice clone rather than performing transient speech processing.

Missing User Warnings

Medium

Confidence: 95%
Finding: The browser-session path uploads local audio into an authenticated browser context using ambient session credentials, allowing clone creation without requiring an explicit API token or separate re-authentication. That is more dangerous in this skill because it can leverage an already logged-in account to perform sensitive voice-cloning actions with reduced user awareness and weaker accountability.

Missing User Warnings

Medium

Confidence: 84%
Finding: The script sends counterpart_text to an external TTS provider without any disclosure, confirmation, or technical guard in this file. In a conversation-rehearsal skill, that text may contain sensitive workplace, HR, legal, or personal content, so silent third-party transmission creates a real privacy and compliance risk.

Missing User Warnings

High

Confidence: 99%
Finding: The script prints the resolved platform token to stdout in JSON, which can expose credentials to terminal history, logs, CI captures, shell pipelines, or other local observers. Because the token may be sourced silently from Chrome storage, this creates a direct credential exfiltration path with no warning or confirmation.
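A safer reporting pattern would be to print only whether a token was found and a short hint, never the token itself. The sketch below assumes a hypothetical JSON report shape (`token_present`, `source`, `hint`); the scanned script's real output format is not known beyond what the finding states.

```python
import json
from typing import Optional


def token_hint(token: str) -> str:
    """Return a masked hint: the last 4 characters of long tokens, else only a mask."""
    return "****" + token[-4:] if len(token) > 8 else "****"


def token_report(token: Optional[str], source: str) -> str:
    """Build a JSON status line that never contains the full token."""
    return json.dumps(
        {
            "token_present": bool(token),
            "source": source,  # for example "env" or "chrome" (assumed labels)
            "hint": token_hint(token) if token else None,
        }
    )
```

Even a four-character suffix reveals part of the secret, so a stricter variant could drop the hint entirely and report only `token_present`.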

VirusTotal

64 of 64 vendors reported this skill as clean.
