Fun-ASR speech recognition

Security checks across malware telemetry and agentic risk

Overview

This is a straightforward audio transcription skill that uses Alibaba Cloud DashScope as described, with normal privacy considerations for uploaded audio.

Install only if you are comfortable sending chosen audio files to Alibaba Cloud DashScope and using your own DashScope API key. Avoid sensitive recordings unless you have permission to process them with that provider, and consider using a dedicated API key and isolated Python environment.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 88%
Finding: The skill instructs users to set and rely on an environment variable for an external API key, but the skill metadata shown does not declare corresponding permissions or capability requirements. This creates a transparency and governance gap: the skill can access sensitive credentials and send user data to a third-party service without clearly declaring that behavior to the hosting platform or user.
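
One way to narrow this gap is to read exactly one named variable and to publish a plain description of what the skill touches. The sketch below is illustrative only: the variable name `DASHSCOPE_API_KEY` is an assumption (the report does not name the variable), and `describe_requirements` is a hypothetical helper, not a real manifest schema for any hosting platform.

```python
import os

# Assumed variable name; the report does not state which variable the skill reads.
API_KEY_VAR = "DASHSCOPE_API_KEY"


def load_api_key(env=None):
    """Return the API key from one named variable, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_VAR} is not set; refusing to continue.")
    return key


def describe_requirements():
    """Hypothetical disclosure of the credentials and data flows the skill uses."""
    return {
        "env": [API_KEY_VAR],
        "network": ["Alibaba Cloud DashScope"],
        "data_sent": ["contents of the selected audio file"],
    }
```

Reading a single variable by name, rather than iterating over the whole environment, keeps the credential access easy to audit against whatever the skill declares.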

Vague Triggers

Medium

Confidence: 90%
Finding: The activation rule is broad enough to trigger on essentially any user-provided audio file, which can cause the skill to process content without sufficiently specific user intent. In this context, that is risky because audio may contain sensitive personal, financial, or confidential information that would then be sent to an external transcription provider.
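
A narrower activation rule could require both an audio attachment and an explicit request to transcribe. The following is a minimal sketch under that assumption; `should_activate`, the extension list, and the intent phrases are hypothetical and are not taken from the skill itself.

```python
from pathlib import PurePath

# Illustrative values only; a real skill would choose its own lists.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}
INTENT_PHRASES = ("transcribe", "transcription", "speech to text", "speech-to-text")


def should_activate(message, attachments):
    """Activate only when an audio file is attached AND the user asks for transcription."""
    has_audio = any(
        PurePath(name).suffix.lower() in AUDIO_EXTENSIONS for name in attachments
    )
    text = message.lower()
    has_intent = any(phrase in text for phrase in INTENT_PHRASES)
    return has_audio and has_intent
```

With this rule, a message such as "Here is a file" with an attached recording would not trigger the skill, while "Please transcribe this" with the same attachment would.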

Missing User Warnings

Medium

Confidence: 96%
Finding: The description says the skill performs speech-to-text using the DashScope API, but it does not clearly warn users that their audio content will be transmitted to a third-party service. This is a meaningful privacy issue because users may assume local processing and unknowingly expose sensitive voice data, conversations, or regulated information to an external provider.

Missing User Warnings

Medium

Confidence: 93%
Finding: The script sends the contents of a user-supplied audio file to Alibaba Cloud's DashScope ASR service over a websocket, but it does not clearly disclose that transcription is performed remotely or that potentially sensitive audio leaves the local machine. In a transcription skill, remote processing is expected, but lack of explicit notice and consent can expose confidential voice data or regulated content without the user's informed awareness.
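
An explicit notice and consent step could look roughly like the sketch below. It is not the skill's actual script: `confirm_remote_upload` and `transcribe_with_consent` are hypothetical names, and the `transcribe` argument stands in for whatever function performs the real upload.

```python
def confirm_remote_upload(path, ask=input):
    """Tell the user the file leaves the machine and require an explicit 'yes'."""
    prompt = (
        f"'{path}' will be uploaded to Alibaba Cloud DashScope for transcription. "
        "Type 'yes' to continue: "
    )
    return ask(prompt).strip().lower() == "yes"


def transcribe_with_consent(path, transcribe, ask=input):
    """Call the supplied transcribe function only after the user consents."""
    if not confirm_remote_upload(path, ask):
        return None  # nothing is sent without consent
    return transcribe(path)
```

Passing `ask` as a parameter keeps the prompt testable and lets a hosting platform substitute its own confirmation dialog for `input`.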

VirusTotal

66/66 vendors flagged this skill as clean.
