Speech to Text

Security checks across malware telemetry and agentic risk

Overview

This skill openly sends a user-selected audio file to a public Hugging Face Whisper service for transcription, which matches its stated purpose but has privacy implications.

Install only if you are comfortable sending selected audio files and filenames to a public Hugging Face/Gradio service. Avoid confidential meetings, regulated data, or private recordings unless you explicitly accept third-party processing or configure a trusted private endpoint.
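One way to act on the private-endpoint advice above is to make the script refuse the public service unless the user opts in. The sketch below is illustrative only: the environment variable STT_PRIVATE_ENDPOINT, the placeholder URL, and the function resolve_endpoint are hypothetical names and are not taken from the skill itself.

```python
import os
from typing import Mapping

# Hypothetical placeholder; the skill's real public endpoint is not shown here.
PUBLIC_PLACEHOLDER = "https://public-whisper.example.invalid"


def resolve_endpoint(env: Mapping[str, str] = os.environ,
                     allow_public: bool = False) -> str:
    """Prefer a user-configured private endpoint; use the public one only on opt-in."""
    # STT_PRIVATE_ENDPOINT is an assumed variable name for this sketch.
    private = env.get("STT_PRIVATE_ENDPOINT", "").strip()
    if private:
        return private
    if allow_public:
        return PUBLIC_PLACEHOLDER
    raise RuntimeError(
        "No private endpoint configured and public upload was not approved."
    )
```

With this shape, a missing configuration fails closed: audio goes to a third party only when the caller passes allow_public=True.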

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (2)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 95%
Finding: The skill directs the agent to read local files, access environment configuration, and send audio to a public network endpoint, but it declares no permissions. This creates a transparency and policy-enforcement gap: users and hosting platforms may not realize the skill exfiltrates local audio to a third party, increasing the risk of unauthorized data disclosure.

Missing User Warnings

Medium

Confidence: 97%
Finding: This script unconditionally uploads the full local audio file to a public Hugging Face Space, which sends user-provided content and associated metadata off-device to a third-party service. In a speech-to-text skill, audio often contains sensitive personal, business, or regulated information, so the lack of an explicit user-facing warning or consent step creates a real privacy and data-handling risk.
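A consent step of the kind this finding calls for could look roughly like the following. This is a minimal sketch, not the skill's actual code: build_warning, confirm_upload, and transcribe_with_consent are hypothetical helpers, and upload stands in for whatever function performs the real network call.

```python
from pathlib import Path
from typing import Callable, Optional


def build_warning(audio_path: str, endpoint: str) -> str:
    """Describe exactly what will leave the device."""
    name = Path(audio_path).name
    return (
        f"The file '{name}' (including its filename) will be uploaded to "
        f"{endpoint} for transcription. Continue? [y/N] "
    )


def confirm_upload(audio_path: str, endpoint: str,
                   ask: Callable[[str], str] = input) -> bool:
    """Return True only on an explicit 'y' or 'yes'; anything else declines."""
    answer = ask(build_warning(audio_path, endpoint))
    return answer.strip().lower() in {"y", "yes"}


def transcribe_with_consent(audio_path: str, endpoint: str,
                            upload: Callable[[str, str], str],
                            ask: Callable[[str], str] = input) -> Optional[str]:
    """Call the (hypothetical) upload function only after the user agrees."""
    if not confirm_upload(audio_path, endpoint, ask):
        return None  # nothing is sent off-device
    return upload(audio_path, endpoint)
```

Defaulting to "no" on an empty or unrecognized answer keeps the upload from happening by accident, and passing ask as a parameter lets an agent host substitute its own approval prompt.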

VirusTotal

64/64 vendors flagged this skill as clean.
