Speech to Text

Security checks across malware telemetry and agentic risk

Overview

This skill openly sends a user-selected audio file to a public Hugging Face Whisper service for transcription, which matches its stated purpose but has privacy implications.

Install only if you are comfortable sending selected audio files and filenames to a public Hugging Face/Gradio service. Avoid confidential meetings, regulated data, or private recordings unless you explicitly accept third-party processing or configure a trusted private endpoint.
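
The "trusted private endpoint" option above can be enforced with a small allowlist check before any audio leaves the machine. This is a minimal sketch, not the skill's own code: the function name `is_trusted_endpoint`, the `TRUSTED_HOSTS` set, and the host `whisper.internal.example` are hypothetical placeholders to replace with hosts you actually control.

```python
from urllib.parse import urlparse

# Hypothetical allowlist (assumption): replace with hosts you operate yourself.
TRUSTED_HOSTS = {"whisper.internal.example"}
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def is_trusted_endpoint(url, trusted_hosts=TRUSTED_HOSTS):
    """Return True only for loopback URLs or HTTPS URLs on an allowlisted host."""
    parsed = urlparse(url)
    host = parsed.hostname
    if host is None:
        # Not a parseable URL with a host part; refuse it.
        return False
    if host in LOOPBACK_HOSTS:
        # Loopback traffic stays on the machine, so plain HTTP is tolerated.
        return parsed.scheme in ("http", "https")
    # Anything off-device must use HTTPS and be explicitly allowlisted.
    return parsed.scheme == "https" and host in trusted_hosts
```

A caller would run this check on the configured endpoint and refuse to upload when it returns False, so that a public service is never used by accident.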

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (2)

Lp3

Medium
Category
MCP Least Privilege
Confidence
95% confidence
Finding
The skill directs the agent to read local files, access environment configuration, and send audio to a public network endpoint, but it declares no permissions. This creates a transparency and policy-enforcement gap: users and hosting platforms may not realize the skill exfiltrates local audio to a third party, increasing the risk of unauthorized data disclosure.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
This script unconditionally uploads the full local audio file to a public Hugging Face Space, which sends user-provided content and associated metadata off-device to a third-party service. In a speech-to-text skill, audio often contains sensitive personal, business, or regulated information, so the lack of an explicit user-facing warning or consent step creates a real privacy and data-handling risk.
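
One way to close this gap is to put an explicit consent prompt in front of the upload. The sketch below is illustrative only and is not taken from the skill: `confirm_upload`, `transcribe_with_consent`, and the `upload` callable are hypothetical names, and the actual transfer is left to whatever function the script already uses.

```python
import os


def confirm_upload(path, endpoint, ask=input):
    """Describe what will leave the device and return True only on an explicit yes."""
    size_kib = os.path.getsize(path) / 1024
    prompt = (
        f"About to upload '{os.path.basename(path)}' ({size_kib:.1f} KiB) to {endpoint}.\n"
        "The audio and its filename will be sent to a third party. Continue? [y/N] "
    )
    # Default to "no": anything other than y/yes cancels the upload.
    return ask(prompt).strip().lower() in ("y", "yes")


def transcribe_with_consent(path, endpoint, upload, ask=input):
    """Call upload(path, endpoint) only after the user agrees; otherwise return None."""
    if not confirm_upload(path, endpoint, ask):
        return None
    return upload(path, endpoint)
```

Passing `ask` as a parameter keeps the prompt testable and lets an agent host route the question through its own confirmation UI instead of standard input.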

VirusTotal

All 64 of 64 vendors reported this skill as clean.
