Speech to text

Security checks across malware telemetry and agentic risk

Overview

This is a local speech-to-text skill that processes audio files with Whisper, with a disclosed but privacy-sensitive watch mode.

Install in a dedicated Python environment and use a dedicated private inbound folder. Treat generated transcripts and logs as sensitive, back up original audio beforehand if you do not want it moved, and run watch mode only when you want ongoing automatic processing.
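The folder and backup advice above could be sketched roughly as follows. This is a minimal illustration, not part of the skill: the function names `prepare_private_inbox` and `back_up_audio` and any paths passed to them are hypothetical, and the owner-only permission step assumes a POSIX system.

```python
import os
import shutil
import stat
from pathlib import Path


def prepare_private_inbox(inbox: Path) -> Path:
    """Create the inbound folder if needed and restrict it to the current user."""
    inbox.mkdir(parents=True, exist_ok=True)
    # Owner-only read/write/execute (0o700). On Windows, chmod only toggles
    # the read-only flag, so this step is effectively POSIX-only.
    os.chmod(inbox, stat.S_IRWXU)
    return inbox


def back_up_audio(inbox: Path, backup_dir: Path) -> list[Path]:
    """Copy every file in the inbox to backup_dir before anything processes it."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for item in sorted(inbox.iterdir()):
        if item.is_file():
            # copy2 also preserves timestamps where the platform allows it.
            copies.append(Path(shutil.copy2(item, backup_dir / item.name)))
    return copies
```

Running the backup step before starting any processing keeps an untouched copy of each recording even if the originals are later moved.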

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (1)

Missing User Warnings

Low
Confidence: 90%
Finding
The skill exposes a continuous folder monitoring capability (`stt_watch`) but the description does not clearly warn users that it can keep watching an inbound directory and automatically process new audio files. This can lead to unintended ingestion of sensitive voice data or surprise background processing, especially in shared or synced folders.
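One way the missing warning could look is an explicit confirmation step before a watcher starts. The sketch below is an assumption for illustration only: `watch_warning`, `confirm_watch`, and the `SYNC_MARKERS` heuristic are hypothetical names, not part of `stt_watch` or the skill's actual interface.

```python
from pathlib import Path
from typing import Callable

# Hypothetical markers for synced folders; a real list would need tuning.
SYNC_MARKERS = ("dropbox", "onedrive", "google drive", "icloud")


def watch_warning(inbox: Path) -> str:
    """Build the text shown to the user before continuous monitoring begins."""
    lines = [
        f"Watch mode will keep monitoring {inbox}",
        "and automatically transcribe every new audio file placed there.",
    ]
    lowered = [part.lower() for part in inbox.parts]
    if any(marker in part for part in lowered for marker in SYNC_MARKERS):
        lines.append(
            "This folder looks synced or shared; other devices may add audio to it."
        )
    return "\n".join(lines)


def confirm_watch(inbox: Path, ask: Callable[[str], str] = input) -> bool:
    """Return True only if the user explicitly opts in; the default is no."""
    answer = ask(watch_warning(inbox) + "\nStart watching? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" means an empty or unexpected answer never starts background processing, which addresses the surprise-ingestion risk the finding describes.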

VirusTotal

64/64 vendors reported this skill as clean.
