fun-voice-type

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed cloud voice-typing utility, but it needs sensitive microphone, keyboard-listening, and text-injection permissions that users should grant only if they trust the publisher.

Install only if you trust the publisher and are comfortable granting microphone, Input Monitoring, and Accessibility permissions to the terminal or app running it. Avoid dictating secrets or sensitive conversations unless you accept DashScope/Qwen cloud processing. Verify the cursor target before speaking, keep the API key protected, and quit the tray app when you are not using voice typing.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (4)

Vague Triggers

Medium

Confidence: 85%
Finding: The activation examples include broad, everyday phrases such as '帮我记录这段话' ("help me record what I'm saying") and '我不方便打字' ("it's inconvenient for me to type"), which can match normal conversation and trigger the skill unintentionally. For a skill that requests microphone, input monitoring, and accessibility privileges, accidental activation increases the risk of unintended recording, transcription, and text injection into the active application.
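
To illustrate the risk, here is a minimal sketch assuming a hypothetical substring-based trigger matcher; the skill host's real matching logic is not shown in this report. The names matches, BROAD_TRIGGERS, and EXPLICIT_TRIGGERS are invented for the example. It contrasts the two broad phrases with a trigger that must name the skill explicitly.

```python
# Hypothetical trigger matcher; the real activation logic is not shown in the report.
BROAD_TRIGGERS = ["帮我记录这段话", "我不方便打字"]  # phrases quoted in the finding
EXPLICIT_TRIGGERS = ["fun-voice-type"]               # assumed stricter alternative


def matches(utterance: str, triggers: list[str]) -> bool:
    """Return True if any trigger phrase occurs in the utterance (case-insensitive)."""
    lowered = utterance.lower()
    return any(trigger.lower() in lowered for trigger in triggers)


# An ordinary remark ("typing is inconvenient, I'll reply later"),
# not a request to start voice typing.
casual = "我不方便打字，晚点再回复你"
# A request that names the skill ("start fun-voice-type").
explicit = "启动 fun-voice-type"

print(matches(casual, BROAD_TRIGGERS))      # the broad phrase fires on casual chat
print(matches(casual, EXPLICIT_TRIGGERS))   # the explicit trigger does not
print(matches(explicit, EXPLICIT_TRIGGERS))
```

Under this assumption, requiring the skill's name trades some convenience for a much smaller chance of activating microphone capture by accident.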

Missing User Warnings

Medium

Confidence: 95%
Finding: The plugin captures microphone audio and sends it to DashScope's remote ASR service, and may later send recognized text to a remote LLM for translation, but the code shows no explicit user-facing disclosure, consent flow, or privacy notice about this network transmission. In a tool that activates from a global hotkey and records whatever the user is saying, that omission creates a real privacy risk because sensitive spoken data can be transmitted off-device without informed consent.
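
One way to picture the missing consent flow is a small gate that refuses to record until the user has explicitly opted in to cloud processing. This is a sketch under stated assumptions, not the plugin's code: the consent file, the cloud_asr key, and the function names are all hypothetical.

```python
import json
from pathlib import Path


def has_consent(path: Path) -> bool:
    """Return True only if a readable consent record explicitly grants cloud ASR."""
    try:
        record = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed file: treat as "no consent".
        return False
    return isinstance(record, dict) and record.get("cloud_asr") is True


def record_consent(path: Path, granted: bool) -> None:
    """Persist the user's answer to a (hypothetical) privacy prompt."""
    path.write_text(json.dumps({"cloud_asr": granted}), encoding="utf-8")


def may_start_recording(path: Path) -> bool:
    """Gate to call before opening the microphone or contacting the remote service."""
    return has_consent(path)
```

The gate fails closed: any state other than an explicit opt-in blocks recording, so a corrupted or deleted record cannot silently enable transmission.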

Missing User Warnings

Medium

Confidence: 97%
Finding: The callback automatically injects recognized or translated text into whatever application currently has focus using simulated keyboard input, with no warning or confirmation step. This is risky because misrecognition, prompt-influenced translation output, or focus changes can cause unintended text insertion into sensitive targets such as terminals, admin consoles, chats, or password-adjacent fields.
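
A guard of the following shape could sit between recognition and injection. It is only a sketch: how the focused application's name is obtained is not covered by this report, so the caller is assumed to supply it, and SENSITIVE_APPS and should_inject are hypothetical names.

```python
# Hypothetical deny-list of applications where typed text can have side effects.
SENSITIVE_APPS = {"Terminal", "iTerm2", "Keychain Access"}


def should_inject(text: str, app_at_start: str, app_now: str) -> bool:
    """Decide whether recognized text may be typed into the focused application.

    app_at_start: app that had focus when the user began speaking (caller-supplied).
    app_now:      app that has focus when the text is ready (caller-supplied).
    """
    if not text.strip():
        return False  # nothing meaningful to type
    if app_now != app_at_start:
        return False  # focus changed while speaking; the target may be wrong
    if app_now in SENSITIVE_APPS:
        return False  # require manual confirmation for sensitive targets
    return True


print(should_inject("hello", "Notes", "Notes"))
print(should_inject("rm -rf build", "Notes", "Terminal"))
```

When the guard returns False, a tool could fall back to showing the text for confirmation instead of typing it.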

Session Persistence

Medium

Category: Rogue Agent
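Content: ## Installation and running Run the script, and fun-voice-type will appear as a small icon at the top right of the Mac menu bar: ```bash nohup python fun-voice-type.py > /dev/null 2>&1 ``` You can then long-press **right Option** to use voice input.
Confidence: 78%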
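Finding: The documented launch command uses nohup, so the process is not terminated when the launching terminal closes.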

VirusTotal

64/64 vendors flagged this skill as clean.