fun-voice-type

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed cloud voice-typing utility, but it needs sensitive microphone, keyboard-listening, and text-injection permissions that users should grant only if they trust the publisher.

Install only if you trust the publisher and are comfortable granting microphone, Input Monitoring, and Accessibility permissions to the terminal or app running it. Avoid dictating secrets or sensitive conversations unless you accept DashScope/Qwen cloud processing. Verify the cursor target before speaking, keep the API key protected, and quit the tray app when not using voice typing.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Rogue Agent: Self-Modification, Session Persistence
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (4)

Vague Triggers

Medium
Confidence
85% confidence
Finding
The activation examples include broad, everyday phrases such as '帮我记录这段话' ("help me record this passage") and '我不方便打字' ("it's not convenient for me to type"), which can match normal conversation and trigger the skill unintentionally. For a skill that requests microphone, input monitoring, and accessibility privileges, accidental activation increases the risk of unintended recording, transcription, and text injection into the active application.
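
The difference between a broad trigger and an explicit one can be illustrated with a minimal sketch. It assumes, purely for illustration, that activation is decided by substring matching. The scanner does not describe how the host agent actually matches triggers, and `broad_match` and `explicit_match` are hypothetical names.

```python
# Phrases quoted in the finding above.
BROAD_TRIGGERS = ["帮我记录这段话", "我不方便打字"]
# Assumption: requiring the skill's own name is one possible stricter policy.
EXPLICIT_NAME = "fun-voice-type"


def broad_match(message: str) -> bool:
    """Fire whenever any everyday phrase appears anywhere in the message."""
    return any(phrase in message for phrase in BROAD_TRIGGERS)


def explicit_match(message: str) -> bool:
    """Fire only when the user names the skill explicitly."""
    return EXPLICIT_NAME in message.lower()


# An ordinary chat message ("it's not convenient to type, I'll reply later").
casual = "我不方便打字,晚点回你"
print(broad_match(casual), explicit_match(casual))
```

Under this assumption the casual message activates the broad matcher but not the explicit one, which is the accidental-activation risk the finding describes.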

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The plugin captures microphone audio and sends it to DashScope's remote ASR service, and may later send recognized text to a remote LLM for translation, but the code shows no explicit user-facing disclosure, consent flow, or privacy notice about this network transmission. In a tool that activates from a global hotkey and records whatever the user is saying, that omission creates a real privacy risk because sensitive spoken data can be transmitted off-device without informed consent.
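
One possible mitigation is a one-time consent gate that runs before any audio leaves the device. The sketch below is not taken from the plugin: the consent file path, the notice wording, and the function names are all hypothetical, and a real tray app would more likely show a dialog than read from the terminal.

```python
import json
from pathlib import Path

# Hypothetical location for the consent record; not part of the plugin.
CONSENT_FILE = Path.home() / ".fun-voice-type-consent.json"

NOTICE = (
    "Audio recorded while the hotkey is held is sent to DashScope's remote "
    "ASR service, and recognized text may be sent to a remote LLM for "
    "translation."
)


def has_consent(path=CONSENT_FILE) -> bool:
    """Return True only if an explicit consent record already exists."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        return data.get("cloud_processing") is True
    except (OSError, ValueError, AttributeError):
        # Missing, unreadable, or malformed record counts as no consent.
        return False


def record_consent(answer: str, path=CONSENT_FILE) -> bool:
    """Persist consent only for an explicit yes; anything else is a refusal."""
    granted = answer.strip().lower() in {"y", "yes"}
    if granted:
        Path(path).write_text(
            json.dumps({"cloud_processing": True}), encoding="utf-8"
        )
    return granted


def ensure_consent(ask=input, path=CONSENT_FILE) -> bool:
    """Show the notice once and ask; skip the prompt if consent is on file."""
    if has_consent(path):
        return True
    print(NOTICE)
    return record_consent(ask("Allow cloud processing? [y/N] "), path)
```

A recording callback could call `ensure_consent()` first and refuse to open the microphone when it returns `False`. The default answer is "no", so an accidental Enter does not grant consent.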

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The callback automatically injects recognized or translated text into whatever application currently has focus using simulated keyboard input, with no warning or confirmation step. This is risky because misrecognition, prompt-influenced translation output, or focus changes can cause unintended text insertion into sensitive targets such as terminals, admin consoles, chats, or password-adjacent fields.
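
A confirmation step in front of the injection call is one way to reduce this risk. In the hedged sketch below, `inject` and `confirm` are hypothetical callables standing in for whatever keyboard-simulation and UI code the plugin uses, and the 500-character cap is an arbitrary assumption. Control characters are stripped so that recognized text cannot press Enter in a terminal.

```python
import unicodedata


def sanitize(text: str, max_len: int = 500) -> str:
    """Remove control characters (newlines, tabs, escapes) and cap the length."""
    cleaned = "".join(ch for ch in text if unicodedata.category(ch) != "Cc")
    return cleaned[:max_len]


def inject_with_confirmation(text: str, inject, confirm) -> bool:
    """Inject sanitized text only after the user approves a preview of it.

    inject:  hypothetical callable that types a string into the focused app.
    confirm: hypothetical callable that shows a preview and returns a bool.
    """
    safe = sanitize(text)
    if not safe or not confirm(safe):
        return False
    inject(safe)
    return True


# Usage with stand-ins: collect "typed" text in a list and auto-approve.
typed = []
inject_with_confirmation("你好\n", typed.append, lambda preview: True)
print(typed)
```

The preview shown to `confirm` is the already-sanitized string, so the user approves exactly what will be typed.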

Session Persistence

Medium
Category
Rogue Agent
Content
## Installation and Running
Run the script; fun-voice-type will appear as a small icon at the top right of the Mac menu bar:
```bash
nohup python fun-voice-type.py > /dev/null 2>&1
```
You can then hold down **Right Option** to use voice input.
Confidence
78% confidence
Finding
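nohup: the quoted run command makes the script ignore the terminal's hangup signal and discards its output, so it keeps running after the terminal session ends.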

VirusTotal

64/64 vendors flagged this skill as clean.
