Bidirectional Voice Chat System

Security checks across malware telemetry and agentic risk

Overview

This skill behaves largely as a voice chat tool should, but it asks users to expose audio services publicly and automatically processes spoken content in ways that warrant careful review.

Install only if you are comfortable with microphone capture, generated voice files, a local web server, and optional public tunnel exposure. Keep it in local mode unless you need public access, avoid sending private speech through the hotkey workflow without checking the config, and disable or remove any habits.json interaction tracking you do not want.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill instructs users to run shell commands, read and write configuration files under the user's home directory, and launch local services, yet it declares no permissions or trust boundaries. That mismatch can cause users or orchestration systems to grant the skill more access than expected and obscures the real security posture of the skill.
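
The mismatch described in this finding can be illustrated as a set difference between what a skill declares and what its instructions exercise. This is a minimal sketch, not the scanner's actual check: the capability names (`shell.exec`, `fs.read:home`, and so on) are hypothetical labels invented for illustration, and the skill's real manifest format is not shown in this report.

```python
# Hypothetical capability labels; the skill's real manifest format is unknown.
# The finding states that no permissions are declared at all.
DECLARED: set[str] = set()

# Capabilities the finding says the skill's instructions rely on.
OBSERVED: set[str] = {
    "shell.exec",
    "fs.read:home",
    "fs.write:home",
    "service.launch:local",
}


def undeclared(declared: set[str], observed: set[str]) -> list[str]:
    """Return capabilities that are used but never declared, sorted for stable output."""
    return sorted(observed - declared)


if __name__ == "__main__":
    for capability in undeclared(DECLARED, OBSERVED):
        print(f"undeclared capability: {capability}")
```

An empty result would mean every observed capability is covered by a declaration; here all four are reported because nothing is declared.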

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The hotkey workflow states that captured speech may be automatically transcribed and copied to the clipboard or directly sent to AI, which expands data handling beyond simple voice chat. Clipboard injection and silent forwarding of transcribed speech can expose sensitive spoken content to other applications, logs, or remote services without clear user consent.
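
One way to address this finding is to gate every downstream use of a transcript behind explicit, default-off settings plus a per-capture confirmation. The sketch below assumes hypothetical names (`HotkeyPolicy`, `allow_clipboard`, `allow_send_to_ai`); it does not reflect the skill's actual configuration keys, which this report does not list.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DISCARD = "discard"
    CLIPBOARD = "clipboard"
    SEND_TO_AI = "send_to_ai"


@dataclass(frozen=True)
class HotkeyPolicy:
    # Hypothetical settings; both default to off so nothing leaves the app silently.
    allow_clipboard: bool = False
    allow_send_to_ai: bool = False


def allowed_actions(policy: HotkeyPolicy, user_confirmed: bool) -> list[Action]:
    """Decide what may happen to a transcript after the hotkey is released."""
    if not user_confirmed:
        return [Action.DISCARD]
    actions = []
    if policy.allow_clipboard:
        actions.append(Action.CLIPBOARD)
    if policy.allow_send_to_ai:
        actions.append(Action.SEND_TO_AI)
    # With no permitted destination, the transcript is dropped.
    return actions or [Action.DISCARD]
```

With the defaults, even a confirmed capture is discarded; a user has to enable each destination deliberately.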

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The AGENTS.md guidance recommends updating connection metrics and recording voice interaction frequency to habits.json, which introduces behavioral profiling unrelated to core speech bridging. Even if local-only, this creates unnecessary collection of user interaction metadata and can reveal personal usage patterns over time.
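
If interaction tracking is kept at all, it could be made strictly opt-in. The following is a minimal sketch under stated assumptions: the `voice_interactions` key and the `record_interaction` helper are hypothetical, and the real structure of habits.json is not described in this report.

```python
import json
from pathlib import Path


def record_interaction(path: Path, enabled: bool) -> bool:
    """Increment a usage counter only when the user has opted in.

    Returns True if the file was written, False if tracking is disabled.
    """
    if not enabled:
        return False  # never touch the file when tracking is off
    data = json.loads(path.read_text()) if path.exists() else {}
    # "voice_interactions" is a hypothetical key name used for illustration.
    data["voice_interactions"] = int(data.get("voice_interactions", 0)) + 1
    path.write_text(json.dumps(data, indent=2))
    return True
```

When `enabled` is false the function neither creates nor reads the file, so disabling tracking leaves no metadata behind.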

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill strongly promotes public sharing through Cloudflare Tunnel, Ngrok, and LocalTunnel, but does not foreground the privacy and exposure risks of making generated voice files and a local service reachable from the internet. Users may unknowingly publish sensitive audio, transcripts, or predictable file URLs to unauthenticated endpoints.
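
Two common mitigations for the exposure described here are unguessable file names and a shared-secret check on each request, with the server bound to the loopback interface by default. The sketch below uses only the Python standard library; the helper names, the `mp3` extension, and the port are assumptions for illustration, not details taken from the skill.

```python
import hmac
import secrets

# Binding to loopback keeps the service off the network unless a tunnel is
# started deliberately. The port number is a hypothetical example.
LOCAL_BIND = ("127.0.0.1", 8080)


def new_audio_name(ext: str = "mp3") -> str:
    """Return a file name with 128 bits of randomness instead of a predictable one."""
    return f"{secrets.token_urlsafe(16)}.{ext}"


def token_ok(presented: str, expected: str) -> bool:
    """Compare access tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Random names only make URLs hard to guess; they do not replace authentication, which is why the token check is shown alongside them.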

Missing User Warnings

Medium

Confidence: 94%
Finding: The automatic record-transcribe-send workflow does not warn users that captured speech may be copied to the clipboard or forwarded to an AI system, which can disclose private spoken information unexpectedly. Because this is triggered by a hotkey and release action, users may not realize when transmission or persistence occurs.

VirusTotal

66/66 vendors flagged this skill as clean.
