Kesha Voice Kit

Security checks across malware telemetry and agentic risk

Overview

This skill is a coherent local voice transcription and speech tool, with disclosed privacy implications when transcripts are echoed into OpenClaw chat.

Install this only if you want a local voice toolkit and are comfortable with a global Bun package plus downloaded model assets. If voice notes may contain sensitive speech, consider changing OpenClaw's transcript echo setting before use, because the documented setup copies recognized speech into chat history/context.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (2)

Vague Triggers

Medium
Confidence: 86%
Finding
The trigger keyword list includes broad, common terms such as "say," "privacy," and generic audio/file references, which can cause the skill to be invoked outside its intended context. In an agent setting, unintended invocation can route unrelated user content or files into transcription/TTS workflows, increasing the chance of privacy exposure, incorrect actions, or unnecessary command execution.
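One way to audit a trigger list for this kind of breadth is to flag any trigger that consists of a single everyday word. The sketch below is illustrative only: the `COMMON_WORDS` set and the example trigger list are hypothetical and are not taken from the skill's actual manifest.

```python
# Hypothetical set of everyday words that are too generic to serve as
# triggers on their own. This list is an assumption for illustration.
COMMON_WORDS = {"say", "privacy", "audio", "file", "voice", "listen"}


def flag_broad_triggers(triggers, common_words=COMMON_WORDS):
    """Return the triggers that are a single generic word (case-insensitive)."""
    flagged = []
    for trigger in triggers:
        words = trigger.lower().split()
        # Multi-word phrases are treated as specific enough and are not flagged.
        if len(words) == 1 and words[0] in common_words:
            flagged.append(trigger)
    return flagged


# Hypothetical trigger list, not the skill's real one.
example_triggers = ["say", "privacy", "transcribe voice note", "kesha"]
print(flag_broad_triggers(example_triggers))
```

A check like this only catches single-word triggers. Short generic phrases would need a stricter rule, such as requiring the skill name or an explicit command prefix.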

Ssd 3

Medium
Confidence: 95%
Finding
The documented OpenClaw configuration enables `echoTranscript: true`, which reflects transcribed audio content back into chat by default. Because transcripts may contain sensitive user speech, secrets, or third-party audio content, this creates a direct confidentiality risk and can also surface prompt-injection content extracted from audio into the conversational context.
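A minimal sketch of the mitigation, assuming the OpenClaw configuration has already been parsed into a Python dictionary: walk the structure and set every `echoTranscript` key to `False`. Only the key name `echoTranscript` comes from the finding above. The nesting in the example (`tools`, `media`, `audio`) is hypothetical, and the real configuration layout and file location should be checked against the OpenClaw documentation.

```python
import copy


def disable_transcript_echo(config):
    """Return a deep copy of config with every echoTranscript key set to False."""
    result = copy.deepcopy(config)

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "echoTranscript":
                    # Overwriting an existing key's value is safe while iterating.
                    node[key] = False
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(result)
    return result


# Hypothetical config shape; only the echoTranscript key is from the report.
example_config = {"tools": {"media": {"audio": {"enabled": True, "echoTranscript": True}}}}
patched = disable_transcript_echo(example_config)
print(patched["tools"]["media"]["audio"]["echoTranscript"])
```

Working on a copy leaves the original configuration untouched, so the change can be reviewed before anything is written back to disk.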

VirusTotal

64/64 vendors flagged this skill as clean.
