Xiaozhi Claw

Security checks across malware telemetry and agentic risk

Overview

This voice skill appears purpose-aligned, but it should be reviewed because live audio and text may be sent to a third-party cloud provider without a clear privacy or consent notice.

Install only if you are comfortable with microphone audio, transcripts, and generated speech content being processed by Volcengine Doubao or the configured provider. Before production use, require a clear privacy notice, explicit user/admin opt-in, credential handling guidance, and a way to disable or replace remote processing if sensitive conversations may be captured.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep

Findings (4)

Missing User Warnings

Medium

Confidence: 89%
Finding: The README promotes real-time voice interaction and Volcengine Doubao STT/TTS but does not clearly disclose that user audio and derived transcripts may be sent to a third-party cloud provider. In a voice-device integration, this omission can lead operators to deploy the skill without informed consent, privacy notice updates, or appropriate data-handling controls, increasing privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill processes microphone audio and transcribed text through an external provider (Volcengine Doubao), but the description does not present this as a clear privacy warning to users. Highly sensitive voice content may therefore be transmitted off-device and retained or processed by a third party without informed consent, creating privacy, compliance, and data-handling risks.

Missing User Warnings

Medium

Confidence: 93%
Finding: The service sends user-provided text and audio to an external third-party API, which is a real privacy and data-governance concern because speech data may contain sensitive content. In this skill's context, cloud STT/TTS is expected functionality, but the file provides no disclosure, consent flow, or indication of what data leaves the device, making the behavior riskier for end users.

Missing User Warnings

Medium

Confidence: 91%
Finding: The code sends user-provided text to an external TTS service and recorded device audio to an external STT service, but this file shows no consent, notice, or configuration gate before exporting potentially sensitive voice content off-device. In a voice-assistant integration, audio may contain private conversations, credentials, or other personal data, so undisclosed third-party transmission creates a real privacy and compliance risk even if the behavior is functionally intended.
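The kind of configuration gate this finding describes as missing could look roughly like the sketch below. It is illustrative only and is not taken from the skill's code: PrivacyConfig, guarded_send, RemoteProcessingBlocked, and the send callable are all hypothetical names, and the two flags stand in for whatever opt-in mechanism an operator actually adopts.

```python
from dataclasses import dataclass
from typing import Callable


class RemoteProcessingBlocked(Exception):
    """Raised when audio or text would leave the device without opt-in."""


@dataclass(frozen=True)
class PrivacyConfig:
    # Hypothetical settings; not part of the skill's real configuration.
    remote_processing_enabled: bool = False  # operator/admin switch, off by default
    user_consented: bool = False             # explicit per-user opt-in
    provider_name: str = "Volcengine Doubao"


def guarded_send(config: PrivacyConfig, payload: bytes,
                 send: Callable[[bytes], bytes]) -> bytes:
    """Forward payload to the remote provider only if both gates are open.

    `send` is a placeholder for the real STT/TTS client call.
    """
    if not config.remote_processing_enabled:
        raise RemoteProcessingBlocked(
            "remote processing is disabled by the operator")
    if not config.user_consented:
        raise RemoteProcessingBlocked(
            f"user has not consented to sending data to {config.provider_name}")
    return send(payload)
```

With both flags defaulting to False, nothing is transmitted until an operator enables remote processing and the user opts in; the check happens before the payload is handed to the provider client.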

VirusTotal

63/63 vendors flagged this skill as clean.
