Xiaozhi Claw

Security checks across malware telemetry and agentic risk

Overview

This voice skill appears purpose-aligned, but it should be reviewed because live audio and text may be sent to a third-party cloud provider without a clear privacy or consent notice.

Install only if you are comfortable with microphone audio, transcripts, and generated speech content being processed by Volcengine Doubao or the configured provider. Before production use, require a clear privacy notice, explicit user/admin opt-in, credential handling guidance, and a way to disable or replace remote processing if sensitive conversations may be captured.
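One way to implement the opt-in and disable path recommended above is a small gate that runs before any audio leaves the device. The following is a minimal sketch, not the skill's actual code: `PrivacySettings`, `remote_allowed`, `transcribe`, and the `remote_stt` / `local_stt` callables are hypothetical names introduced for illustration, and the skill's real configuration keys and provider client are not shown in this report.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class RemoteProcessingBlocked(Exception):
    """Raised when audio would leave the device without the required opt-in."""


@dataclass(frozen=True)
class PrivacySettings:
    # Hypothetical settings; every field defaults to the most private value.
    remote_processing_enabled: bool = False        # admin-level switch
    user_opted_in: bool = False                    # explicit user consent
    notice_version_accepted: Optional[str] = None  # privacy notice the user saw


def remote_allowed(settings: PrivacySettings, current_notice: str) -> bool:
    """Allow remote processing only if the admin enabled it, the user opted in,
    and the accepted privacy notice matches the current version."""
    return (
        settings.remote_processing_enabled
        and settings.user_opted_in
        and settings.notice_version_accepted == current_notice
    )


def transcribe(
    audio: bytes,
    settings: PrivacySettings,
    current_notice: str,
    remote_stt: Callable[[bytes], str],
    local_stt: Optional[Callable[[bytes], str]] = None,
) -> str:
    """Route audio to the remote provider only when the gate passes.

    Falls back to an on-device engine if one is supplied; otherwise refuses
    instead of silently sending the audio off-device.
    """
    if remote_allowed(settings, current_notice):
        return remote_stt(audio)
    if local_stt is not None:
        return local_stt(audio)
    raise RemoteProcessingBlocked(
        "Remote speech processing is disabled or has not been consented to."
    )
```

In this sketch a changed privacy notice invalidates earlier consent, because the accepted version no longer matches. The same gate could wrap the text-to-speech call, since generated speech text is also sent to the provider.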

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Findings (4)

Missing User Warnings

Severity: Medium
Confidence: 89%
Finding
The README promotes real-time voice interaction and Volcengine Doubao STT/TTS but does not clearly disclose that user audio and derived transcripts may be sent to a third-party cloud provider. In a voice-device integration, this omission can lead operators to deploy the skill without informed consent, privacy notice updates, or appropriate data-handling controls, increasing privacy and compliance risk.

Missing User Warnings

Severity: Medium
Confidence: 95%
Finding
The skill processes microphone audio and transcribed text through an external provider (Volcengine Doubao), but the description does not present this as a clear privacy warning to users. This is dangerous because highly sensitive voice content may be transmitted off-device and retained or processed by a third party without informed consent, creating privacy, compliance, and data-handling risks.

Missing User Warnings

Severity: Medium
Confidence: 93%
Finding
The service sends user-provided text and audio to an external third-party API, which is a real privacy and data-governance concern because speech data may contain sensitive content. In this skill's context, cloud STT/TTS is expected functionality, but the file provides no disclosure, consent flow, or indication of what data leaves the device, making the behavior riskier for end users.

Missing User Warnings

Severity: Medium
Confidence: 91%
Finding
The code sends user-provided text to an external TTS service and recorded device audio to an external STT service, but this file shows no consent, notice, or configuration gate before exporting potentially sensitive voice content off-device. In a voice-assistant integration, audio may contain private conversations, credentials, or other personal data, so undisclosed third-party transmission creates a real privacy and compliance risk even if the behavior is functionally intended.

VirusTotal

63/63 vendors reported this skill as clean.
