Xiaozhi Claw

Warn

Audited by ClawScan on May 10, 2026.

Overview

The skill mostly matches its stated XiaoZhi voice-bridge purpose, but it opens an unauthenticated WebSocket channel that can send messages directly to your OpenClaw agent.

Only install this if you understand that it starts a WebSocket server for hardware voice input. Use it on a trusted network, do not expose port 8080 broadly, protect your Doubao credentials, and prefer a version that adds authenticated pairing, rate limits, and clear credential/capability declarations.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Anyone who can reach the listening port could send prompts or commands to the user's agent and potentially use whatever capabilities that agent has.

Why it was flagged

Network WebSocket messages are converted directly into OpenClaw agent messages, and the provided handler does not show an authentication, pairing, or approval check before calling the agent.

Skill content

wss = new WebSocketServer({ port }); ... await handleXiaozhiMessage(deviceId, message, ctx); ... ctx.agent.processMessage({ from: deviceId, text: userText, channel: "xiaozhi" })

Recommendation

Require a pairing secret or authenticated device token before forwarding messages to the agent, bind to localhost or a trusted interface by default, and document the exposure clearly.
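
As an illustration of the pairing-secret recommendation, the following is a minimal sketch, not the skill's actual code. It assumes the device sends the secret in a `token` query parameter on the WebSocket URL; the parameter name and the helpers `constantTimeEqual` and `isPairedRequest` are hypothetical. A secret carried in a URL can end up in logs, so a request header may be a better carrier where the hardware supports it.

```typescript
// Hypothetical helper, not part of the reviewed skill.
// Compares two strings without returning early on the first mismatch,
// so timing does not reveal how many leading characters matched.
function constantTimeEqual(a: string, b: string): boolean {
  const length = Math.max(a.length, b.length);
  let diff = a.length ^ b.length;
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` maps that to 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// Returns true only when the upgrade URL carries the expected pairing secret
// in a `token` query parameter (the parameter name is an assumption).
function isPairedRequest(requestUrl: string | undefined, pairingSecret: string): boolean {
  // An empty configured secret must never authorize anyone.
  if (!requestUrl || pairingSecret.length === 0) return false;
  let token: string | null;
  try {
    // req.url is path-only, so a placeholder base is needed to parse it.
    token = new URL(requestUrl, "ws://localhost").searchParams.get("token");
  } catch {
    return false;
  }
  return token !== null && constantTimeEqual(token, pairingSecret);
}
```

In a connection handler, a check like this would run before the client is registered or any message reaches `ctx.agent.processMessage`, with rejected sockets closed using status 1008 (policy violation). Binding is a separate step: the `ws` server accepts a `host` option, so passing `host: "127.0.0.1"` alongside `port` keeps the listener off external interfaces by default.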

What this means

A nearby or otherwise reachable client may impersonate a XiaoZhi device, inject voice/text input, or receive responses intended for the hardware.

Why it was flagged

The device identity is derived from the request URL and then trusted as the session identity; the code shown does not verify device identity, origin, or transport security.

Skill content

const deviceId = req.url?.split("?")[0].slice(1) || "unknown"; ... clients.set(deviceId, { ws, audioStream, audioBuffer: [], isListening: false, doubaoService })

Recommendation

Verify device identity with a shared secret or signed pairing flow, reject unknown origins/paths, and prefer encrypted transport where feasible.
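
One way to stop trusting the URL path as an identity is to accept only well-formed IDs that were paired in advance. The sketch below assumes a simple ID format and an in-memory set of paired devices; `DEVICE_ID_PATTERN` and `resolveDeviceId` are hypothetical names, and a real deployment would combine this with a secret or signed token, since an ID alone is guessable.

```typescript
// Hypothetical format rule: 1 to 64 characters of letters, digits, "_" or "-".
const DEVICE_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Extracts the device ID from a path such as "/device-1?x=y" and returns it
// only if it is well formed and already paired; otherwise returns null.
// Unlike the reviewed excerpt, there is no fallback to a shared "unknown" ID.
function resolveDeviceId(
  requestUrl: string | undefined,
  pairedDevices: ReadonlySet<string>
): string | null {
  if (!requestUrl) return null;
  // Same extraction as the excerpt: drop the query string and the leading "/".
  const candidate = requestUrl.split("?")[0].slice(1);
  if (!DEVICE_ID_PATTERN.test(candidate)) return null;
  return pairedDevices.has(candidate) ? candidate : null;
}
```

A `null` result would mean the connection is refused rather than stored in `clients`, which also prevents two unidentified clients from colliding on the same `"unknown"` key.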

What this means

A bad or malfunctioning client could consume memory, keep the service busy, or drive unexpected third-party API usage and cost.

Why it was flagged

Audio frames from a connected client are buffered with no visible limit on size, duration, client count, or rate, and can then trigger downstream STT processing.

Skill content

session.audioBuffer.push(data); ... const fullAudio = Buffer.concat(session.audioBuffer); ... userText = await session.doubaoService.speechToText(wavData, AUDIO_CONFIG.sampleRate)

Recommendation

Add maximum payload sizes, recording-duration limits, client limits, rate limits, timeouts, and explicit error handling for oversized sessions.
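
A per-session byte budget is one way to bound the buffering described above. The sketch below is illustrative only: the limits assume 16 kHz, 16-bit mono PCM and a 30-second cap, none of which are confirmed by the excerpt (the real `AUDIO_CONFIG` values are not shown, and Opus-encoded frames would need a different budget). `BoundedAudioBuffer` is a hypothetical replacement for the plain `audioBuffer` array.

```typescript
// Assumed limits for illustration; the skill's real AUDIO_CONFIG is not shown.
const ASSUMED_SAMPLE_RATE_HZ = 16000;
const ASSUMED_BYTES_PER_SAMPLE = 2; // 16-bit PCM, an assumption
const MAX_RECORDING_SECONDS = 30;
const MAX_AUDIO_BYTES =
  ASSUMED_SAMPLE_RATE_HZ * ASSUMED_BYTES_PER_SAMPLE * MAX_RECORDING_SECONDS;

class BoundedAudioBuffer {
  private chunks: Uint8Array[] = [];
  private totalBytes = 0;
  private readonly maxBytes: number;

  constructor(maxBytes: number = MAX_AUDIO_BYTES) {
    this.maxBytes = maxBytes;
  }

  // Stores the frame and returns true, or returns false without storing it
  // when the frame would push the session past its byte budget.
  push(frame: Uint8Array): boolean {
    if (this.totalBytes + frame.byteLength > this.maxBytes) return false;
    this.chunks.push(frame);
    this.totalBytes += frame.byteLength;
    return true;
  }

  // Number of bytes currently held.
  get size(): number {
    return this.totalBytes;
  }

  // Concatenates the stored frames into one array and resets the buffer.
  drain(): Uint8Array {
    const out = new Uint8Array(this.totalBytes);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.byteLength;
    }
    this.chunks = [];
    this.totalBytes = 0;
    return out;
  }
}
```

When `push` returns `false`, the handler can stop listening and report an oversized session instead of calling `speechToText`. This complements, rather than replaces, the `maxPayload` option of the `ws` server, which caps the size of each individual message.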

What this means

The Doubao token may incur provider usage and billing, especially if the WebSocket endpoint is exposed to untrusted clients.

Why it was flagged

The skill uses a Doubao API token from the environment to perform STT/TTS. This is expected for the advertised provider integration, but the token carries sensitive account authority.

Skill content

appId: process.env.DOUBAO_APP_ID || '', accessToken: process.env.DOUBAO_ACCESS_TOKEN || '' ... 'Authorization': `Bearer ${this.config.accessToken}`

Recommendation

Use a least-privileged token if available, set provider quotas, protect the .env file, and ensure the registry metadata declares the required credentials.
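
Part of protecting the credentials is making their absence and their handling explicit. The sketch below is a hypothetical startup check, not the skill's code: `loadDoubaoCredentials` fails fast when either variable named in the excerpt is missing instead of falling back to an empty string, and `redact` masks the token for logging. It takes the environment as a parameter so it can be exercised without touching `process.env`.

```typescript
interface DoubaoCredentials {
  appId: string;
  accessToken: string;
}

// Reads the two variables named in the excerpt and throws if either is
// missing or blank, instead of silently defaulting to an empty string.
function loadDoubaoCredentials(
  env: Record<string, string | undefined>
): DoubaoCredentials {
  const appId = (env["DOUBAO_APP_ID"] ?? "").trim();
  const accessToken = (env["DOUBAO_ACCESS_TOKEN"] ?? "").trim();
  const missing: string[] = [];
  if (appId === "") missing.push("DOUBAO_APP_ID");
  if (accessToken === "") missing.push("DOUBAO_ACCESS_TOKEN");
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return { appId, accessToken };
}

// Masks a secret for logging so the full token never reaches log output.
function redact(secret: string): string {
  return secret.length <= 4
    ? "****"
    : `${secret.slice(0, 2)}****${secret.slice(-2)}`;
}
```

In the plugin this would be called once at startup with `process.env`, before the WebSocket server begins accepting connections.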

What this means

A future install could resolve dependency versions that differ from the reviewed environment.

Why it was flagged

The plugin depends on external npm packages with range-based versions. These dependencies are purpose-aligned, but the provided artifacts do not include a lockfile.

Skill content

"dependencies": { "dotenv": "^16.3.1", "ws": "^8.14.2", "opusscript": "^0.1.1" }

Recommendation

Install from a trusted source and prefer a reviewed lockfile or pinned dependency versions for deployment.
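
To show what "range-based" means in practice, here is a small hypothetical check that lists dependencies whose version spec is not an exact `major.minor.patch`. It is a simplification: `findUnpinned` is an illustrative name, and the pattern treats anything else, including prerelease tags, as unpinned.

```typescript
// Returns the names of dependencies whose version spec is not an exact
// "major.minor.patch" version (for example "^8.14.2" or "~1.2.3").
function findUnpinned(dependencies: Record<string, string>): string[] {
  const exactVersion = /^\d+\.\d+\.\d+$/;
  return Object.keys(dependencies).filter(
    (name) => !exactVersion.test(dependencies[name])
  );
}

// The dependency block quoted in the finding above.
const reviewedDependencies = {
  dotenv: "^16.3.1",
  ws: "^8.14.2",
  opusscript: "^0.1.1",
};

// All three entries use a caret range, so all three names are reported.
console.log(findUnpinned(reviewedDependencies));
```

Pinning exact versions narrows direct dependencies only. A committed lockfile installed with `npm ci` also fixes transitive versions, because `npm ci` installs exactly what the lockfile records and fails if it disagrees with `package.json`.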