Xiaozhi Claw
Warn
Audited by ClawScan on May 10, 2026.
Overview
The skill mostly matches its stated XiaoZhi voice-bridge purpose, but it opens an unauthenticated WebSocket channel that can send messages directly to your OpenClaw agent.
Only install this if you understand that it starts a WebSocket server for hardware voice input. Use it on a trusted network, do not expose port 8080 broadly, protect your Doubao credentials, and prefer a version that adds authenticated pairing, rate limits, and clear credential/capability declarations.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Anyone who can reach the listening port could send prompts or commands to the user's agent and potentially use whatever capabilities that agent has.
Network WebSocket messages are converted directly into OpenClaw agent messages, and the provided handler does not show an authentication, pairing, or approval check before calling the agent.
wss = new WebSocketServer({ port }); ... await handleXiaozhiMessage(deviceId, message, ctx); ... ctx.agent.processMessage({ from: deviceId, text: userText, channel: "xiaozhi" })
Require a pairing secret or authenticated device token before forwarding messages to the agent, bind to localhost or a trusted interface by default, and document the exposure clearly.
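As a rough illustration of the pairing-secret check recommended here, the sketch below gates a connection on a bearer token before anything reaches the agent. The function names `secretsMatch` and `isAuthorized`, and the choice of the `Authorization` header as the carrier, are assumptions made for this example and do not come from the skill's code.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: constant-time comparison of two secrets.
// Both values are hashed first so timingSafeEqual always receives
// buffers of equal length.
function secretsMatch(supplied, expected) {
  if (typeof supplied !== "string" || typeof expected !== "string") return false;
  if (expected.length === 0) return false; // never accept an unset secret
  const a = crypto.createHash("sha256").update(supplied).digest();
  const b = crypto.createHash("sha256").update(expected).digest();
  return crypto.timingSafeEqual(a, b);
}

// Hypothetical gate: true only when the upgrade request carries
// "Authorization: Bearer <pairing secret>" with the expected secret.
function isAuthorized(headers, expectedSecret) {
  const header = headers["authorization"] || "";
  const match = /^Bearer (.+)$/.exec(header);
  return match !== null && secretsMatch(match[1], expectedSecret);
}
```

In a connection handler, a failed `isAuthorized(req.headers, secret)` check would typically close the socket (for example with WebSocket close code 1008, policy violation) before `handleXiaozhiMessage` is ever called. The `ws` server constructor also accepts a `host` option, so passing `host: "127.0.0.1"` alongside `port` is one way to bind to localhost by default.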
A nearby or otherwise reachable client may impersonate a XiaoZhi device, inject voice/text input, or receive responses intended for the hardware.
The device identity is derived from the request URL and then trusted as the session identity; the code shown does not verify device identity, origin, or transport security.
const deviceId = req.url?.split("?")[0].slice(1) || "unknown"; ... clients.set(deviceId, { ws, audioStream, audioBuffer: [], isListening: false, doubaoService })
Verify device identity with a shared secret or signed pairing flow, reject unknown origins/paths, and prefer encrypted transport where feasible.
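One possible shape for a signed pairing flow is sketched below: the device ID is still read from the URL path, but it is accepted only when a `token` query parameter carries a valid HMAC of that ID. The names `signDeviceId`, `verifyDeviceFromUrl`, and `DEVICE_ID_PATTERN`, the `token` parameter, and the ID format are all hypothetical; the skill defines none of them.

```javascript
const crypto = require("node:crypto");

// Assumed ID format for this sketch: short, URL-safe identifiers only.
const DEVICE_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Hypothetical: token issued to a device at pairing time.
function signDeviceId(deviceId, pairingKey) {
  return crypto.createHmac("sha256", pairingKey).update(deviceId).digest("hex");
}

// Returns the verified device ID, or null when the path or token is invalid.
function verifyDeviceFromUrl(requestUrl, pairingKey) {
  let parsed;
  try {
    // req.url is path-only, so a placeholder base is needed for parsing.
    parsed = new URL(requestUrl, "ws://placeholder.invalid");
  } catch {
    return null;
  }
  const deviceId = parsed.pathname.slice(1);
  if (!DEVICE_ID_PATTERN.test(deviceId)) return null;

  const supplied = Buffer.from(parsed.searchParams.get("token") || "", "hex");
  const expected = Buffer.from(signDeviceId(deviceId, pairingKey), "hex");
  if (supplied.length !== expected.length) return null;
  return crypto.timingSafeEqual(supplied, expected) ? deviceId : null;
}
```

Unlike the fallback to `"unknown"` in the quoted code, this sketch returns `null` for any request it cannot verify, so the caller can reject the connection instead of registering a session. A token in the query string is readable on an unencrypted link, which is one reason the recommendation also asks for encrypted transport.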
A bad or malfunctioning client could consume memory, keep the service busy, or drive unexpected third-party API usage and cost.
Audio frames from a connected client are buffered without a visible size, time, client-count, or rate limit, then can trigger downstream STT processing.
session.audioBuffer.push(data); ... const fullAudio = Buffer.concat(session.audioBuffer); ... userText = await session.doubaoService.speechToText(wavData, AUDIO_CONFIG.sampleRate)
Add maximum payload sizes, recording-duration limits, client limits, rate limits, timeouts, and explicit error handling for oversized sessions.
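A minimal sketch of the size and duration limits, assuming a drop-in replacement for the plain `audioBuffer` array. `BoundedAudioBuffer` and its option names are hypothetical, and the limits themselves would need tuning for real recordings.

```javascript
// Hypothetical bounded replacement for the unbounded audioBuffer array.
class BoundedAudioBuffer {
  constructor({ maxBytes, maxDurationMs, now = Date.now }) {
    this.maxBytes = maxBytes;
    this.maxDurationMs = maxDurationMs;
    this.now = now; // injectable clock, mainly for testing
    this.chunks = [];
    this.totalBytes = 0;
    this.startedAt = null;
  }

  // Appends one audio frame, or throws RangeError once a limit is exceeded.
  push(chunk) {
    if (this.startedAt === null) this.startedAt = this.now();
    if (this.now() - this.startedAt > this.maxDurationMs) {
      throw new RangeError("recording exceeds duration limit");
    }
    if (this.totalBytes + chunk.length > this.maxBytes) {
      throw new RangeError("recording exceeds size limit");
    }
    this.chunks.push(chunk);
    this.totalBytes += chunk.length;
  }

  // Returns the concatenated audio and resets the buffer for the next utterance.
  drain() {
    const full = Buffer.concat(this.chunks, this.totalBytes);
    this.chunks = [];
    this.totalBytes = 0;
    this.startedAt = null;
    return full;
  }
}
```

A handler could catch the `RangeError`, discard the session's audio, and close the connection rather than passing an oversized recording to `speechToText`. The `ws` server's `maxPayload` option caps the size of each individual message, which complements a per-session limit like this one.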
The Doubao token may incur provider usage and billing, especially if the WebSocket endpoint is exposed to untrusted clients.
The skill uses a Doubao API token from the environment to perform STT/TTS. This is expected for the advertised provider integration, but it is sensitive account authority.
appId: process.env.DOUBAO_APP_ID || '', accessToken: process.env.DOUBAO_ACCESS_TOKEN || '' ... 'Authorization': `Bearer ${this.config.accessToken}`
Use a least-privileged token if available, set provider quotas, protect the .env file, and ensure the registry metadata declares the required credentials.
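The quoted code falls back to empty strings when the variables are unset, which would let the service start and send an empty bearer token. A small sketch of a stricter loader follows; `loadDoubaoConfig` is a hypothetical name, and only the two environment variable names come from the skill.

```javascript
// Hypothetical loader: fails fast instead of defaulting credentials to ''.
function loadDoubaoConfig(env = process.env) {
  const required = ["DOUBAO_APP_ID", "DOUBAO_ACCESS_TOKEN"];
  const missing = required.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
  if (missing.length > 0) {
    // Report variable names only; never echo credential values.
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return { appId: env.DOUBAO_APP_ID, accessToken: env.DOUBAO_ACCESS_TOKEN };
}
```

Calling this once at startup surfaces a misconfigured `.env` file immediately, and the explicit list of required names doubles as documentation of the credentials the skill needs.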
A future install could resolve dependency versions that differ from the reviewed environment.
The plugin depends on external npm packages with range-based versions. These dependencies are purpose-aligned, but the provided artifacts do not include a lockfile.
"dependencies": { "dotenv": "^16.3.1", "ws": "^8.14.2", "opusscript": "^0.1.1" }
Install from a trusted source and prefer a reviewed lockfile or pinned dependency versions for deployment.

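For the dependency finding, a pre-deployment check along these lines could flag range-based specifiers. `findUnpinned` is a hypothetical helper, and its exact-version test is deliberately simplified: it accepts only plain `MAJOR.MINOR.PATCH` strings and treats everything else, including pre-release tags, as unpinned.

```javascript
// Hypothetical check: lists dependencies whose version is not an exact
// MAJOR.MINOR.PATCH string (for example "^8.14.2" or "~1.2.3").
function findUnpinned(dependencies) {
  const exactVersion = /^\d+\.\d+\.\d+$/;
  return Object.entries(dependencies)
    .filter(([, spec]) => !exactVersion.test(spec))
    .map(([name]) => name);
}

// The dependency block quoted in this finding.
const reviewed = { dotenv: "^16.3.1", ws: "^8.14.2", opusscript: "^0.1.1" };
console.log(findUnpinned(reviewed));
```

Pinning alone does not fix transitive dependencies. Committing a lockfile and installing with `npm ci`, which installs exactly what the lockfile records, covers those as well.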