Skill v0.1.0

ClawScan security

Voice Listener · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Benign · Mar 5, 2026, 12:24 PM
Verdict
benign
Confidence
high
Model
gpt-5-mini
Summary
The package implements exactly what it claims: a local, Baidu-based continuous voice listener with a wake word that pastes recognized text at the active cursor position. No unrelated credentials, unexpected remote endpoints, or incoherent behavior were found.
Guidance
This skill appears to do exactly what it claims: listen to your microphone, call Baidu's speech APIs using credentials you provide in baidu_config.json, and paste recognized text wherever the cursor is. Before installing or running it:
1) Only provide your Baidu APP_ID/API_KEY/SECRET_KEY if you trust the code, and preferably run it locally in a controlled environment.
2) Be aware that it simulates Ctrl+V keystrokes. Do not run it while sensitive forms or password fields are focused.
3) The repository lists dependencies (sounddevice, numpy, keyboard, pyperclip, requests) but does not auto-install them. Install them from trusted sources.
4) The script writes temporary WAV files (deleted after use) and sends audio to Baidu's official endpoints (token and vop.server_api).
5) If you need higher assurance, review the full voice_input_baidu_smart.py file yourself and run it in a sandboxed account or VM first.

Review Dimensions

Purpose & Capability
ok
The skill's name and description (Baidu speech recognition, wake word, auto-input) match the provided code and docs. The code reads a local baidu_config.json (APP_ID/API_KEY/SECRET_KEY) and calls Baidu's token and speech endpoints. It uses sounddevice for audio capture, keyboard for keystroke simulation, and pyperclip for clipboard access, all of which are appropriate for the stated purpose. One minor oddity: a WORKSPACE path is defined (C:\Users\11666\.openclaw\workspace), but it is not used for network access or credential handling. It appears to be an environment artifact from the author rather than required functionality.
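To make the two network calls concrete, here is a minimal sketch of the request data a Baidu REST recognizer typically builds. It is not taken from the skill's code: the URLs and field names follow Baidu's public short-speech REST documentation as best recalled and should be verified there, and the function names and the cuid value are hypothetical. Nothing is sent over the network.

```python
import base64

# Assumed endpoints, based on Baidu's public REST docs; verify before use.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
ASR_URL = "https://vop.baidu.com/server_api"


def build_token_params(api_key, secret_key):
    """Query parameters for the OAuth token request (client-credentials grant)."""
    return {
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    }


def build_asr_body(pcm_bytes, token, cuid="voice-listener-demo", rate=16000):
    """JSON body for a recognition request carrying raw 16-bit mono PCM audio.

    `cuid` is an arbitrary client identifier; the default here is made up.
    """
    return {
        "format": "pcm",
        "rate": rate,
        "channel": 1,
        "cuid": cuid,
        "token": token,
        # Audio travels base64-encoded; `len` is the size of the raw bytes.
        "speech": base64.b64encode(pcm_bytes).decode("ascii"),
        "len": len(pcm_bytes),
    }
```

A reviewer can compare these two shapes against the skill's actual requests calls to confirm that only the credentials and the captured audio leave the machine.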
Instruction Scope
note
SKILL.md and the scripts only instruct the user to run local Python scripts or a .bat file and to edit baidu_config.json. At runtime, the skill continuously captures microphone audio and automatically pastes recognized text at the current cursor position. This is coherent with the skill's purpose but has obvious privacy and usability implications, since it will paste into whatever window has focus. The instructions do not attempt to read unrelated config or secret stores or to POST data to unexpected endpoints. Network calls are limited to Baidu's documented APIs.
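The wake-word gate and paste step described above can be sketched as follows. This illustrates the general pattern, not the skill's actual logic: the function names and the "hey claw" wake word are hypothetical. The clipboard and keystroke actions are passed in as callables so the gate can be exercised without touching the real clipboard or keyboard.

```python
import re


def text_after_wake_word(transcript, wake_word):
    """Return the text following the wake word, or None if it is absent."""
    match = re.search(re.escape(wake_word), transcript, flags=re.IGNORECASE)
    if match is None:
        return None
    # Trim separators that recognizers often emit right after the wake word.
    remainder = transcript[match.end():].strip(" ,.!?，。！？")
    return remainder or None


def handle_transcript(transcript, wake_word, copy, send_keys):
    """Paste only when the wake word was heard; report whether a paste happened."""
    text = text_after_wake_word(transcript, wake_word)
    if text is None:
        return False
    copy(text)            # in real use this could be pyperclip.copy
    send_keys("ctrl+v")   # in real use this could be keyboard.send
    return True


if __name__ == "__main__":
    # Dry run with print in place of the clipboard and keyboard actions.
    handle_transcript("Hey Claw, open the file", "hey claw", print, print)
```

Because every paste passes through a single function, this is also the natural place to add a guard, such as a pause hotkey, for the password-field risk noted above.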
Install Mechanism
ok
There is no automated install step and no downloads from arbitrary URLs; the repository consists of instructions and code files. Because there is no install spec, nothing is pulled silently during install. Users must manually install the dependencies (sounddevice, numpy, keyboard, pyperclip, requests). The skill's metadata does not declare them as requirements, although package.json and the READMEs list them as the tech stack.
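Since the dependencies are installed by hand, a quick pre-flight check can confirm they are present before the listener starts. The sketch below uses only the standard library. The module list is copied from the review above, and the helper name is hypothetical.

```python
import importlib.util

# Dependencies named in the review; adjust if the skill's docs differ.
REQUIRED_MODULES = ["sounddevice", "numpy", "keyboard", "pyperclip", "requests"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only locates the modules and does not import them, so it does not open the microphone or install keyboard hooks.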
Credentials
ok
The skill requests Baidu API credentials via a local baidu_config.json (APP_ID/API_KEY/SECRET_KEY), which is exactly what a Baidu REST-based recognizer needs. It does not request unrelated cloud keys, tokens, or system credentials. Note that the skill requires access to the microphone and the ability to synthesize keyboard events (via the keyboard module). Both are legitimate for its purpose but require the corresponding user privileges.
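A loader for such a config file might look like the sketch below. It assumes the JSON keys are literally APP_ID, API_KEY, and SECRET_KEY, as the review's wording suggests; the skill's real file layout may differ, and the function name is hypothetical. Failing early on a missing key avoids sending a malformed token request.

```python
import json
from pathlib import Path

# Assumed key names, inferred from the review text rather than the skill's code.
REQUIRED_KEYS = ("APP_ID", "API_KEY", "SECRET_KEY")


def load_baidu_config(path):
    """Read the JSON config and return only the three expected credentials."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("config must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if not str(data.get(key, "")).strip()]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    # Unrelated keys are dropped so nothing else is passed along by accident.
    return {key: str(data[key]) for key in REQUIRED_KEYS}
```

Returning only the expected keys keeps the credential surface explicit, which matches the review's finding that no unrelated secrets are requested.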
Persistence & Privilege
ok
The always setting is false, and the skill does not modify other skills or system-wide agent settings. It runs as a normal user process and creates temporary WAV files, which it deletes promptly. The skill runs continuously while active and simulates keystrokes. This is necessary for its function but increases the blast radius if misused (for example, if it is left running while passwords are being entered).
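The "temporary WAV files (deleted promptly)" behavior corresponds to a common pattern, sketched below with the standard library only. It shows one way to guarantee cleanup and is not the skill's actual implementation; the function name and the 16 kHz, 16-bit mono format are assumptions.

```python
import os
import tempfile
import wave


def with_temp_wav(pcm_bytes, handler, rate=16000):
    """Write 16-bit mono PCM to a temporary WAV, pass its path to `handler`,
    and delete the file afterwards even if the handler raises."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # reopen through the wave module below
    try:
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
            wav.setframerate(rate)
            wav.writeframes(pcm_bytes)
        return handler(path)
    finally:
        if os.path.exists(path):
            os.remove(path)
```

For example, `with_temp_wav(audio, os.path.getsize)` returns the size of the file on disk, and the file no longer exists once the call returns.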