Skill v0.2.3
ClawScan security
WebChat Voice Proxy · ClawHub's context-aware review of the artifact, metadata, and declared behavior.
Scanner verdict
Benign · Mar 20, 2026, 2:01 AM
- Verdict
- benign
- Confidence
- high
- Model
- gpt-5-mini
- Summary
- The skill's code, scripts, and runtime instructions are coherent with its stated purpose: it modifies the Control UI, installs a user-level HTTPS/WSS proxy and startup hook, and proxies audio to a local faster-whisper service; requested changes and accesses are proportionate to that purpose.
- Guidance
- What to consider before installing:
  - This skill makes persistent, local changes: it copies a JS asset into your Control UI, injects a <script> tag into index.html, appends allowedOrigins to ~/.openclaw/openclaw.json, installs a user-level systemd service, and installs a gateway startup hook that re-injects the script after updates. Review these exact changes (deploy.sh, inject.sh, uninstall.sh) before running deploy.sh.
  - By default the proxy binds to 127.0.0.1 (local-only). Only set VOICE_HOST to a LAN IP if you intentionally want other devices on the network to reach the proxy; doing so exposes the gateway WebSocket and transcription endpoint to the LAN. If exposed, the proxy enforces a Bearer token matched to the gateway token (read from your openclaw.json), but you must ensure the LAN is trusted and consider rate-limiting/monitoring.
  - The skill reads your gateway auth token from ~/.openclaw/openclaw.json, and the client reads it from Control UI localStorage to authenticate /transcribe calls. This is necessary for proper authorization but means the skill accesses a local sensitive value; back up your config if you are concerned.
  - The proxy generates a self-signed cert by default (you'll see browser warnings). You can supply VOICE_CERT/VOICE_KEY to use your own certs.
  - The skill requires a local faster-whisper transcription service (http://127.0.0.1:18790/transcribe) to be running before deploying; the scripts validate aiohttp and prefer a venv.
  - The package contains an uninstall.sh which attempts to remove the service, hook, injected script, and certs; test uninstall in a safe environment if you need guarantees.
  - Recommended steps: inspect deploy.sh, inject.sh, https-server.py and voice-input.js yourself; run deploy.sh with defaults (localhost) first; do not set VOICE_HOST to a LAN IP unless you understand the exposure; keep backups of ~/.openclaw/openclaw.json and your Control UI index.html before installing.
- Findings
- [3.2_transcribe_auth] expected: Audit noted medium risk originally because exposing /transcribe without auth would be dangerous; code and audit show /transcribe enforces Bearer token checking against the gateway token read from ~/.openclaw/openclaw.json when a gateway token exists. This auth is necessary and expected to protect the transcription endpoint when the proxy is exposed.
- [4.5_tls_key_permissions] expected: Audit found TLS key file permissions and reports deploy/https-server.py explicitly sets key file mode to 0600 after generation, which is appropriate for user-created certs.
- [7.1_input_validation_port] expected: VOICE_HTTPS_PORT and VOICE_HOST are validated both in deploy.sh and in https-server.py; this defends against command/shell injection and is expected for a script that interpolates these values into systemd and config files.
- [7.6_unvalidated_optional_envs] unexpected: VOICE_CERT, VOICE_KEY, VOICE_TRANSCRIBE_URL and VOICE_GATEWAY_WS are not validated by https-server.py. The audit considers this a low/open issue but acceptable because these are admin-set local values; if an attacker could set these env vars they would already have code execution. Admins should still review values when customizing.
- [hooks_execfile_and_inject] expected: The gateway hook runs inject.sh via execFileSync and the inject script copies voice-input.js and patches index.html. This behavior is required to make the UI persistent across updates; the audit confirms execFileSync is used safely (no shell interpolation) and sed uses hardcoded strings.
- [assets_browser_localtoken] expected: voice-input.js reads the gateway token from Control UI localStorage to attach Authorization headers to /transcribe; this is necessary to perform authenticated requests from the browser and is consistent with the proxy's auth design. It means the browser will send a locally-stored token; administrators should ensure localStorage usage is acceptable for their security posture.
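To make the 3.2_transcribe_auth finding concrete, here is a minimal sketch of a Bearer token check of the kind the audit describes. It is not taken from https-server.py: the function name `is_authorized`, its parameters, and the use of `hmac.compare_digest` for constant-time comparison are assumptions for illustration, and the skill's actual code may differ.

```python
import hmac
from typing import Optional


def is_authorized(auth_header: Optional[str], gateway_token: Optional[str]) -> bool:
    """Hypothetical check for a /transcribe request (illustrative only)."""
    # Per the finding, the token is enforced only when a gateway token exists.
    if not gateway_token:
        return True
    if not auth_header:
        return False
    # Expect a header of the form "Bearer <token>".
    scheme, _, presented = auth_header.partition(" ")
    presented = presented.strip()
    if scheme.lower() != "bearer" or not presented:
        return False
    # Assumption: compare in constant time to avoid leaking token contents
    # through timing differences.
    return hmac.compare_digest(presented.encode(), gateway_token.encode())
```

When reviewing the real handler, the point to confirm is that a missing or malformed Authorization header is rejected whenever a gateway token is configured.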
Review Dimensions
- Purpose & Capability
- ok: The skill claims to provide an HTTPS/WSS proxy, inject a mic UI, and forward audio to a local transcription service, and the included files (https-server.py, voice-input.js, deploy/uninstall scripts, hook) implement exactly those behaviors. Required modifications (index.html injection, copying voice-input.js, systemd user service, editing ~/.openclaw/openclaw.json allowedOrigins) are consistent with enabling a proxied Control UI and are expected for this feature.
- Instruction Scope
- note: The runtime instructions and scripts intentionally modify gateway config (append allowedOrigins), write into the npm-installed Control UI, install a user systemd service, and create a gateway startup hook that re-injects the UI script. They also read the gateway auth token from ~/.openclaw/openclaw.json (https-server.py) and the token from Control UI localStorage (voice-input.js) to authenticate /transcribe requests. This is necessary for the feature but means the skill touches and reads local gateway configuration and Control UI files; review these changes before running deploy.sh.
- Install Mechanism
- ok: There is no remote download/extract install; deploy.sh copies bundled assets into user workspace and system locations, creates a systemd user unit, and installs local hook files. No external archive or URL downloads are performed by the skill, minimizing supply-chain risk. The scripts prefer a local venv for Python/aiohttp and otherwise require the admin to have dependencies installed.
- Credentials
- note: The skill uses several environment variables (VOICE_HTTPS_PORT, VOICE_HOST / VOICE_BIND_HOST, VOICE_ALLOWED_ORIGIN, VOICE_LANG and optional VOICE_CERT/VOICE_KEY/VOICE_TRANSCRIBE_URL/VOICE_GATEWAY_WS). Reading the gateway token from ~/.openclaw/openclaw.json is required to validate requests when exposed beyond localhost; this is a sensitive local config read but proportionate to implementing authenticated /transcribe access. The SKILL.md and scripts validate most user inputs; a small set of env vars (VOICE_CERT, VOICE_KEY, VOICE_TRANSCRIBE_URL, VOICE_GATEWAY_WS) are not validated in code but are intended to be set by an admin, which is acceptable for this threat model.
- Persistence & Privilege
- note: The skill installs a persistent user systemd service (openclaw-voice-https.service) and a gateway startup hook (voice-input-inject) that re-injects the UI after updates. This is documented and limited to user-level privileges (no sudo/root). Persistence is expected for a service that must survive reboots and gateway updates, but it is a non-trivial system change; the included uninstall.sh attempts to fully revert these changes.
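The Credentials dimension above notes that VOICE_HTTPS_PORT and VOICE_HOST are validated before being interpolated into systemd and config files. The sketch below shows one way such validation could look. The helper names `validate_port` and `validate_host` are hypothetical, and accepting only IP literals for the host is a deliberately strict assumption; the skill's own rules in deploy.sh and https-server.py may be different.

```python
import ipaddress
import re


def validate_port(raw: str) -> int:
    """Accept only a plain decimal TCP port in the range 1-65535."""
    # ASCII digits only: rejects shell metacharacters, signs, and whitespace.
    if not re.fullmatch(r"[0-9]{1,5}", raw):
        raise ValueError(f"invalid port: {raw!r}")
    port = int(raw)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def validate_host(raw: str) -> str:
    """Accept only an IPv4/IPv6 literal (assumption: hostnames are rejected)."""
    # ipaddress.ip_address raises ValueError for anything that is not an
    # IP literal, which also rules out strings carrying injected commands.
    return str(ipaddress.ip_address(raw))
```

For example, `validate_host("127.0.0.1")` returns the address unchanged, while a value such as `"192.168.1.10; rm -rf /"` raises `ValueError` before it can reach a unit file.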
