Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

WebChat Voice Proxy

v0.2.3

⚠️ DEPRECATED — This skill has been split into two separate skills for better modularity: **webchat-https-proxy** (HTTPS/WSS reverse proxy) and **webchat-voi...

Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The skill claims to provide an HTTPS/WSS proxy, inject a mic UI, and forward audio to a local transcription service — and the included files (https-server.py, voice-input.js, deploy/uninstall scripts, hook) implement exactly those behaviors. Required modifications (index.html injection, copying voice-input.js, systemd user service, editing ~/.openclaw/openclaw.json allowedOrigins) are consistent with enabling a proxied Control UI and are expected for this feature.
Instruction Scope
The runtime instructions and scripts intentionally modify gateway config (append allowedOrigins), write into the npm-installed Control UI, install a user systemd service, and create a gateway startup hook that re-injects the UI script. They also read the gateway auth token from ~/.openclaw/openclaw.json (https-server.py) and the token from Control UI localStorage (voice-input.js) to authenticate /transcribe requests. This is necessary for the feature but means the skill touches and reads local gateway configuration and Control UI files — review these changes before running deploy.sh.
Install Mechanism
There is no remote download/extract install; deploy.sh copies bundled assets into user workspace and system locations, creates a systemd user unit, and installs local hook files. No external archive or URL downloads are performed by the skill, minimizing supply-chain risk. The scripts prefer a local venv for Python/aiohttp and otherwise require the admin to have dependencies installed.
Credentials
The skill uses several environment variables (VOICE_HTTPS_PORT, VOICE_HOST / VOICE_BIND_HOST, VOICE_ALLOWED_ORIGIN, VOICE_LANG and optional VOICE_CERT/VOICE_KEY/VOICE_TRANSCRIBE_URL/VOICE_GATEWAY_WS). Reading the gateway token from ~/.openclaw/openclaw.json is required to validate requests when exposed beyond localhost; this is a sensitive local config read but proportionate to implementing authenticated /transcribe access. The SKILL.md and scripts validate most user inputs; a small set of env vars (VOICE_CERT, VOICE_KEY, VOICE_TRANSCRIBE_URL, VOICE_GATEWAY_WS) are not validated in code but are intended to be set by an admin — acceptable for this threat model.
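To illustrate the kind of input validation described above, here is a minimal sketch of strict checks for a port and an IPv4 host. The function names and the exact rules are assumptions made for illustration; the actual checks in deploy.sh and https-server.py may differ.

```javascript
// Hypothetical validators, not taken from the skill's source.
// Accepts only 1-5 ASCII digits in the TCP port range 1..65535.
function isValidPort(value) {
  if (!/^[0-9]{1,5}$/.test(String(value))) return false;
  const port = Number(value);
  return port >= 1 && port <= 65535;
}

// Accepts only dotted-quad IPv4 addresses; anything containing shell
// metacharacters, spaces, or extra segments is rejected outright.
function isValidIPv4Host(value) {
  const parts = String(value).split(".");
  if (parts.length !== 4) return false;
  return parts.every((p) => /^[0-9]{1,3}$/.test(p) && Number(p) <= 255);
}
```

Allow-listing the full string, rather than stripping "dangerous" characters, is what makes a value safe to interpolate into a systemd unit or a config file.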
Persistence & Privilege
The skill installs a persistent user systemd service (openclaw-voice-https.service) and a gateway startup hook (voice-input-inject) that re-injects the UI after updates. This is documented and limited to user-level privileges (no sudo/root). Persistence is expected for a service that must survive reboots and gateway updates, but it is a non-trivial system change — the included uninstall.sh attempts to fully revert these changes.
Scan Findings in Context
[3.2_transcribe_auth] expected: Audit noted medium risk originally because exposing /transcribe without auth would be dangerous; code and audit show /transcribe enforces Bearer token checking against the gateway token read from ~/.openclaw/openclaw.json when a gateway token exists. This auth is necessary and expected to protect the transcription endpoint when the proxy is exposed.
[4.5_tls_key_permissions] expected: The audit checked TLS key file permissions and reports that deploy/https-server.py explicitly sets the key file mode to 0600 after generation, which is appropriate for user-created certs.
[7.1_input_validation_port] expected: VOICE_HTTPS_PORT and VOICE_HOST are validated both in deploy.sh and in https-server.py; this defends against command/shell injection and is expected for a script that interpolates these values into systemd and config files.
[7.6_unvalidated_optional_envs] unexpected: VOICE_CERT, VOICE_KEY, VOICE_TRANSCRIBE_URL and VOICE_GATEWAY_WS are not validated by https-server.py. The audit considers this a low/open issue but acceptable because these are admin-set local values; if an attacker could set these env vars they would already have code execution. Admins should still review values when customizing.
[hooks_execfile_and_inject] expected: The gateway hook runs inject.sh via execFileSync and the inject script copies voice-input.js and patches index.html. This behavior is required to make the UI persistent across updates; the audit confirms execFileSync is used safely (no shell interpolation) and sed uses hardcoded strings.
[assets_browser_localtoken] expected: voice-input.js reads the gateway token from Control UI localStorage to attach Authorization headers to /transcribe — this is necessary to perform authenticated requests from the browser and is consistent with the proxy's auth design. It means the browser will send a locally-stored token; administrators should ensure localStorage usage is acceptable for their security posture.
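The Bearer check described in the findings above can be sketched as follows. This is an illustration of the logic only, written in JavaScript for consistency with voice-input.js; the real check lives in https-server.py and may differ. The function names are hypothetical, and skipping the check when no gateway token is configured is an assumption inferred from the phrase "when a gateway token exists".

```javascript
// Compare two strings without returning early on the first mismatch,
// so the comparison time does not reveal how many characters matched.
function constantTimeEquals(a, b) {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Hypothetical gate for /transcribe requests.
function isAuthorized(authorizationHeader, gatewayToken) {
  // Assumption: with no gateway token configured there is nothing to enforce.
  if (!gatewayToken) return true;
  const match = /^Bearer\s+(.+)$/.exec(authorizationHeader || "");
  if (!match) return false;
  return constantTimeEquals(match[1].trim(), gatewayToken);
}
```

On the browser side, the matching request would carry an `Authorization: Bearer <token>` header built from the token the Control UI already holds.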
Assessment
What to consider before installing:

  • This skill makes persistent, local changes: it copies a JS asset into your Control UI, injects a <script> tag into index.html, appends allowedOrigins to ~/.openclaw/openclaw.json, installs a user-level systemd service, and installs a gateway startup hook that re-injects the script after updates. Review these exact changes (deploy.sh, inject.sh, uninstall.sh) before running deploy.sh.
  • By default the proxy binds to 127.0.0.1 (local-only). Only set VOICE_HOST to a LAN IP if you intentionally want other devices on the network to reach the proxy; that will expose the gateway WebSocket and transcription endpoint to the LAN. If exposed, the proxy enforces a Bearer token matched to the gateway token (read from your openclaw.json), but you must ensure the LAN is trusted and consider rate-limiting/monitoring.
  • The skill reads your gateway auth token from ~/.openclaw/openclaw.json, and the client reads it from Control UI localStorage to authenticate /transcribe calls. This is necessary for proper authorization but means the skill accesses a local sensitive value; back up your config if you are concerned.
  • The proxy generates a self-signed cert by default (you'll see browser warnings). You can supply VOICE_CERT/VOICE_KEY to use your own certs.
  • The skill requires a local faster-whisper transcription service (http://127.0.0.1:18790/transcribe) to be running before deploying; the scripts validate aiohttp and prefer a venv.
  • The package contains an uninstall.sh which attempts to remove the service, hook, injected script, and certs; test uninstall in a safe environment if you need guarantees.
  • Recommended steps: inspect deploy.sh, inject.sh, https-server.py and voice-input.js yourself; run deploy.sh with defaults (localhost) first; do not set VOICE_HOST to a LAN IP unless you understand the exposure; keep backups of ~/.openclaw/openclaw.json and your Control UI index.html before installing.
hooks/handler.ts:14
Shell command execution detected (child_process).
Patterns worth reviewing
These patterns may indicate risky behavior. Check the VirusTotal and OpenClaw results above for context-aware analysis before installing.

Like a lobster shell, security has layers — review code before you run it.

Tags: faster-whisper, free, https, i18n, latest, local, microphone, no-api, openclaw, ptt, push-to-talk, secure, stt, voice, webchat, wss
877 downloads, 0 stars, 20 versions
Updated 19h ago
License: MIT-0

WebChat Voice Proxy

Set up a reboot-safe voice stack for OpenClaw WebChat (including the current polished mic/stop/hourglass UI states):

  • HTTPS Control UI on port 8443
  • /transcribe proxy to local faster-whisper service
  • WebSocket passthrough to gateway (ws://127.0.0.1:18789)
  • Voice button script injection into Control UI
  • Real-time VU meter: button shadow/scale reacts to voice level
  • Push-to-Talk: hold mic button to record, release to send (default mode)
  • Toggle mode: click to start, click to stop (switch via double-click on mic button)
  • Keyboard shortcuts: Ctrl+Space Push-to-Talk, Ctrl+Shift+M start/stop continuous recording
  • Localized UI: auto-detects browser language (English, German, Chinese built-in), customizable
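As an illustration of how a VU meter like the one listed above might work, the sketch below computes an RMS level from a block of audio samples and maps it to a button scale factor. The function names, the [-1, 1] sample range, and the 25% maximum growth are assumptions; the actual math in voice-input.js is not shown in this document.

```javascript
// Root-mean-square level of a block of samples in [-1, 1].
// Works with plain arrays and with Float32Array buffers.
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length);
}

// Map a level to a CSS scale factor between 1 and 1 + maxExtra.
// Levels outside [0, 1] are clamped so the button never shrinks or explodes.
function levelToScale(level, maxExtra = 0.25) {
  const clamped = Math.min(1, Math.max(0, level));
  return 1 + clamped * maxExtra;
}
```

In a browser, the samples would typically come from a Web Audio analyser on each animation frame, with the result applied as a `transform: scale(...)` style on the mic button.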

Prerequisites (required)

This skill requires a local faster-whisper HTTP service. Expected default:

  • URL: http://127.0.0.1:18790/transcribe
  • systemd user service: openclaw-transcribe.service

Verify before deployment:

systemctl --user is-active openclaw-transcribe.service
curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:18790/transcribe -X POST -H 'Content-Type: application/octet-stream' --data-binary 'x'

If this dependency is missing, set up faster-whisper first (model load + HTTP endpoint), then run this skill.

Related skills:

  • faster-whisper-local-service (backend prerequisite)
  • webchat-voice-full-stack (meta-installer that deploys both backend + proxy)

Workflow

  1. Ensure transcription service exists and is running (openclaw-transcribe.service).
  2. Deploy voice-input.js to Control UI assets and inject script tag into index.html.
  3. Configure gateway allowed origin for external HTTPS UI.
  4. Run HTTPS+WSS proxy as persistent user systemd service (openclaw-voice-https.service).
  5. Check for pairing, token, and origin errors and resolve them in order.

Security Notes

  • Localhost by default: The HTTPS proxy binds to 127.0.0.1 only. It is not reachable from other devices on your network unless you explicitly set VOICE_HOST to a LAN IP.
  • LAN exposure: Setting VOICE_HOST=<LAN-IP> exposes the proxy (and by extension the gateway WebSocket and transcription endpoint) to all devices on that network. Only do this on trusted networks.
  • Persistence: This skill installs a user systemd service (openclaw-voice-https.service) that starts automatically on boot, and a gateway hook that re-injects the UI script after updates. Use uninstall.sh to fully revert.
  • Self-signed TLS: The auto-generated certificate is not trusted by browsers. You will see a certificate warning on first access.

Deploy

Run (localhost only — default, most secure):

bash scripts/deploy.sh

Or expose on LAN (required to access from other devices):

VOICE_HOST=10.0.0.42 VOICE_HTTPS_PORT=8443 VOICE_LANG=de bash scripts/deploy.sh

When run interactively without VOICE_LANG, the script will ask you to choose a UI language (auto, en, de, zh). Set VOICE_LANG=auto to skip the prompt.

This script is idempotent.

Quick verify

Run:

bash scripts/status.sh

Expected:

  • both services active
  • injection present
  • https:200

Common fixes

  • 404 /chat?... → SPA fallback missing in HTTPS proxy.
  • origin not allowed → ensure deploy used correct VOICE_HOST and added matching HTTPS origin to gateway.controlUi.allowedOrigins.
  • token missing → open URL with ?token=... once.
  • pairing required → approve pending device via openclaw devices approve <requestId> --token <gateway-token>.
  • Mic breaks after reboot → cert paths must be persistent (not /tmp).
  • No transcription result → check local faster-whisper endpoint first.

See references/troubleshooting.md for exact commands.
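The first fix above refers to an SPA fallback: client-side routes such as /chat?... have no file on disk, so the proxy has to serve index.html for them. A minimal sketch of that routing decision, assuming a hypothetical `resolveStaticPath` helper and a set of known file paths (this is not the code in https-server.py):

```javascript
// Decide which static file to serve for a request URL.
// knownFiles is a Set of paths that exist under the Control UI directory.
function resolveStaticPath(requestUrl, knownFiles) {
  // Ignore the query string and fragment when matching files.
  const path = requestUrl.split("?")[0].split("#")[0];
  if (path === "/") return "/index.html";
  if (knownFiles.has(path)) return path;
  // A missing asset (something with a file extension) is a real 404.
  if (/\.[a-z0-9]+$/i.test(path)) return null;
  // Anything else is treated as a client-side route handled by the SPA.
  return "/index.html";
}
```

In this sketch, /transcribe and the WebSocket upgrade are assumed to be routed before static file handling, so they never reach the fallback.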

What this skill modifies

Before installing, be aware of all system changes deploy.sh makes:

| What | Path | Action |
|---|---|---|
| Control UI HTML | <npm-global>/openclaw/dist/control-ui/index.html | Adds <script> tag for voice-input.js |
| Control UI asset | <npm-global>/openclaw/dist/control-ui/assets/voice-input.js | Copies mic button JS |
| Gateway config | ~/.openclaw/openclaw.json | Adds HTTPS origin to gateway.controlUi.allowedOrigins |
| Systemd service | ~/.config/systemd/user/openclaw-voice-https.service | Creates + enables persistent HTTPS proxy |
| Gateway hook | ~/.openclaw/hooks/voice-input-inject/ | Installs startup hook that re-injects JS after updates |
| Workspace files | ~/.openclaw/workspace/voice-input/ | Copies voice-input.js, https-server.py |
| TLS certs | ~/.openclaw/workspace/voice-input/certs/ | Auto-generated self-signed cert on first run |

The injected JS (voice-input.js) runs inside the Control UI and interacts with the chat input. Review the source before deploying.

Mic Button Controls

| Action | Effect |
|---|---|
| Hold (PTT mode) | Record while held, transcribe on release |
| Click (Toggle mode) | Start recording / stop and transcribe |
| Double-click | Switch between PTT and Toggle mode |
| Right-click | Toggle beep sound on/off |
| Ctrl+Space (hold) | Push-to-Talk via keyboard (works even with text field focused) |
| Ctrl+Shift+M | Start/stop recording (transcribes on stop) |
| Ctrl+Shift+B | Start/stop live transcription [beta]: text appears in real time, auto-sends after 2s review, stops on 5s silence or the "Stop Hugo" keyword |

The current mode and available actions are shown in the button tooltip on hover.
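The interplay between the two modes can be pictured as a small state machine. The sketch below is one hedged way to model it; the event names (`press`, `release`, `click`, `dblclick`) and the `reduceMic` function are hypothetical, and stopping any recording on a mode switch is an assumption rather than documented behavior.

```javascript
// State: { mode: "ptt" | "toggle", recording: boolean }
function reduceMic(state, event) {
  const { mode, recording } = state;
  if (event === "dblclick") {
    // Switch mode; assume any in-progress recording is stopped.
    return { mode: mode === "ptt" ? "toggle" : "ptt", recording: false };
  }
  if (mode === "ptt") {
    if (event === "press") return { mode, recording: true };
    if (event === "release") return { mode, recording: false }; // transcribe here
  } else if (event === "click") {
    return { mode, recording: !recording }; // transcribe when it flips to false
  }
  return state; // events that do not apply in the current mode are ignored
}
```

Keeping the transitions in a pure function like this makes it straightforward to drive the same logic from both pointer events and the keyboard shortcuts.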

Language / i18n

The UI automatically detects the browser language and shows tooltips, toasts, and placeholder text in the matching language.

Built-in languages: English (en), German (de), Chinese (zh)

Override language

Set a language override in the browser console:

localStorage.setItem('oc-voice-lang', 'de');  // force German
localStorage.setItem('oc-voice-lang', 'zh');  // force Chinese
localStorage.removeItem('oc-voice-lang');      // back to auto-detect

Then reload the page.
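The override and auto-detect behavior described above could be resolved along these lines. This is a sketch under assumptions: the `resolveLanguage` name is hypothetical, and falling back to English for unsupported languages is assumed rather than stated by the skill.

```javascript
const SUPPORTED_LANGS = ["en", "de", "zh"]; // built-in languages listed above

// Pick the UI language: a valid override wins, then the browser language,
// then English as an assumed last resort.
function resolveLanguage(override, browserLanguage, supported = SUPPORTED_LANGS) {
  // Reduce tags like "de-DE" or "zh-CN" to their primary subtag.
  const normalize = (tag) => String(tag || "").toLowerCase().split("-")[0];
  const forced = normalize(override);
  if (supported.includes(forced)) return forced;
  const detected = normalize(browserLanguage);
  return supported.includes(detected) ? detected : "en";
}
```

In the browser, this would be called with `localStorage.getItem('oc-voice-lang')` and `navigator.language` as the two inputs.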

Add a custom language

Edit voice-input.js and add a new entry to the I18N object. Use assets/i18n.json as a template — it contains all translation keys with the built-in translations.

Example for adding French:

const I18N = {
  // ... existing entries ...
  fr: {
    tooltip_ptt: "Maintenir pour parler",
    tooltip_toggle: "Cliquer pour démarrer/arrêter",
    tooltip_next_toggle: "Mode clic",
    tooltip_next_ptt: "Push-to-Talk",
    tooltip_beep_off: "Désactiver le bip",
    tooltip_beep_on: "Activer le bip",
    tooltip_dblclick: "Double-clic",
    tooltip_rightclick: "Clic droit",
    toast_ptt: "Push-to-Talk",
    toast_toggle: "Mode clic",
    toast_beep_on: "Bip activé",
    toast_beep_off: "Bip désactivé",
    placeholder_suffix: " — Voix : (Ctrl+Espace Push-To-Talk, Ctrl+Shift+M enregistrement continu)"
  }
};

After editing, redeploy with bash scripts/deploy.sh to copy the updated JS to the Control UI.
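Before redeploying, it can help to confirm that a new language entry covers every key of an existing one. The helper below is a hypothetical check, not part of the skill; it assumes each translation is a flat object of non-empty strings, as in the French example above.

```javascript
// Return the keys present in the reference entry that are missing
// (or empty, or not strings) in the candidate entry.
function missingKeys(reference, candidate) {
  return Object.keys(reference).filter((key) => {
    const value = candidate[key];
    return typeof value !== "string" || value.length === 0;
  });
}
```

For example, calling `missingKeys(I18N.en, I18N.fr)` in the browser console should return an empty array once the French entry is complete.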

CORS Policy

The /transcribe proxy endpoint sends a configurable Access-Control-Allow-Origin header. Set the VOICE_ALLOWED_ORIGIN environment variable to restrict it to a specific origin. Default: https://<VOICE_HOST>:<VOICE_HTTPS_PORT>.
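The default origin is simply the scheme, host, and port joined together, and the same string must match what the gateway lists in gateway.controlUi.allowedOrigins. A sketch of that derivation, with hypothetical function names and with the host and port defaults taken from this document (127.0.0.1 and 8443):

```javascript
// Build the default allowed origin from the proxy's host and port.
function defaultAllowedOrigin(host = "127.0.0.1", port = 8443) {
  return `https://${host}:${port}`;
}

// Assumed behavior for illustration: echo the CORS header only when the
// request's Origin matches exactly. The real proxy may set it unconditionally.
function corsHeadersFor(requestOrigin, allowedOrigin) {
  if (requestOrigin !== allowedOrigin) return {};
  return { "Access-Control-Allow-Origin": allowedOrigin, Vary: "Origin" };
}
```

Origins are compared as exact strings, so a page loaded via a hostname will not match an origin configured with an IP address, even if both point at the same machine.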

Uninstall

To fully revert all changes:

bash scripts/uninstall.sh

This will:

  1. Stop and remove openclaw-voice-https.service
  2. Remove the gateway startup hook
  3. Remove voice-input.js from Control UI and undo the index.html injection
  4. Remove the HTTPS origin from gateway config
  5. Restart the gateway
  6. Remove TLS certificates
  7. Remove workspace runtime files (voice-input.js, https-server.py, i18n.json)

The faster-whisper backend is not touched by uninstall — remove it separately via faster-whisper-local-service if needed.
