WebChat Voice Proxy

Security checks across malware telemetry and agentic risk

Overview

The skill matches its stated local voice-input purpose and discloses its persistent local proxy, UI injection, and token-authenticated transcription behavior.

Prefer the newer split skills noted by the author. If you install this deprecated skill anyway, keep it bound to 127.0.0.1 unless you truly need LAN access. Remember that microphone audio is sent for transcription, avoid sharing URLs that contain gateway tokens, and run the uninstall script once you no longer want the service, hook, or UI changes.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Rogue Agent: Self-Modification, Session Persistence
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (6)

Tainted flow: 'req' from os.environ.get (line 93, credential/environment) → urllib.request.urlopen (network output)

Critical
Category
Data Flow
Content
loop = asyncio.get_event_loop()
data = await loop.run_in_executor(
    None,
    lambda: urllib.request.urlopen(req, timeout=120).read(),
)
return web.Response(
    body=data,
Confidence
93% confidence
Finding
lambda: urllib.request.urlopen(req, timeout=120).read(),
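
The flow flagged above carries an environment-derived value into an outbound request. One way to bound such a flow is to check the destination before any credential is attached. The following is a minimal sketch under stated assumptions, not the skill's actual code: `origin_of`, `build_upstream_request`, the `allowed_origin` parameter, and the bearer-token header are all hypothetical names chosen for illustration.

```python
import urllib.request
from urllib.parse import urlsplit

# Default ports used when a URL does not name one explicitly.
_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> tuple:
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or _DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def build_upstream_request(
    url: str, token: str, body: bytes, allowed_origin: str
) -> urllib.request.Request:
    """Build a POST request, attaching the token only for the allowed origin.

    Hypothetical helper: the real proxy may construct its request differently.
    """
    if origin_of(url) != origin_of(allowed_origin):
        raise ValueError("refusing to send credentials to an unexpected origin")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
```

With a check like this, the request object passed to `urllib.request.urlopen` can only carry the token to the one origin the operator configured.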

Missing User Warnings

Medium
Confidence
89% confidence
Finding
The code records microphone input and sends it to a transcription endpoint along with an Authorization bearer token, but the UI shown here does not disclose where audio is sent, when it leaves the browser, or that credentials are attached. This creates a privacy and trust risk: users may believe transcription is purely local while sensitive speech and auth material are transmitted to a service endpoint, including a localhost fallback that may proxy data to another process.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
In continuous mode, transcribed text is automatically sent after a 2-second timeout unless the user focuses or clicks the textarea. That behavior can cause unintended transmission of sensitive or inaccurate content, especially because stop conditions include silence/keyword triggers rather than an explicit final confirmation step.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The troubleshooting guidance instructs users to place a gateway token directly in a URL query parameter. Query-string tokens are prone to leakage through browser history, bookmarks, screenshots, reverse-proxy/access logs, Referer headers, and shared troubleshooting artifacts, which can expose an authentication secret beyond the intended user.
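
Because query-string tokens leak through the channels listed above, a URL can be scrubbed before it is logged or pasted into a bug report. The sketch below assumes the secret travels in a query parameter named `token`; that name and the `redact_url` helper are assumptions for illustration, since the report does not state the parameter the skill uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter name; adjust to whatever the gateway actually uses.
SENSITIVE_PARAMS = frozenset({"token"})


def redact_url(url: str, sensitive=SENSITIVE_PARAMS) -> str:
    """Replace the values of sensitive query parameters with a placeholder."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key.lower() in sensitive else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Redaction only protects copies made after the fact. The token still appears in browser history and server logs for the original request, so sending it in a header instead remains the stronger option.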

Session Persistence

Medium
Category
Rogue Agent
Content
- **Localhost by default**: The HTTPS proxy binds to `127.0.0.1` only. It is **not** reachable from other devices on your network unless you explicitly set `VOICE_HOST` to a LAN IP.
- **LAN exposure**: Setting `VOICE_HOST=<LAN-IP>` exposes the proxy (and by extension the gateway WebSocket and transcription endpoint) to all devices on that network. Only do this on trusted networks.
- **Persistence**: This skill installs a **user systemd service** (`openclaw-voice-https.service`) that starts automatically on boot, and a **gateway hook** that re-injects the UI script after updates. Use `uninstall.sh` to fully revert.
- **Self-signed TLS**: The auto-generated certificate is not trusted by browsers. You will see a certificate warning on first access.

Confidence
84% confidence
Finding
systemd service** (`openclaw-voice-https.service`) that starts automatically on boot, and a **gateway hook** that re-injects the UI script after updates. Use `uninstall
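
The excerpt above distinguishes a loopback bind from LAN exposure through `VOICE_HOST`. A small check can make that distinction explicit before the proxy starts. This is a sketch only: `classify_bind_host` and its return labels are hypothetical, and treating the literal name `localhost` as loopback is an assumption that depends on the machine's resolver.

```python
import ipaddress


def classify_bind_host(host: str) -> str:
    """Classify a bind address as 'loopback', 'all-interfaces', 'network', or 'unknown'."""
    if host == "localhost":
        # Assumption: localhost resolves to a loopback address on this machine.
        return "loopback"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (for example a hostname); cannot classify without DNS.
        return "unknown"
    if addr.is_loopback:
        return "loopback"
    if addr.is_unspecified:
        # 0.0.0.0 or :: listens on every interface.
        return "all-interfaces"
    return "network"
```

A startup script might call `classify_bind_host(os.environ.get("VOICE_HOST", "127.0.0.1"))` and print a warning for anything other than `loopback`.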

Session Persistence

Medium
Category
Rogue Agent
Content
| Control UI HTML | `<npm-global>/openclaw/dist/control-ui/index.html` | Adds `<script>` tag for voice-input.js |
| Control UI asset | `<npm-global>/openclaw/dist/control-ui/assets/voice-input.js` | Copies mic button JS |
| Gateway config | `~/.openclaw/openclaw.json` | Adds HTTPS origin to `gateway.controlUi.allowedOrigins` |
| Systemd service | `~/.config/systemd/user/openclaw-voice-https.service` | Creates + enables persistent HTTPS proxy |
| Gateway hook | `~/.openclaw/hooks/voice-input-inject/` | Installs startup hook that re-injects JS after updates |
| Workspace files | `~/.openclaw/workspace/voice-input/` | Copies voice-input.js, https-server.py |
| TLS certs | `~/.openclaw/workspace/voice-input/certs/` | Auto-generated self-signed cert on first run |
Confidence
90% confidence
Finding
Systemd service | `~/.config/systemd/user/openclaw-voice-https.service` | Create
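
Since the table above lists where the skill writes, the same list can be used to verify that an uninstall actually removed everything. The sketch below covers only the home-relative paths from the table; the `<npm-global>` locations are left out because their real prefix is not given. `remaining_artifacts` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path
from typing import List

# Home-relative paths taken from the table of changes above.
ARTIFACTS = (
    ".config/systemd/user/openclaw-voice-https.service",
    ".openclaw/hooks/voice-input-inject",
    ".openclaw/workspace/voice-input",
)


def remaining_artifacts(home: Path) -> List[Path]:
    """Return the listed install artifacts that still exist under `home`."""
    return [home / rel for rel in ARTIFACTS if (home / rel).exists()]
```

Calling `remaining_artifacts(Path.home())` after running `uninstall.sh` should return an empty list if the revert was complete for these paths. It does not inspect edits made inside `~/.openclaw/openclaw.json`.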

VirusTotal

65/65 vendors flagged this skill as clean.
