Voice Bridge Light

v1.0.1

A lightweight local voice-bridge HTTP service with an OpenAI-compatible interface, providing STT (Whisper) and TTS (Edge TTS/Piper).

0 stars · 185 downloads · 2 current · 2 all-time

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for fangbb-coder/voice-bridge-light.

Prompt Preview: Install & Setup
Install the skill "Voice Bridge Light" (fangbb-coder/voice-bridge-light) from ClawHub.
Skill page: https://clawhub.ai/fangbb-coder/voice-bridge-light
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install voice-bridge-light

ClawHub CLI

Package manager switcher

npx clawhub@latest install voice-bridge-light
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description, SKILL.md, requirements.txt and api_server.py are coherent: the code implements Whisper-based STT and Edge/Piper TTS and the listed Python packages match those capabilities. There are no unrelated environment variables, binaries, or credentials requested that would be inconsistent with a local TTS/STT bridge.
Instruction Scope
Runtime instructions stay within the stated purpose (install Python deps, run the Flask server, call /audio/speech and /audio/transcriptions). Notable operational points the user should be aware of: the default host is 0.0.0.0 and port 18790 (exposes the HTTP API to the network unless firewall/binding changed), Edge TTS uses Microsoft online services (network access), and the systemd example runs as root in /root/.openclaw/... (poor practice but not hidden behavior). The code writes uploaded audio to a temporary file and deletes it after transcription (expected for STT).
Install Mechanism
There is no registry-level install script, but the bundle includes requirements.txt and skill.yaml with a pip install -r requirements.txt instruction. Dependencies are standard PyPI packages (no obscure download URLs or extracted archives). This is a normal, low-risk install mechanism for Python packages, though some packages will pull large model artifacts at runtime.
Credentials
No required secrets or credentials are declared. The environment variables documented (host, port, engine selection, model paths, voice selection, model size) are appropriate for configuring a local TTS/STT service. The skill does not request unrelated credentials or access to other services beyond optional Edge TTS network use.
Persistence & Privilege
The skill is not always-enabled and does not request elevated platform privileges in its manifest. The SKILL.md includes an optional systemd service example that runs the server as root and in /root/.openclaw/workspace — running as root is unnecessary and increases risk; this is an operational recommendation in the docs rather than hidden behavior in the code.
Assessment
This skill appears to do what it claims, but consider these precautions before installing:

  • Network exposure: by default the server listens on 0.0.0.0:18790. If you don't want the service reachable from other hosts, bind to localhost or restrict access with a firewall.
  • Edge TTS uses Microsoft online services (requires outbound network). If you need fully offline TTS, configure TTS_ENGINE=piper and provide local Piper models.
  • Model downloads and memory usage: faster-whisper and Piper may download large model files and use hundreds of MB of RAM. Ensure you have disk and RAM headroom.
  • Avoid running the service as root. If you use the provided systemd unit, change User to a non-privileged account and set WorkingDirectory appropriately.
  • Verify Python package sources (PyPI) and review package reputations if you require higher assurance.
  • If you need strong confidentiality, review access controls for the HTTP endpoint and consider adding authentication or only binding to localhost.


latest vk97d9gw5pn8y9scaw1z6tft6th8340sb
185 downloads · 0 stars · 2 versions
Updated 1mo ago · v1.0.1 · MIT-0

Voice Bridge Light

Lightweight local voice-bridging service providing an OpenAI-compatible STT/TTS HTTP API.

Features

  • TTS (Text-to-Speech): supports Edge TTS (online) and Piper (local)
  • STT (Speech Recognition): local recognition based on Whisper
  • OpenAI-compatible API: compatible with the OpenAI Audio API
  • Lightweight deployment: minimal dependencies, easy to install

Usage

Installation

pip install -r requirements.txt

Start Service

Default using Edge TTS:

python api_server.py

Using Piper (model required):

TTS_ENGINE=piper PIPER_MODEL=models/piper/zh_CN-huayan-medium.onnx python api_server.py

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Health check |
| /audio/speech | POST | TTS speech synthesis |
| /audio/transcriptions | POST | STT speech recognition |

Configuration Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| VOICE_BRIDGE_HOST | 0.0.0.0 | Listen address |
| VOICE_BRIDGE_PORT | 18790 | Listen port |
| TTS_ENGINE | edge | TTS engine: edge or piper |
| EDGE_VOICE | zh-CN-XiaoxiaoNeural | Edge TTS voice |
| PIPER_MODEL | models/piper/zh_CN-huayan-medium.onnx | Piper model path |
| STT_MODEL | base | Whisper model size |
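
The server presumably reads these variables with standard environment lookups; a minimal sketch (variable names come from the table above, the parsing logic in api_server.py is assumed and may differ):

```python
import os

# Defaults mirror the configuration table; api_server.py may parse them differently.
HOST = os.environ.get("VOICE_BRIDGE_HOST", "0.0.0.0")
PORT = int(os.environ.get("VOICE_BRIDGE_PORT", "18790"))
TTS_ENGINE = os.environ.get("TTS_ENGINE", "edge")  # "edge" or "piper"
EDGE_VOICE = os.environ.get("EDGE_VOICE", "zh-CN-XiaoxiaoNeural")
PIPER_MODEL = os.environ.get("PIPER_MODEL", "models/piper/zh_CN-huayan-medium.onnx")
STT_MODEL = os.environ.get("STT_MODEL", "base")  # Whisper model size

print(HOST, PORT, TTS_ENGINE)
```

To keep the API off the network, export VOICE_BRIDGE_HOST=127.0.0.1 before starting the server.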

TTS Request Example

curl -X POST http://localhost:18790/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello, world!",
    "voice": "zh-CN-XiaoxiaoNeural",
    "response_format": "mp3"
  }' \
  --output speech.mp3
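
The same request can be made from Python with only the standard library. This is a sketch following the field names in the curl example above, not the project's own client code; build_speech_body and synthesize are illustrative names:

```python
import json
import urllib.request

def build_speech_body(text, voice="zh-CN-XiaoxiaoNeural", response_format="mp3"):
    """JSON body for POST /audio/speech, matching the curl example."""
    return json.dumps({
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }).encode("utf-8")

def synthesize(text, url="http://localhost:18790/audio/speech", out_path="speech.mp3"):
    """POST the text and write the returned audio bytes to out_path."""
    req = urllib.request.Request(
        url,
        data=build_speech_body(text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())

# Example (requires a running server):
# synthesize("Hello, world!")
```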

STT Request Example

curl -X POST http://localhost:18790/audio/transcriptions \
  -F "file=@speech.mp3"

Note: do not set a Content-Type header by hand here; curl's -F option sets multipart/form-data with the correct boundary automatically, and overriding it breaks the upload.
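
An equivalent upload from Python, assuming the third-party requests package is available (a sketch; with files=, requests generates the multipart boundary itself, so no Content-Type header is set by hand):

```python
import mimetypes

def audio_content_type(filename):
    """Best-effort MIME type for the uploaded audio file."""
    return mimetypes.guess_type(filename)[0] or "application/octet-stream"

def transcribe(path, url="http://localhost:18790/audio/transcriptions"):
    """POST an audio file as multipart/form-data and return the JSON result."""
    import requests  # third-party; pip install requests
    with open(path, "rb") as f:
        files = {"file": (path, f, audio_content_type(path))}
        resp = requests.post(url, files=files)
    resp.raise_for_status()
    return resp.json()

# Example (requires a running server):
# print(transcribe("speech.mp3"))
```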

OpenClaw Integration

Configure in openclaw.json:

{
  "tts": {
    "enabled": true,
    "provider": "local-piper",
    "baseUrl": "http://127.0.0.1:18790",
    "apiKey": "local",
    "voice": "zh-CN-XiaoxiaoNeural"
  }
}

Dependencies

  • Python 3.8+
  • edge-tts (Edge TTS)
  • faster-whisper (Whisper STT)
  • soundfile (audio processing)
  • Flask + Flask-CORS (web service)

Service Management

systemd Service (Recommended)

[Unit]
Description=Voice Bridge Light - STT/TTS HTTP API
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/root/.openclaw/workspace/skills/voice-bridge-light
ExecStart=/usr/bin/python3 api_server.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Enable and start:

systemctl daemon-reload
systemctl enable voice-bridge-light.service
systemctl start voice-bridge-light.service
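
The security assessment earlier on this page recommends against running the server as root. A hardened variant of the unit above might look like this; the user name and working directory are placeholders to adapt to your system:

```ini
[Unit]
Description=Voice Bridge Light - STT/TTS HTTP API
After=network.target

[Service]
Type=simple
# Run as an unprivileged account instead of root (placeholder name).
User=voicebridge
WorkingDirectory=/opt/voice-bridge-light
# Bind to localhost unless the API must be reachable from other hosts.
Environment=VOICE_BRIDGE_HOST=127.0.0.1
ExecStart=/usr/bin/python3 api_server.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```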

Performance

  • TTS latency: < 1s (Edge TTS requires network access)
  • STT latency: depends on audio length; runs in roughly real time on CPU
  • Memory usage: ~300-500 MB (mostly the Whisper model)

Notes

  • Edge TTS requires internet access to Microsoft services
  • Piper requires downloading model files on first use
  • The Whisper model loads slowly on first run; a warm-up request is recommended
  • For production, managing the service with systemd is recommended

License

MIT
