Install
openclaw skills install videochat-withmeReal-time AI video chat that routes through your OpenClaw agent. Uses Groq Whisper (cloud STT), edge-tts (cloud TTS via Microsoft), and OpenClaw chatCompletions API for conversation. Your agent sees your camera, hears your voice, and responds with its own personality and memory. Requires: GROQ_API_KEY for speech recognition. Reads ~/.openclaw/openclaw.json for gateway port and auth token. Data flows: audio → Groq cloud (STT), TTS text → Microsoft cloud (edge-tts), camera frames (base64) + text → OpenClaw gateway → your configured LLM provider (may be cloud — frames leave the machine if using a cloud LLM). Installs a persistent launchd service (optional). Trigger phrases: "video chat", "voice call", "call me", "视频一下", "语音", "打电话给我", "我要和你视频", "videochat-withme".
openclaw skills install videochat-withmeReal-time video call with your OpenClaw agent — full personality, memory, and vision.
New users run once after installing the skill:
bash skills/videochat-withme/scripts/setup.sh
This handles everything: dependencies, Groq API key, SSL certs, launchd service.
mkdir -p ~/.openclaw/secrets
echo "your-key-here" > ~/.openclaw/secrets/groq_api_key.txt
Or set env var: export GROQ_API_KEY="your-key-here"Add to ~/.openclaw/openclaw.json:
{
"gateway": {
"http": {
"endpoints": {
"chatCompletions": { "enabled": true }
}
}
}
}
Then restart OpenClaw.
When the user requests a video/voice call:
Step 1: Check if service is running:
curl -sk https://localhost:8766/api/config 2>/dev/null || curl -s http://localhost:8766/api/config 2>/dev/null
Step 2: If no response, setup needed:
cat ~/.openclaw/secrets/groq_api_key.txt 2>/dev/null
echo "key" > ~/.openclaw/secrets/groq_api_key.txtbash skills/videochat-withme/scripts/setup.sh --auto --agent-name "YourName" --user-name "TheirName"
Step 3: Initiate the call based on context:
Determine how the user is connecting and pick the best method:
User is at the computer (message from webchat/desktop):
bash skills/videochat-withme/scripts/call.sh
This pops up a macOS incoming call notification → user clicks Accept → browser opens.
User is on mobile/remote (message from Telegram/phone): Pick the right URL automatically:
# Prefer Tailscale IP (works from any network)
TS_IP=$(tailscale ip -4 2>/dev/null)
# Fallback to local IP (same WiFi only)
LOCAL_IP=$(python3 -c "import socket; s=socket.socket(socket.AF_INET,socket.SOCK_DGRAM); s.connect(('8.8.8.8',80)); print(s.getsockname()[0]); s.close()" 2>/dev/null)
https://<tailscale-ip>:8766 (works everywhere)https://<local-ip>:8766 (same WiFi only)🎤 Voice → Groq Whisper (STT)
📷 Camera → base64 frame
↓
OpenClaw /v1/chat/completions → Your Agent
↓
edge-tts (TTS) → 🔊 Audio playback
Agent runs these automatically:
| Script | When |
|---|---|
setup.sh --auto | First use (service not running) |
call.sh | Every call request |
User can run manually if needed:
| Script | Purpose |
|---|---|
setup.sh | Interactive setup (without --auto) |
start.sh | Start service |
stop.sh | Stop service |
| Variable | Default | Description |
|---|---|---|
GROQ_API_KEY | (secrets file) | Groq API key for Whisper STT |
PORT | 8766 | Server port |
AGENT_NAME | AI Assistant | Display name for the agent |
USER_NAME | User | Display name for the user |
SSL_CERT | (auto-detect) | Path to SSL certificate |
SSL_KEY | (auto-detect) | Path to SSL private key |