Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Voice Agent Pro

v1.0.7

Gives any OpenClaw agent a complete voice layer via ElevenLabs. Clones the principal's voice from audio samples, converts any text to MP3 audio (VSL, podcast...

0· 192·0 current·0 all-time
byWesley Armando@georges91560

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for georges91560/voice-agent-pro-v1.

Previewing Install & Setup.
Prompt PreviewInstall & Setup
Install the skill "Voice Agent Pro" (georges91560/voice-agent-pro-v1) from ClawHub.
Skill page: https://clawhub.ai/georges91560/voice-agent-pro-v1
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install voice-agent-pro-v1

ClawHub CLI

Package manager switcher

npx clawhub@latest install voice-agent-pro-v1
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Suspicious
high confidence
!
Purpose & Capability
The skill claims to be a voice/TTS/calling layer (ElevenLabs + Twilio + Telegram), which explains needing those credentials — but the registry metadata (top-level requirements) lists no required env vars or config paths while SKILL.md metadata and README explicitly require multiple secrets and many read/write paths. This mismatch (registry says 'none' but SKILL.md requires ELEVENLABS_*, TWILIO_*, TELEGRAM_*, and specific host/workspace paths) is an incoherence that should be resolved.
!
Instruction Scope
SKILL.md instructs the agent to install packages inside the OpenClaw container (pip install elevenlabs, twilio; apt-get ffmpeg), to read/write /workspace and host .env files, and offers an automated 'virtual-desktop' path that has the agent use an active Google browser session to create an API key. The instructions therefore go beyond simple TTS: they instruct host-level modifications and recommend automated browser actions that could access third‑party sessions — this expands the attack surface and scope of what the agent may touch.
Install Mechanism
There is no formal install spec in the registry (instruction-only) but SKILL.md expects the agent to run pip/apt commands inside the container. Installing third-party Python packages (elevenlabs, twilio) and apt packages via instructions is common but carries supply-chain risk; no external arbitrary download URLs are used. The use of pip with --break-system-packages is noted (less safe on some systems).
!
Credentials
The requested environment access is extensive but mostly aligned with the declared functionality (ElevenLabs API key/voice id for cloning/TTS, Twilio creds for calls, Telegram token/chat for notifications). However SKILL.md and README also instruct writing credentials into host-level .env (/docker/openclaw-yyvg/.env) and to read many workspace files; the registry incorrectly declared no env requirements. Writing/updating a host .env for the agent is a privilege that affects other services and is disproportionate unless the user intends that change.
!
Persistence & Privilege
The skill is not always:true, but instructions explicitly tell the agent (or operator) to write credentials into host docker .env and to restart the OpenClaw container, which modifies system-wide agent configuration. The SKILL.md also suggests using an automated 'virtual-desktop' to create keys from an active Google session — combining autonomous agent actions with access to browser sessions and host env files increases risk. The skill does write audit/error logs to workspace paths, further persisting state.
What to consider before installing
This skill can do what it says (clone voices, generate TTS, place calls) but there are red flags you should address before installing: - Resolve the metadata mismatch: the registry lists NO required env vars but SKILL.md requests ELEVENLABS_*, TWILIO_*, and TELEGRAM_* — assume those credentials are needed and treat them as sensitive. - Do NOT let the skill or agent write credentials into your host-wide .env or other global config unless you understand the impact. Prefer creating dedicated, limited-scope API keys and store them in a separate, isolated config for this skill. - Avoid enabling the 'virtual-desktop' automated path unless you trust the virtual-desktop skill and accept that the agent may access an active browser session (this can expose Google sessions and other credentials). - Test in an isolated environment (throwaway container/VPS) first. Use temporary API keys, restrict Twilio numbers and callback URLs, and monitor network traffic. - Review the included voice_generator.py and any Twilio/Telegram integration code (the repository contains code) to confirm it only calls expected endpoints and that logs/audits do not leak sensitive inputs. - If you plan to make calls, check legal/compliance implications for cloned voices and automated calling in your jurisdiction; ensure consent and opt-out handling. If you cannot validate the above, treat this skill as high-risk and consider alternative, officially supported integrations or keep it disabled until you can audit and scope its permissions.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎙️ Clawdis
latestvk9789ttr1rvc5z23n4gqp66vv58331qf
192downloads
0stars
7versions
Updated 1h ago
v1.0.7
MIT-0

Voice Agent Pro V3 — Autonomous Voice Layer

"The most trusted voice in any room is the one that sounds like you."

This skill gives the principal a voice — their own voice — deployed at scale.

LAYER 1 — VOICE SETUP
  Clones the principal's voice from MP3 samples via ElevenLabs API
  Requires ELEVENLABS_API_KEY and ELEVENLABS_VOICE_ID in .env
  Full setup guide with all commands: references/setup_guide.md

LAYER 2 — TEXT TO SPEECH
  Converts any text to MP3 using the principal's cloned voice
  VSL scripts, podcast intros, video narrations, email audio versions

LAYER 3 — CONVERSATIONAL AGENT (with Twilio)
  Outbound calls to leads — automated follow-up
  Inbound calls — answers 24/7, qualifies, reports

SETUP — Required Before First Use

Step 1 — Install dependencies

The agent runs these commands inside the OpenClaw container. If you prefer to run them manually from your VPS host, use: docker exec openclaw-yyvg-openclaw-1 pip install elevenlabs --break-system-packages

# Inside the container (agent runs this directly)
pip install elevenlabs --break-system-packages
pip install twilio --break-system-packages
apt-get update && apt-get install -y ffmpeg

# Verify
ffmpeg -version | head -1
python3 -c "from elevenlabs.client import ElevenLabs; print('✅ SDK ready')"

Note: --break-system-packages is required on Ubuntu 24.04 / Debian 12+ containers. If you get an "externally-managed-environment" error, this flag resolves it. On older systems, plain pip install elevenlabs works.

Step 2 — Access ElevenLabs

OPTION A — Via virtual-desktop (if installed)
  If the virtual-desktop skill is installed and a Google session
  is active in the browser, the agent can navigate elevenlabs.io
  and create the API key automatically:
  → Go to: https://elevenlabs.io/app/sign-in
  → Click "Continue with Google" (uses active session)
  → Navigate: Developers → API Keys → Create API Key
  → Copy the key and run apply-config (Step 6)

OPTION B — Manual (recommended for first setup)
  → Go to: https://elevenlabs.io/app/settings/api-keys
  → Click "Create API Key" → name it → copy it
  → Add to your agent .env file: ELEVENLABS_API_KEY=sk_...
# Verify it works
curl -s https://api.elevenlabs.io/v1/user \
  -H "xi-api-key: $ELEVENLABS_API_KEY" | python3 -m json.tool
# Expected: JSON with subscription info — if 401, key is wrong

Step 3 — Clone your voice

Provide 3 MP3 files of your voice (30-60 seconds each, clear audio) in /workspace/voice/samples/ before running this step.

Via Python SDK

from elevenlabs.client import ElevenLabs
import json, os

client = ElevenLabs(api_key=os.environ["ELEVENLABS_API_KEY"])
voice = client.voices.ivc.create(
    name="[AGENT_VOICE_NAME]",
    description="[PRINCIPAL_NAME] cloned voice",
    files=[
        "/workspace/voice/samples/sample_01.mp3",
        "/workspace/voice/samples/sample_02.mp3",
        "/workspace/voice/samples/sample_03.mp3",
    ],
)
print(f"Voice ID: {voice.voice_id}")

# Save to config.json
with open("/workspace/voice/config.json") as f:
    config = json.load(f)
config["ELEVENLABS_VOICE_ID"] = voice.voice_id
with open("/workspace/voice/config.json", "w") as f:
    json.dump(config, f, indent=2)
print("✅ Voice ID saved to config.json")

Via curl

VOICE_ID=$(curl -s -X POST https://api.elevenlabs.io/v1/voices/add \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -F "name=[AGENT_VOICE_NAME]" \
  -F "description=[PRINCIPAL_NAME] cloned voice" \
  -F "files=@/workspace/voice/samples/sample_01.mp3" \
  -F "files=@/workspace/voice/samples/sample_02.mp3" \
  -F "files=@/workspace/voice/samples/sample_03.mp3" \
  | python3 -c "import sys,json; print(json.load(sys.stdin)['voice_id'])")

echo "Voice ID: $VOICE_ID"
# Add to your agent .env: ELEVENLABS_VOICE_ID=$VOICE_ID

Step 4 — List voices (verify clone)

curl -s https://api.elevenlabs.io/v1/voices \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  | python3 -c "
import sys, json
for v in json.load(sys.stdin)['voices']:
    print(f"{v['voice_id']} | {v['name']} | {v['category']}")
"

Step 5 — Test the clone

python3 -c "
from elevenlabs.client import ElevenLabs
import os, json

with open('/workspace/voice/config.json') as f:
    cfg = json.load(f)

client = ElevenLabs(api_key=cfg['ELEVENLABS_API_KEY'])
audio = client.text_to_speech.convert(
    text='Voice clone test successful.',
    voice_id=cfg['ELEVENLABS_VOICE_ID'],
    model_id='eleven_multilingual_v2',
    output_format='mp3_44100_128',
)
import os; os.makedirs('/workspace/voice/output', exist_ok=True)
with open('/workspace/voice/output/test_clone.mp3', 'wb') as f:
    for chunk in audio:
        f.write(chunk)
print('✅ Test audio: /workspace/voice/output/test_clone.mp3')
"

Step 6 — Apply config and verify

# Apply credentials to config.json (no container restart needed)
python3 /workspace/voice/scripts/voice_generator.py apply-config \
  --api-key "$ELEVENLABS_API_KEY" \
  --voice-id "$ELEVENLABS_VOICE_ID"

# Verify everything is ready
python3 /workspace/voice/scripts/voice_generator.py status
# Expected:
#   API Key:    ✅ configured
#   Voice ID:   ✅ abc123...

The skill reads credentials from config.json at runtime — no container restart needed after updating credentials.

For browser dashboard navigation guide (step-by-step with screenshots): references/setup_guide.md


PHASE 1 — TEXT TO SPEECH

Converts any text to audio using the principal's cloned voice.

Use Cases

VSL (Video Sales Letter)
  Input:  /workspace/voice/scripts/vsl_[offer].md
  Output: /workspace/voice/output/vsl_[offer].mp3
  Use:    record your VSL once — never again

PODCAST INTRO / OUTRO
  Input:  /workspace/voice/scripts/podcast_[episode].md
  Output: /workspace/voice/output/podcast_[episode].mp3

VIDEO NARRATION
  Input:  text from content-creator-pro queue
  Output: MP3 ready for CapCut / video editor

EMAIL AUDIO VERSION
  Input:  email text from acquisition-master sequences
  Output: MP3 attached or linked in email

SOCIAL AUDIO CLIPS
  Input:  hook text from content-creator-pro
  Output: 15-30 second MP3 for Instagram, Twitter Spaces

TTS Models

eleven_flash_v2_5        → 75ms latency — use for real-time / calls
eleven_multilingual_v2   → best quality — use for VSL / podcasts
eleven_v3                → most expressive — use for storytelling content

TTS Process

1. Read script from /workspace/voice/scripts/[name].md
2. Split into chunks of max 900 characters (sentence boundaries)
3. Call ElevenLabs TTS API for each chunk
4. Concatenate chunks with ffmpeg → single MP3
5. Save to /workspace/voice/output/[name].mp3
6. Log to AUDIT.md: "TTS generated: [name].mp3 — [duration]s"
7. Notify principal via Telegram with file path

CLI Usage

# Generate from text
python3 /workspace/voice/scripts/voice_generator.py tts \
  --text "Hello, this is [PRINCIPAL_NAME]." \
  --output /workspace/voice/output/hello.mp3

# Generate from script file
python3 /workspace/voice/scripts/voice_generator.py tts \
  --script /workspace/voice/scripts/vsl_offer.md \
  --model eleven_multilingual_v2

# Check status
python3 /workspace/voice/scripts/voice_generator.py status

PHASE 2 — CONVERSATIONAL CALLS (requires Twilio)

Setup Twilio

# Add to your agent .env file:
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_PHONE_NUMBER=+1234567890

# Twilio account: console.twilio.com
# Phone number: ~$1/month

Connect Twilio to ElevenLabs Agent

1. Go to: https://elevenlabs.io/app/conversational-ai
2. Click "Create Agent"
3. Name: "[PRINCIPAL_NAME] Sales Agent"
4. Voice: select "[AGENT_VOICE_NAME]"
5. Agent instructions: paste content from templates/agent_prompt.md
6. Save → copy Agent ID → save to config.json
7. Tab "Phone Numbers" → "Add Phone Number"
8. Enter TWILIO_ACCOUNT_SID + TWILIO_AUTH_TOKEN
9. Select TWILIO_PHONE_NUMBER
10. ElevenLabs configures Twilio automatically

Outbound Calls — Lead Follow-up

Triggered by:
  → acquisition-master: lead didn't open email after 3 days
  → funnel-builder: lead clicked pricing page but didn't buy
  → Manual: principal requests call to specific number

Call flow:
  1. Agent reads lead data from /workspace/voice/calls/pending/
  2. Personalizes call script with lead name + context
  3. Initiates outbound call via ElevenLabs + Twilio
  4. Conversation in real-time (principal's cloned voice)
  5. Transcript saved to /workspace/voice/calls/history/
  6. Outcome: interested / not_interested / callback / voicemail
  7. Telegram notification with transcript summary

Call schedule: Mon-Fri 9h-19h only
Max voicemails: 1 per lead per week

Inbound Calls — 24/7 Qualification

When someone calls the Twilio number:
  → Agent answers in principal's cloned voice
  → 3 qualification questions:
    Q1: "What's your current situation with [niche problem]?"
    Q2: "Have you tried to solve this before?"
    Q3: "What would your ideal outcome look like?"
  → Score 8-10: hot lead → books call via Calendly
  → Score 5-7:  warm lead → sends free resource by SMS
  → Score 1-4:  cold lead → polite close
  → Transcript + score → /workspace/voice/calls/history/
  → Telegram: "📞 Inbound call — [score]/10 — [summary]"

Qualification Scoring

Score 8-10 → hot lead → immediate Telegram alert to principal
Score 5-7  → warm lead → add to nurture sequence
Score 1-4  → cold lead → send free resource, no follow-up 30 days

Call CLI

# List pending calls
python3 /workspace/voice/scripts/voice_generator.py calls --action list

# Weekly call summary
python3 /workspace/voice/scripts/voice_generator.py calls --action summary

Troubleshooting Voice Clone

PROBLEM: "Insufficient credits" / 403 error
  Solution: Check plan at elevenlabs.io/app/subscription
  IVC requires Starter ($5/month) minimum
  Action: upgrade at https://elevenlabs.io/pricing

PROBLEM: Clone sounds robotic or wrong
  Causes:
  → Audio files too short (< 30 seconds each)
  → Background noise in samples
  → Multiple speakers in same file
  Solution:
  → Delete the bad clone:
    curl -X DELETE https://api.elevenlabs.io/v1/voices/$VOICE_ID \
      -H "xi-api-key: $ELEVENLABS_API_KEY"
  → Record better samples (quieter, longer, more natural)
  → Re-run cloning

PROBLEM: Voice ID not found when calling TTS
  Solution:
    curl -s https://api.elevenlabs.io/v1/voices \
      -H "xi-api-key: $ELEVENLABS_API_KEY" \
      | python3 -m json.tool | grep -A2 "[AGENT_VOICE_NAME]"

PROBLEM: "Invalid API key" (401 on all requests)
  Solution: Regenerate at https://elevenlabs.io/app/settings/api-keys
  Update .env and restart agent container

PROBLEM: ffmpeg not found
  Solution: apt-get update && apt-get install -y ffmpeg
  Verify: ffmpeg -version

Quick Reference — API Endpoints

Base URL : https://api.elevenlabs.io
Auth     : xi-api-key: YOUR_KEY

GET    /v1/user                 → account + subscription info
GET    /v1/voices               → list all voices
POST   /v1/voices/add           → create IVC clone (multipart form)
DELETE /v1/voices/{id}          → delete a voice
POST   /v1/text-to-speech/{id}  → generate audio (JSON body)
GET    /v1/models               → list available models

Installation & Setup

What You Need to Provide

MINIMUM (TTS only):
  ELEVENLABS_API_KEY    → elevenlabs.io → Developers → API Keys
  ELEVENLABS_VOICE_ID   → created after voice cloning (see Step 3)
  3 MP3 voice samples   → /workspace/voice/samples/ (30-60s each)

FOR CALLS (add):
  TWILIO_ACCOUNT_SID    → console.twilio.com
  TWILIO_AUTH_TOKEN     → console.twilio.com
  TWILIO_PHONE_NUMBER   → buy a number on Twilio (~$1/month)

AGENT ALREADY HAS:
  TELEGRAM_BOT_TOKEN    → already in your agent .env
  TELEGRAM_CHAT_ID      → already in your agent .env

Voice Sample Requirements

Minimum : 1 file × 30 seconds
Recommended : 3 files × 1-2 minutes each
Optimal (Professional Clone) : 30+ minutes total

Quality:
  → Clear voice, no background noise
  → Natural speech rhythm
  → Consistent microphone distance
  → Format: MP3, WAV, M4A, FLAC all accepted
  → No multiple speakers in same file

ElevenLabs plan required:
  IVC (Instant Voice Clone) : Starter plan ($5/month)
  PVC (Professional, hyper-realistic) : Creator plan ($22/month)

Setup Checklist

[ ] pip install elevenlabs --break-system-packages
[ ] pip install twilio --break-system-packages (if using calls)
[ ] apt-get update && apt-get install -y ffmpeg
[ ] 3 MP3 samples uploaded to /workspace/voice/samples/
[ ] ELEVENLABS_API_KEY added to agent .env
[ ] ELEVENLABS_VOICE_ID added to agent .env (after cloning)
[ ] python3 voice_generator.py apply-config --api-key ... --voice-id ...
[ ] voice_generator.py status shows ✅ API Key and Voice ID
[ ] Test TTS successful (output MP3 plays correctly)
[ ] (Optional) Twilio credentials added for calls

Cron Schedule

# Outbound follow-up calls — Mon-Fri 10h
0 10 * * 1-5   voice-agent-pro-v3 → process /workspace/voice/calls/pending/

# Weekly VSL refresh — Sunday 11h
0 11 * * 0     voice-agent-pro-v3 → regenerate VSLs if scripts updated

# Call transcript review — Monday 9h
0 9 * * 1      python3 /workspace/voice/scripts/voice_generator.py calls --action summary

Files Written By This Skill

FileFrequencyContent
/workspace/voice/config.jsonOnce (setup)API keys, Voice ID, Agent ID
/workspace/voice/output/*.mp3Per generationGenerated audio files
/workspace/voice/calls/pending/*.jsonPer leadCalls to make
/workspace/voice/calls/history/*.jsonPer callTranscript + outcome + score
/workspace/.learnings/LEARNINGS.mdWeeklyCall patterns, best scripts
/workspace/.learnings/ERRORS.mdOn errorAPI errors, setup failures
/workspace/AUDIT.mdOn eventTTS generated, calls made, alerts

Workspace Structure

/workspace/voice/
├── config.json           ← API keys + Voice ID (from templates/config.json)
├── samples/              ← MP3 voice samples (you provide, 30-60s each)
│   ├── sample_01.mp3
│   ├── sample_02.mp3
│   └── sample_03.mp3
├── scripts/              ← Text scripts to convert to audio
│   └── [name].md
├── output/               ← Generated MP3 files
│   └── [name].mp3
├── calls/
│   ├── pending/          ← Calls to make (written by acquisition-master)
│   │   └── [lead_id].json
│   └── history/          ← Completed call transcripts
│       └── [date]-[lead_id].json
└── references/
    └── setup_guide.md    ← Full setup commands + browser dashboard guide

Constraints

❌ Never store voice samples outside /workspace/voice/samples/
❌ Never use the cloned voice to impersonate anyone other than the principal
❌ Never make calls outside working hours (9h-19h Mon-Fri)
❌ Never leave more than 1 voicemail per lead per week
❌ Never fabricate call transcripts or outcomes
✅ Always log every call with transcript to calls/history/
✅ Always notify principal when a hot lead (score 8+) calls
✅ Always respect lead's request to not be contacted again
✅ If ELEVENLABS_API_KEY missing → log to ERRORS.md, notify principal
✅ If voice samples missing → pause cloning, notify principal
✅ Consent: only clone and use the voice of the principal with explicit consent

Error Handling

ERROR: ELEVENLABS_API_KEY invalid or missing
  Action: Verify at https://elevenlabs.io/app/settings/api-keys
  Log: ERRORS.md → "API key invalid [date] — check agent .env"

ERROR: Voice samples missing
  Action: Do NOT attempt voice cloning
  Notify: via Telegram → "Upload 3 MP3 samples to /workspace/voice/samples/"
  Log: AUDIT.md → "Voice setup paused — samples missing"

ERROR: API rate limit hit (429)
  Action: Wait 60 seconds, retry once
  If still failing: queue job for next hour
  Log: ERRORS.md → "Rate limit hit — job queued [date]"

ERROR: Twilio call fails
  Action: Log failure, mark lead as call_failed in pending/
  Retry: next day same time slot
  Log: ERRORS.md → "Call failed: [lead_id] — [error] [date]"

ERROR: ffmpeg not found
  Action: apt-get update && apt-get install -y ffmpeg
  Log: ERRORS.md → "ffmpeg missing — install attempted [date]"

ERROR: ELEVENLABS_VOICE_ID not set
  Action: Run voice cloning (Step 3 in Setup section above)
  Log: AUDIT.md → "Voice ID missing — cloning required"

Comments

Loading comments...