ElevenLabs Speech-to-Text

Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).

MIT-0 · Free to use, modify, and redistribute. No attribution required.

by @clawdbotborges

Security Scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

ℹ

Purpose & Capability

The skill's name, description, required env var (ELEVENLABS_API_KEY), and the included script all match the stated purpose of calling ElevenLabs Speech-to-Text (Scribe v2). Minor mismatch: SKILL metadata and registry requirements list only curl as a required binary, but the script also invokes jq for JSON parsing (and README lists jq as 'optional'). Because the script uses jq in normal code paths, jq is effectively required.

✓

Instruction Scope

Runtime instructions and the script stay within scope: they read a local audio file, require ELEVENLABS_API_KEY, and POST the file to the ElevenLabs API endpoint (https://api.elevenlabs.io/v1/speech-to-text). The script does not attempt to read unrelated files, other environment variables, or contact unexpected external endpoints.
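The request described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual script: the function name `transcribe` is hypothetical, and the `xi-api-key` header, the form-field names, and the `scribe_v2` model identifier are assumptions that should be checked against ElevenLabs' API reference.

```shell
# Sketch only: header, field names, and model id are assumptions to verify
# against the ElevenLabs docs; transcribe.sh itself may be written differently.

# Upload one audio file and print the raw API response.
# Usage: transcribe FILE [LANG_CODE] [DIARIZE(true|false)]
transcribe() {
  local file lang diarize
  file="$1"
  lang="${2:-}"
  diarize="${3:-false}"

  if [ ! -r "$file" ]; then
    echo "cannot read audio file: $file" >&2
    return 1
  fi
  if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
    echo "ELEVENLABS_API_KEY is not set" >&2
    return 1
  fi

  # Build the curl argument list in the positional parameters.
  set -- --silent --show-error --fail \
    --header "xi-api-key: ${ELEVENLABS_API_KEY}" \
    --form "model_id=scribe_v2" \
    --form "file=@${file}" \
    --form "diarize=${diarize}"
  if [ -n "$lang" ]; then
    set -- "$@" --form "language_code=${lang}"
  fi

  curl "$@" "https://api.elevenlabs.io/v1/speech-to-text"
}
```

Calling `transcribe meeting.mp3 en true` would then send a single multipart POST to the endpoint named above and nothing else.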

✓

Install Mechanism

No install spec is provided (instruction-only with a helper script). Nothing is downloaded or written by an installer; risk from install mechanism is low.

✓

Credentials

Only ELEVENLABS_API_KEY is requested and declared as the primary credential, which is appropriate for a speech-to-text integration. No unrelated secrets or config paths are requested.

✓

Persistence & Privilege

The skill does not request permanent presence (always:false) or modify other skills or system-wide settings. Default autonomous invocation is allowed (platform default) but not combined with other concerning permissions.

Assessment

This skill appears to do exactly what it claims: it sends a local audio file to ElevenLabs' Speech-to-Text API using the ELEVENLABS_API_KEY. Before installing or running it, consider the following:

(1) Provide an API key with appropriate permissions, and be aware that audio will be uploaded to ElevenLabs (privacy/billing implications).
(2) The script uses jq for JSON handling even though only curl is declared as required. Install jq or update the script so it doesn't fail (the script uses set -euo, so a missing jq will cause it to exit).
(3) Review the script yourself if you have sensitive audio; running it in an isolated environment is prudent.
(4) If you want stricter control, verify the exact API endpoint and header usage against ElevenLabs' official docs and limit key scope/rotation as needed.
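One way to address the undeclared jq dependency is a preflight check that fails with a clear message before any upload happens. The helper below is a hypothetical sketch (the name `require_bins` is not part of the skill) and uses only `command -v`, which is available in POSIX shells.

```shell
# Report any required commands missing from PATH.
# Returns 0 when all are present, 1 otherwise.
require_bins() {
  local bin missing
  missing=""
  for bin in "$@"; do
    if ! command -v "$bin" >/dev/null 2>&1; then
      missing="${missing} ${bin}"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing required commands:${missing}" >&2
    return 1
  fi
}

# Typical use at the top of a script:
#   require_bins curl jq || exit 1
```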

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0

latest: vk97fagkq8w6zgndpq2dj8zf4c57zyx6z

License

MIT-0

Free to use, modify, and redistribute. No attribution required.

Terms: https://spdx.org/licenses/MIT-0.html

Runtime requirements

🎙️ Clawdis

Bins: curl

Env: ELEVENLABS_API_KEY

Primary env: ELEVENLABS_API_KEY

SKILL.md

ElevenLabs Speech-to-Text

Transcribe audio files using ElevenLabs' Scribe v2 model. Supports 90+ languages with speaker diarization.

Quick Start

# Basic transcription
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3

# With speaker diarization
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --diarize

# Specify language (improves accuracy)
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --lang en

# Full JSON output with timestamps
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --json

Options

Flag	Description
`--diarize`	Identify different speakers
`--lang CODE`	ISO language code (e.g., en, pt, es)
`--json`	Output full JSON with word timestamps
`--events`	Tag audio events (laughter, music, etc.)
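To make the mapping from flags to API parameters concrete, here is a hedged sketch of how such flags might be translated into request form fields. The function `flags_to_fields` is hypothetical, the field names (`diarize`, `tag_audio_events`, `language_code`) are assumptions about the ElevenLabs API, and treating `--json` as an output-only switch is a guess about the script's behavior.

```shell
# Translate CLI flags into form fields, one "name=value" per line.
# Field names are assumptions, not read from transcribe.sh.
flags_to_fields() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --diarize) echo "diarize=true" ;;
      --events)  echo "tag_audio_events=true" ;;
      --lang)
        if [ "$#" -lt 2 ]; then
          echo "--lang requires a language code" >&2
          return 2
        fi
        echo "language_code=$2"
        shift
        ;;
      --json) ;;  # assumed to change only how the response is printed
      *)
        echo "unknown flag: $1" >&2
        return 2
        ;;
    esac
    shift
  done
}
```

For example, `flags_to_fields --diarize --lang pt` prints `diarize=true` and `language_code=pt` on separate lines.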

Supported Formats

All major audio/video formats: mp3, m4a, wav, ogg, webm, mp4, etc.

API Key

Set the ELEVENLABS_API_KEY environment variable, or configure the key in clawdbot.json:

{
  skills: {
    entries: {
      "elevenlabs-stt": {
        apiKey: "sk_..."
      }
    }
  }
}
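When using the environment-variable route, it can help to confirm that the key is visible to the shell without echoing the whole secret. The helper below is a hypothetical sketch (`api_key_status` is not part of the skill); it prints only the first three characters and the length.

```shell
# Report whether ELEVENLABS_API_KEY is set, without printing the full key.
api_key_status() {
  local key prefix
  key="${ELEVENLABS_API_KEY:-}"
  if [ -z "$key" ]; then
    echo "ELEVENLABS_API_KEY is not set" >&2
    return 1
  fi
  prefix="$(printf '%.3s' "$key")"
  echo "ELEVENLABS_API_KEY is set (${prefix}..., ${#key} chars)"
}

# Typical use:
#   export ELEVENLABS_API_KEY="sk_..."
#   api_key_status
```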

Examples

# Transcribe a WhatsApp voice note
{baseDir}/scripts/transcribe.sh ~/Downloads/voice_note.ogg

# Meeting recording with multiple speakers
{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize --lang en

# Get JSON for processing
{baseDir}/scripts/transcribe.sh podcast.mp3 --json > transcript.json
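Once the JSON is saved, jq can pull out the parts you need. The sketch below assumes a response shape with a top-level `text` field and a `words` array whose entries carry `text`, `type`, `start`, `end`, and `speaker_id`; that schema is an assumption and should be checked against an actual `transcript.json`. The function names are hypothetical.

```shell
# Print only the transcript text from a saved response.
transcript_text() {
  jq -r '.text' "$1"
}

# Print one spoken word per line as: start<TAB>end<TAB>speaker<TAB>word.
# Spacing and audio-event entries are skipped (assumed "type" values).
transcript_words() {
  jq -r '.words[]
         | select(.type == "word")
         | [.start, .end, (.speaker_id // "unknown"), .text]
         | @tsv' "$1"
}

# Typical use:
#   transcript_text transcript.json
#   transcript_words transcript.json > words.tsv
```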
