Elevenlabs Transcribe

Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The scripts and declared requirements (ffmpeg, python3, ELEVENLABS_API_KEY) align with a speech-to-text skill using ElevenLabs. However, SKILL.md calls this the 'Official ElevenLabs skill' while the registry 'Source' is unknown and the owner ID does not obviously belong to ElevenLabs — possible impersonation or mislabeling.
Instruction Scope
The runtime instructions and scripts stay within the stated purpose: convert audio (file, mic, URL) to text and send audio to ElevenLabs via their SDK. One minor scope note: the Python code calls load_dotenv(), which will read a local .env file if present — that can surface other environment variables from disk (not declared in requires.env).
Install Mechanism
There is no platform install spec, but the provided shell wrapper auto-creates a local virtualenv and runs pip install -r requirements.txt. Main dependencies are pinned with hashes for supply-chain integrity (elevenlabs, pydub, python-dotenv), but some platform-specific packages (sounddevice, numpy) are not hashed. pip installs from PyPI on first run (network activity) and writes a .venv directory under the skill folder.
Credentials
Only ELEVENLABS_API_KEY is declared and used; that is appropriate for a transcription client. Note that load_dotenv() may read a .env file from disk and load additional env vars implicitly. The code will transmit audio and the API key (via the ElevenLabs SDK) to ElevenLabs' service — this is expected behavior but worth confirming you're comfortable sending audio to that provider.
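One way to see what a .env file would add beyond the declared variable is to list its keys before running the skill. The sketch below is a simplified approximation, not python-dotenv's actual parser: it only recognizes plain KEY=VALUE lines with an optional "export " prefix, and the function names are hypothetical.

```python
from typing import List

# The only variable the skill declares in its runtime requirements.
DECLARED = {"ELEVENLABS_API_KEY"}


def dotenv_keys(text: str) -> List[str]:
    """Return variable names found in KEY=VALUE lines (simplified parsing)."""
    keys = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key = line.split("=", 1)[0].strip()
        if key:
            keys.append(key)
    return keys


def undeclared_keys(text: str) -> List[str]:
    """Return keys that the skill does not declare but a .env would expose."""
    return [key for key in dotenv_keys(text) if key not in DECLARED]


if __name__ == "__main__":
    sample = "ELEVENLABS_API_KEY=abc\nexport AWS_SECRET=xyz\n"
    for key in undeclared_keys(sample):
        print("undeclared variable in .env:", key)
```

To audit a real file, pass its contents (for example, from `pathlib.Path(".env").read_text()`) to `undeclared_keys`; which directory the skill actually searches is not stated here, so check each candidate location.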
Persistence & Privilege
The skill does not request always:true and won't be force-included. It sets up a per-skill .venv and an installed marker in the skill directory; it doesn't modify other skills or system-wide agent settings.
What to consider before installing
This skill's code behaves like a normal ElevenLabs transcription client, but before installing:

  1) Confirm the publisher. SKILL.md claims 'Official ElevenLabs', but the source and owner are not ElevenLabs; prefer official plugins from the vendor when possible.
  2) Review and protect your ELEVENLABS_API_KEY (use a scoped or test key if possible).
  3) Be aware that the script will create a local .venv and pip-install packages from PyPI (network activity); consider installing in an isolated environment or container.
  4) Note that load_dotenv() will read a .env file in the skill directory and could load other env vars; remove any secrets you don't want read.
  5) If you need stronger supply-chain guarantees, request that every requirement, including the platform-specific packages, be pinned with verified hashes, or review and install the code inside an isolated sandbox first.


Current version: v1.0.1
latest: vk974g0tvkwspjd8zrrxg94dqpn80exh8

Runtime requirements

🎙️ Clawdis
Bins: ffmpeg, python3
Env: ELEVENLABS_API_KEY
Primary env: ELEVENLABS_API_KEY

SKILL.md

ElevenLabs Speech-to-Text

Official ElevenLabs skill for speech-to-text transcription.

Convert audio to text with state-of-the-art accuracy. Supports 90+ languages, speaker diarization, and realtime streaming.

Prerequisites

  • ffmpeg installed (brew install ffmpeg on macOS)
  • ELEVENLABS_API_KEY environment variable set
  • Python 3.8+ (dependencies auto-install on first run)

Usage

{baseDir}/scripts/transcribe.sh <audio_file> [options]
{baseDir}/scripts/transcribe.sh --url <stream_url> [options]
{baseDir}/scripts/transcribe.sh --mic [options]

Examples

Batch Transcription

Transcribe a local audio file:

{baseDir}/scripts/transcribe.sh recording.mp3

With speaker identification:

{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize

Get full JSON response with timestamps:

{baseDir}/scripts/transcribe.sh interview.wav --diarize --json

Realtime Streaming

Stream from a URL (e.g., live radio, podcast):

{baseDir}/scripts/transcribe.sh --url https://npr-ice.streamguys1.com/live.mp3

Transcribe from microphone:

{baseDir}/scripts/transcribe.sh --mic

Stream a local file in realtime (useful for testing):

{baseDir}/scripts/transcribe.sh audio.mp3 --realtime

Quiet Mode for Agents

Suppress status messages on stderr:

{baseDir}/scripts/transcribe.sh --mic --quiet

Options

Option         Description
--diarize      Identify different speakers in the audio
--lang CODE    ISO language hint (e.g., en, pt, es, fr)
--json         Output full JSON with timestamps and metadata
--events       Tag audio events (laughter, music, applause)
--realtime     Stream local file instead of batch processing
--partials     Show interim transcripts during realtime mode
-q, --quiet    Suppress status messages (recommended for agents)
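
An agent that calls the script programmatically may prefer to assemble the argument list rather than format a shell string. The following is a minimal sketch using only the flags listed above; the function name `build_command` and the rule that exactly one input source is given are assumptions based on the three usage forms, not documented behavior.

```python
from typing import List, Optional


def build_command(
    script: str,
    audio_file: Optional[str] = None,
    url: Optional[str] = None,
    mic: bool = False,
    diarize: bool = False,
    lang: Optional[str] = None,
    json_output: bool = False,
    events: bool = False,
    realtime: bool = False,
    partials: bool = False,
    quiet: bool = False,
) -> List[str]:
    """Build an argv list for transcribe.sh from keyword options."""
    # Assumption: the three usage forms (file, --url, --mic) are exclusive.
    if sum([audio_file is not None, url is not None, mic]) != 1:
        raise ValueError("choose exactly one of audio_file, url, or mic")

    cmd = [script]
    if audio_file is not None:
        cmd.append(audio_file)
    elif url is not None:
        cmd += ["--url", url]
    else:
        cmd.append("--mic")

    if diarize:
        cmd.append("--diarize")
    if lang is not None:
        cmd += ["--lang", lang]
    if json_output:
        cmd.append("--json")
    if events:
        cmd.append("--events")
    if realtime:
        cmd.append("--realtime")
    if partials:
        cmd.append("--partials")
    if quiet:
        cmd.append("--quiet")
    return cmd


if __name__ == "__main__":
    print(build_command("scripts/transcribe.sh", audio_file="meeting.mp3",
                        diarize=True, json_output=True))
```

Passing the resulting list to `subprocess.run` avoids shell quoting issues with file names that contain spaces.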

Output Format

Text Mode (default)

Plain text transcription:

The quick brown fox jumps over the lazy dog.

JSON Mode (--json)

{
  "text": "The quick brown fox jumps over the lazy dog.",
  "language_code": "eng",
  "language_probability": 0.98,
  "words": [
    {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"}
  ]
}
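
With --diarize and --json, the per-word `speaker_id` field can be folded into speaker turns. The sketch below assumes the response shape shown above; the `speaker_turns` helper is hypothetical, and skipping entries whose `type` is not "word" is an assumption about how non-word entries should be treated. The extra words in the demo data are made up for illustration.

```python
import json
from typing import List, Tuple


def speaker_turns(response: dict) -> List[Tuple[str, str]]:
    """Merge consecutive words from the same speaker into (speaker, text) turns."""
    turns: List[Tuple[str, str]] = []
    for entry in response.get("words", []):
        # Assumption: only entries typed "word" carry transcript text.
        if entry.get("type") != "word":
            continue
        speaker = entry.get("speaker_id", "unknown")
        if turns and turns[-1][0] == speaker:
            turns[-1] = (speaker, turns[-1][1] + " " + entry["text"])
        else:
            turns.append((speaker, entry["text"]))
    return turns


if __name__ == "__main__":
    # Illustrative data in the documented shape, not real API output.
    raw = """
    {"text": "The quick fox",
     "words": [
       {"text": "The", "start": 0.0, "end": 0.15, "type": "word", "speaker_id": "speaker_0"},
       {"text": "quick", "start": 0.2, "end": 0.4, "type": "word", "speaker_id": "speaker_0"},
       {"text": "fox", "start": 0.5, "end": 0.7, "type": "word", "speaker_id": "speaker_1"}
     ]}
    """
    for speaker, text in speaker_turns(json.loads(raw)):
        print(f"{speaker}: {text}")
```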

Realtime Mode

Final transcripts print as they're committed. With --partials, interim transcripts are also printed, each prefixed with [partial]:

[partial] The quick
[partial] The quick brown fox
The quick brown fox jumps over the lazy dog.
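
A consumer that only wants committed text can drop the interim lines. This is a minimal sketch, assuming partial and final transcripts arrive on the same stream, one per line, with the "[partial] " prefix shown above; the `final_transcripts` name is hypothetical.

```python
from typing import Iterable, List

PARTIAL_PREFIX = "[partial] "


def final_transcripts(lines: Iterable[str]) -> List[str]:
    """Keep committed transcript lines and discard interim [partial] ones."""
    finals = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines and anything marked as an interim transcript.
        if not line.strip() or line.startswith(PARTIAL_PREFIX):
            continue
        finals.append(line)
    return finals


if __name__ == "__main__":
    output = [
        "[partial] The quick\n",
        "[partial] The quick brown fox\n",
        "The quick brown fox jumps over the lazy dog.\n",
    ]
    print(final_transcripts(output))
```

Because the function accepts any iterable of lines, it can also be fed the `stdout` of a running process line by line.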

Supported Formats

Audio: MP3, WAV, M4A, FLAC, OGG, WebM, AAC, AIFF, Opus

Video: MP4, AVI, MKV, MOV, WMV, FLV, WebM, MPEG, 3GPP

Limits: Up to 3GB file size, 10 hours duration
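
A quick local check can catch an unsupported or oversized file before any upload. The sketch below makes several assumptions: the extension spellings are a guess at how the listed formats usually appear in file names, "3GB" is read as 3 × 10^9 bytes, and the 10-hour duration limit is not checked because that would require probing the media (for example with ffmpeg). The `preflight` name is hypothetical.

```python
import os
from typing import List

# Assumed file extensions for the formats listed above.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".aac", ".aiff", ".opus",
    ".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".mpeg", ".3gp",
}

# Assumption: the 3GB limit is decimal gigabytes.
MAX_BYTES = 3 * 1000**3


def preflight(path: str, size_bytes: int) -> List[str]:
    """Return a list of problems found; an empty list means no issue detected."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unrecognized extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems


if __name__ == "__main__":
    print(preflight("notes.txt", 1024))
```

For a file on disk, `os.path.getsize(path)` supplies the size argument.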

Error Handling

The script exits with non-zero status on errors:

  • Missing API key: Set ELEVENLABS_API_KEY environment variable
  • File not found: Check the file path exists
  • Missing ffmpeg: Install with your package manager
  • API errors: Check API key validity and rate limits
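
Because failures are signaled through the exit status, a caller can turn them into exceptions. The following is a minimal sketch, assuming the transcript is written to stdout and error details to stderr; `run_transcription` and `TranscriptionError` are hypothetical names, not part of the skill.

```python
import subprocess
from typing import List


class TranscriptionError(RuntimeError):
    """Raised when the transcription command exits with a non-zero status."""


def run_transcription(cmd: List[str]) -> str:
    """Run the command and return its stdout, or raise on a non-zero exit."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # Assumption: the script explains the failure on stderr.
        raise TranscriptionError(
            f"exit status {proc.returncode}: {proc.stderr.strip()}"
        )
    return proc.stdout.strip()
```

A call such as `run_transcription(["scripts/transcribe.sh", "recording.mp3", "--quiet"])` would then either return the transcript or raise with the script's error text; the path shown is illustrative.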

When to Use Each Mode

Scenario                          Command
Transcribe a recording            ./transcribe.sh file.mp3
Meeting with multiple speakers    ./transcribe.sh meeting.mp3 --diarize
Live radio/podcast stream         ./transcribe.sh --url <url>
Voice input from user             ./transcribe.sh --mic --quiet
Need word timestamps              ./transcribe.sh file.mp3 --json
