Speech is Cheap Transcribe

Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.

Install

openclaw skills install asr

Speech is Cheap (SIC) Skill

Fast, accurate, and incredibly inexpensive automatic speech-to-text transcription service.

🚀 Why use this skill?

  • Disruptive Pricing: $0.06 - $0.12 per hour (2-15x cheaper than Deepgram or OpenAI).
  • Extreme Speed: 100 minutes of audio transcribes in ~1 minute.
  • Multilingual: Supports 100 languages with auto-detection.
  • Agent-Ready: Designed for high-volume, automated pipelines.

🛠 Setup

1. Get an API Key

Sign up at speechischeap.com. Use code CH5 for $5 off.

2. Configure Authentication

This skill looks for your API key in the SIC_API_KEY environment variable.

Add this to your .env or agent config:

SIC_API_KEY=your_key_here
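
For automated pipelines it can help to verify the key before launching a batch of jobs. The following is a minimal sketch, not part of the skill itself: `require_sic_key` is a hypothetical helper name, and the only thing it relies on from this document is the `SIC_API_KEY` variable name.

```shell
# Hypothetical pre-flight check (not shipped with the skill):
# returns non-zero when SIC_API_KEY is unset or empty.
require_sic_key() {
  if [ -z "${SIC_API_KEY:-}" ]; then
    echo "SIC_API_KEY is not set; add it to .env or your agent config" >&2
    return 1
  fi
}
```

A pipeline script might call `require_sic_key || exit 1` as its first step, so that a missing key stops the run before any audio is uploaded.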

📖 Usage

🤖 TL;DR for Agents

With this skill installed, you can transcribe any audio URL from an OpenClaw session and get the JSON result immediately by running:

./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"

Transcribe a URL

# Basic transcription
./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3"

# Advanced transcription with options
./skills/asr/scripts/asr.sh transcribe --url "https://example.com/audio.mp3" \
  --speakers --words --labels \
  --language "en" \
  --format "srt" \
  --private

Transcribe a Local File

Use --file for audio that is already on disk. The skill uploads the file automatically before transcription starts.

# Upload and transcribe local media
./skills/asr/scripts/asr.sh transcribe --file "./local-audio.wav"

# Upload with webhook callback
./skills/asr/scripts/asr.sh transcribe --file "./local-audio.wav" --webhook "https://mysite.com/callback"

# Note: For local files, the skill handles the multipart upload to
# https://upload.speechischeap.com before starting the transcription.

Supported Options

  • --speakers: Enable speaker diarization
  • --words: Enable word-level timestamps
  • --labels: Enable audio labeling (music, noise, etc.)
  • --stream: Enable streaming output
  • --private: Do not store audio/transcript (privacy mode)
  • --language <code>: ISO language code (e.g., 'en', 'es')
  • --confidence <float>: Minimum confidence threshold (default 0.5)
  • --format <fmt>: Output format (json, srt, vtt, webvtt)
  • --webhook <url>: URL to receive job completion payload
  • --segment-duration <n>: Segment duration in seconds (default 30)
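
The options above can be combined in a batch job. The sketch below transcribes every .wav file in a directory to SRT with speaker diarization. It uses only flags listed above, but several things are assumptions: `transcribe_dir` and `ASR_BIN` are names invented for this example, and it assumes the CLI writes the formatted transcript to standard output when `--format "srt"` is given.

```shell
# Path to the skill's CLI. ASR_BIN is a name chosen for this sketch so the
# path can be overridden; it is not a variable the skill itself reads.
ASR_BIN="${ASR_BIN:-./skills/asr/scripts/asr.sh}"

# transcribe_dir IN_DIR OUT_DIR
# Hypothetical helper: writes one .srt file per .wav file found in IN_DIR.
# The body runs in a subshell so its variables do not leak into the caller.
transcribe_dir() (
  in_dir="$1"
  out_dir="$2"
  mkdir -p "$out_dir" || exit 1
  for f in "$in_dir"/*.wav; do
    [ -e "$f" ] || continue   # the glob matched nothing
    name="$(basename "${f%.wav}")"
    # Assumption: the transcript is printed on standard output.
    "$ASR_BIN" transcribe --file "$f" --format "srt" --speakers \
      > "$out_dir/$name.srt"
  done
)
```

For example, `transcribe_dir ./recordings ./subtitles` would produce ./subtitles/<name>.srt for each ./recordings/<name>.wav.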

Check Job Status

./skills/asr/scripts/asr.sh status "job-id-here"

🤖 For Agents

The asr.sh command-line tool returns JSON by default when successful, making it easy to pipe into other tools or parse directly.

If SIC_API_KEY is not set, the tool prints a clear error message with a direct link to the signup page.
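
An agent that stores results on disk may want to avoid leaving partial files behind when a job fails. The following is a sketch under stated assumptions: it assumes the JSON is printed on standard output and that asr.sh exits with a non-zero status on failure, neither of which is confirmed here. `transcribe_to_file` and `ASR_BIN` are names introduced for this example only.

```shell
# Path to the skill's CLI. ASR_BIN is a name chosen for this sketch so the
# path can be overridden; it is not a variable the skill itself reads.
ASR_BIN="${ASR_BIN:-./skills/asr/scripts/asr.sh}"

# transcribe_to_file URL OUT_FILE
# Hypothetical helper: writes the transcript to OUT_FILE only if the CLI
# succeeds. Output goes to a temporary file first, so a failed run never
# leaves a truncated OUT_FILE behind.
transcribe_to_file() (
  url="$1"
  out="$2"
  tmp="$(mktemp)" || exit 1
  if "$ASR_BIN" transcribe --url "$url" > "$tmp"; then
    mv "$tmp" "$out"
  else
    rc=$?   # exit status of the failed asr.sh call
    rm -f "$tmp"
    echo "transcription failed for $url (exit $rc)" >&2
    exit "$rc"
  fi
)
```

A caller can then branch on the helper's exit status, for example `transcribe_to_file "https://example.com/audio.mp3" result.json || echo "retry later"`.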