Install
openclaw skills install pre-recorded-transcription

Transcribe pre-recorded audio files or URLs with Gladia. Use when the user needs batch/async transcription, speaker diarization, subtitles (SRT/VTT), PII redaction, translation, NER, summarization, chapterization, audio-to-LLM, or any audio intelligence on pre-recorded content. Always prefer the official SDK; fall back to raw REST only when SDK cannot satisfy the requirement.

Gladia's pre-recorded API transcribes audio and video files asynchronously.
SDK-first: always use the official SDK — see sdk-integration for policy, setup, and fallback criteria.
When NOT to use: If the user needs real-time / live transcription of a stream, microphone, or ongoing audio feed, use the live-transcription skill instead. Live transcription uses WebSocket sessions, not the pre-recorded API.
Consult these resources as needed:

- ./references/transcription-options.md: full option details
- ./references/delivery-and-response.md: full response JSON and event names
- ./references/managing-jobs.md: get, list, getFile, delete

| Endpoint | Method | SDK equivalent |
|---|---|---|
| /v2/upload | POST | transcribe() auto-uploads local files |
| /v2/pre-recorded | POST | create() / transcribe() |
| /v2/pre-recorded | GET | list() |
| /v2/pre-recorded/:id | GET | get() / poll() / transcribe() |
| /v2/pre-recorded/:id | DELETE | delete() |
| /v2/pre-recorded/:id/file | GET | getFile() |

The SDK transcribe() method handles upload, job creation, and polling in one call. Use this by default.
JavaScript:

    const result = await client.preRecorded().transcribe("./audio.mp3", {
      language_config: { languages: ["en"] },
      diarization: true,
    });
    console.log(result.result?.transcription?.full_transcript);

Python:

    result = client.prerecorded().transcribe(
        "audio.mp3",
        {"language_config": {"languages": ["en"]}, "diarization": True},
    )
    print(result.result.transcription.full_transcript)

Audio input can be a local file path, HTTP(S) URL, social/video URL, or binary file object. For full input types, see sdk-integration.
Use raw REST only when SDK use is not possible.
1. POST /v2/upload with multipart form data → get audio_url
2. POST /v2/pre-recorded with audio_url and config → get id
3. GET /v2/pre-recorded/:id until status: "done" (or use webhooks/callbacks)
4. Read transcription, diarization, translation, etc. from response

Use SDK methods for post-processing operations:

- JavaScript: client.preRecorded().get(id), .list(filters), .getFile(id), .delete(id)
- Python: client.prerecorded().get(id), .list(filters), .get_file(id), .delete(id)

For full JS/Python examples, pagination filters, and REST equivalents, see ./references/managing-jobs.md.
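
When the SDK is not an option, the first two steps of the raw REST sequence (upload, then job creation) can be sketched as follows. This is a minimal illustration, not Gladia's client: the `post` callable is a hypothetical transport you would supply (it should add authentication and send the multipart or JSON body), the `file_path` payload key is a placeholder for the multipart upload, and merging the config next to `audio_url` in one body is an assumption. Only the two paths and the `audio_url` and `id` response fields come from the steps listed earlier.

```python
from typing import Any, Callable, Mapping

# Hypothetical transport: post(path, payload) -> parsed JSON response.
# A real one would attach the API key and encode the body appropriately.
Post = Callable[[str, Mapping[str, Any]], Mapping[str, Any]]


def is_remote(source: str) -> bool:
    """True when the input is already an HTTP(S) URL and needs no upload."""
    return source.startswith(("http://", "https://"))


def submit_job(post: Post, source: str, config: Mapping[str, Any]) -> str:
    """Upload if needed, create the pre-recorded job, and return its id."""
    if is_remote(source):
        audio_url = source
    else:
        # Step 1: upload the local file; the response carries audio_url.
        # "file_path" is a placeholder for the multipart form field.
        audio_url = post("/v2/upload", {"file_path": source})["audio_url"]
    # Step 2: create the job from audio_url plus the transcription config
    # (assumed here to sit at the top level of the request body).
    job = post("/v2/pre-recorded", {"audio_url": audio_url, **config})
    return job["id"]


if __name__ == "__main__":
    # Offline demonstration with a fake transport that records the paths hit.
    seen = []

    def fake_post(path: str, payload: Mapping[str, Any]) -> Mapping[str, Any]:
        seen.append(path)
        return {"audio_url": "https://example.invalid/a.mp3", "id": "job-1"}

    print(submit_job(fake_post, "./audio.mp3", {"diarization": True}))
    print(seen)
```

Injecting the transport keeps the sequence testable without network access; the returned id is what the polling step would then use.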
All options are passed as the second argument to transcribe(). Key options:
| Option | Description |
|---|---|
| language_config | Expected languages, code switching |
| diarization | Speaker identification (pre-recorded only) |
| translation | Translate to target languages |
| summarization | Generate bullet points or paragraph summary |
| subtitles | Generate SRT/VTT files |
| pii_redaction | Redact PII (pre-recorded only) |
| audio_to_llm | Run custom LLM prompts on transcript |
| callback_url | Async webhook delivery |
For full option details, see ./references/transcription-options.md. For audio intelligence config, see audio-intelligence. For client-level retry/timeouts, see sdk-integration.
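
As a small illustration of assembling that second argument in Python, the sketch below builds the options dict from a few typed parameters. `build_options` is a hypothetical helper, not part of the SDK. It covers only the options whose value shapes appear in this document (`language_config` and `diarization`), plus `callback_url`, which is assumed to be a plain URL string. The other options are left out because their value shapes are documented in the referenced files, not here.

```python
from typing import Any, Dict, List, Optional


def build_options(
    languages: List[str],
    diarization: bool = False,
    callback_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the options dict passed as the second argument to transcribe()."""
    options: Dict[str, Any] = {
        "language_config": {"languages": list(languages)},
        "diarization": diarization,
    }
    # Include callback_url only when webhook delivery is wanted
    # (assumed to be a plain URL string).
    if callback_url is not None:
        options["callback_url"] = callback_url
    return options


if __name__ == "__main__":
    print(build_options(["en"], diarization=True))
    # Intended use, following the Python example earlier in this document:
    # client.prerecorded().transcribe("audio.mp3", build_options(["en"], diarization=True))
```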
For full response JSON and event names, see ./references/delivery-and-response.md.
| Constraint | Value |
|---|---|
| Max file size | 1000 MB |
| Max duration | 135 minutes (120 min for YouTube) |
| Enterprise max duration | 4h15 |
| Concurrency (paid) | 25 concurrent jobs |
| Concurrency (free) | 3 concurrent jobs |
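
The limits in the table lend themselves to a client-side check before submitting work. The sketch below is one hedged way to do that; `preflight` and `run_bounded` are hypothetical helpers, and the caller is assumed to already know the file size and duration. How the YouTube cap interacts with the enterprise limit is not stated in this document, so the sketch simply applies the enterprise limit whenever `enterprise=True`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Values copied from the constraints table.
MAX_FILE_MB = 1000
MAX_MINUTES = 135
MAX_MINUTES_YOUTUBE = 120
MAX_MINUTES_ENTERPRISE = 4 * 60 + 15  # 4h15
CONCURRENCY = {"paid": 25, "free": 3}


def preflight(
    size_mb: float,
    minutes: float,
    *,
    youtube: bool = False,
    enterprise: bool = False,
) -> List[str]:
    """Return the list of limit violations; an empty list means none found."""
    if enterprise:
        # Assumption: the enterprise limit replaces the other duration caps.
        limit = MAX_MINUTES_ENTERPRISE
    elif youtube:
        limit = MAX_MINUTES_YOUTUBE
    else:
        limit = MAX_MINUTES
    problems = []
    if size_mb > MAX_FILE_MB:
        problems.append(f"file is {size_mb} MB, limit is {MAX_FILE_MB} MB")
    if minutes > limit:
        problems.append(f"duration is {minutes} min, limit is {limit} min")
    return problems


def run_bounded(fn: Callable[[T], R], items: Iterable[T], tier: str = "free") -> List[R]:
    """Apply fn to each item with at most the tier's job count in flight."""
    with ThreadPoolExecutor(max_workers=CONCURRENCY[tier]) as pool:
        # pool.map returns results in input order.
        return list(pool.map(fn, items))


if __name__ == "__main__":
    print(preflight(1200, 200))
    print(run_bounded(len, ["a.mp3", "bb.mp3"], tier="free"))
```

Capping the worker pool at the tier's concurrency keeps a batch from submitting more jobs at once than the account allows.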
The SDK handles polling automatically; transcribe() polls until the job completes, with a configurable interval and timeout (both in milliseconds):

    const result = await client.preRecorded().transcribe(audio, options, {
      interval: 5000, // Poll every 5s
      timeout: 600000, // Timeout after 10 minutes
    });

If using raw REST instead of the SDK, poll GET /v2/pre-recorded/:id yourself until status is "done", or use webhooks/callbacks.
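
A hand-rolled polling loop for the raw REST case might look like the sketch below. It is an illustration under stated assumptions, not the SDK's implementation: `get_job` is a hypothetical zero-argument callable that performs GET /v2/pre-recorded/:id and returns the parsed JSON, the interval and timeout are in seconds, and treating `"error"` as a terminal failure status is an assumption (this document only names `"done"`).

```python
import time
from typing import Any, Callable, Mapping


def poll_until_done(
    get_job: Callable[[], Mapping[str, Any]],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Mapping[str, Any]:
    """Call get_job every `interval` seconds until status is "done"."""
    deadline = clock() + timeout
    while True:
        job = get_job()
        status = job.get("status")
        if status == "done":
            return job
        if status == "error":  # assumed failure status, not confirmed here
            raise RuntimeError(f"transcription job failed: {job}")
        # Stop instead of sleeping past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"job not done after {timeout} seconds")
        sleep(interval)


if __name__ == "__main__":
    # Offline demonstration: a fake job that finishes on the third check.
    responses = iter([
        {"status": "queued"},
        {"status": "processing"},
        {"status": "done", "result": {"transcription": {"full_transcript": "hi"}}},
    ])
    job = poll_until_done(lambda: next(responses), sleep=lambda _seconds: None)
    print(job["result"]["transcription"]["full_transcript"])
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without waiting; in normal use the defaults apply.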
- code_switching: true with empty languages triggers 100+ language evaluation. Always provide 3-5 expected languages.
- The file retrieval endpoint (getFile()) is /v2/pre-recorded/:id/file, not /v2/pre-recorded/:id/audio.

For the full list of gotchas and diagnostics, see the troubleshooting skill.
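
The code-switching gotcha can be caught before a request is sent. The sketch below is a hypothetical lint helper, not an SDK feature; it assumes `code_switching` sits inside `language_config` next to `languages`, which is how the options table describes that option.

```python
from typing import Any, List, Mapping


def lint_language_config(options: Mapping[str, Any]) -> List[str]:
    """Warn when code switching is enabled without any expected languages."""
    config = options.get("language_config") or {}
    languages = config.get("languages") or []
    warnings = []
    if config.get("code_switching") and not languages:
        warnings.append(
            "code_switching is enabled with no languages; "
            "list the 3-5 languages you expect"
        )
    return warnings


if __name__ == "__main__":
    risky = {"language_config": {"code_switching": True, "languages": []}}
    for message in lint_language_config(risky):
        print(message)
```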