Install
openclaw skills install gladia-troubleshooting
Diagnose and fix common Gladia API issues. Use when the user encounters errors (401, 403, 429), unexpected behavior, poor transcription quality, billing confusion, audio format problems, WebSocket disconnections, polling failures, or asks about limits and rate limiting. SDK-first diagnostics: many issues are solved by migrating to the official SDK.
Common issues, gotchas, and their solutions when working with the Gladia API.
SDK-first diagnostics: first verify the user is on the official SDK — many issues (polling, reconnection, retries) are solved automatically. See gladia-sdk-integration for setup and policy.
When NOT to use: For initial SDK setup and configuration, use gladia-sdk-integration. For feature-specific guidance (options, parameters, response structure), use gladia-pre-recorded-transcription or gladia-live-transcription.
Consult these resources as needed:
- SDK: apiKey (JS) / api_key (Python) constructor option, or set the GLADIA_API_KEY environment variable
- Raw REST: x-gladia-key header

Concurrency limits by plan:
| Plan | Pre-recorded | Live | Notes |
|---|---|---|---|
| Free | 3 | 1 | 10 hrs/month total |
| Starter | 25 | 30 | — |
| Growth | 25 | 30 | — |
| Enterprise | Unlimited | Unlimited | — |
Fix: Wait for in-progress jobs to complete before starting new ones, or upgrade your plan.
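One way to wait for in-progress jobs is to cap how many run at once with a bounded worker pool. The sketch below is a minimal illustration, not SDK code: `transcribe_one` is a hypothetical stand-in for whatever call submits a job and blocks until it finishes, and the default limit of 3 is the Free plan's pre-recorded value from the table.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

FREE_PLAN_PRERECORDED_LIMIT = 3  # Free plan, pre-recorded column


def transcribe_all(
    files: Iterable[str],
    transcribe_one: Callable[[str], str],  # hypothetical: submit one job, block until done
    max_concurrent: int = FREE_PLAN_PRERECORDED_LIMIT,
) -> List[str]:
    """Run jobs with at most `max_concurrent` in flight at any time."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # map() keeps results in input order; extra files wait for a free worker.
        return list(pool.map(transcribe_one, files))
```

Setting `max_concurrent` to your plan's limit keeps additional files queued locally instead of hitting the API while earlier jobs are still running.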
Problem: Enabling code_switching: true with an empty languages array causes evaluation across 100+ languages and frequent misdetections.
Fix: Always provide 3-5 expected languages:
{
  "language_config": {
    "languages": ["en", "fr", "es"],
    "code_switching": true
  }
}
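A small pre-submit check can catch this misconfiguration before a request is sent. This is a sketch under the assumption that you build `language_config` as a plain dict; `check_language_config` is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Dict, List


def check_language_config(config: Dict[str, Any]) -> List[str]:
    """Return warnings for risky code-switching settings (empty list means OK)."""
    warnings: List[str] = []
    languages = config.get("languages") or []
    if not config.get("code_switching"):
        return warnings  # nothing to check when code switching is off
    if not languages:
        warnings.append("code_switching is on but languages is empty")
    elif not 3 <= len(languages) <= 5:
        warnings.append(f"expected 3-5 languages, got {len(languages)}")
    return warnings
```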
Problem: intensity values above 0.6 cause false positives where unrelated words get replaced by vocabulary entries.
Fix: Keep intensity at 0.4-0.6. Use pronunciations for better recognition instead of raising intensity:
{
  "vocabulary": [
    { "value": "Gladia", "pronunciations": ["gla-dee-ah"], "intensity": 0.5 }
  ]
}
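If vocabulary entries come from user input or a config file, it may help to force intensity into the recommended range before sending. A minimal sketch; `clamp_vocabulary_intensity` is a hypothetical helper and the 0.4-0.6 bounds are the ones recommended above.

```python
from typing import Any, Dict, List


def clamp_vocabulary_intensity(
    vocabulary: List[Dict[str, Any]], low: float = 0.4, high: float = 0.6
) -> List[Dict[str, Any]]:
    """Return copies of the entries with any intensity forced into [low, high]."""
    clamped = []
    for entry in vocabulary:
        fixed = dict(entry)  # copy so the caller's data is not mutated
        if "intensity" in fixed:
            fixed["intensity"] = min(high, max(low, fixed["intensity"]))
        clamped.append(fixed)
    return clamped
```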
Problem: Pre-recorded files over 135 minutes may fail without a clear error message.
Fix: Split long audio into chunks of ~60 minutes before uploading. For enterprise (4h15 limit), contact support.
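The chunk boundaries can be computed up front and then handed to whatever tool does the actual cutting. This sketch only does the arithmetic; `chunk_ranges` is a hypothetical helper, and cutting at fixed offsets can split a word, so you may prefer to nudge boundaries to a pause.

```python
from typing import List, Tuple


def chunk_ranges(total_seconds: int, chunk_seconds: int = 60 * 60) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in seconds that cover the whole file."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    ranges = []
    start = 0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)  # last chunk may be shorter
        ranges.append((start, end))
        start = end
    return ranges


# A 150-minute file becomes two 60-minute chunks and one 30-minute chunk.
print(chunk_ranges(150 * 60))  # [(0, 3600), (3600, 7200), (7200, 9000)]
```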
Problem: Sending 2-channel (stereo) audio is billed as 2x the duration.
Fix: Merge to mono if you don't need per-channel speaker identification. Only use multi-channel intentionally for distinct audio sources.
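For raw PCM, merging to mono is an average of each left/right sample pair. The sketch below assumes interleaved 16-bit little-endian stereo samples (the common WAV layout); for other formats a dedicated audio tool is a safer choice.

```python
import array
import sys


def downmix_stereo_to_mono(pcm: bytes) -> bytes:
    """Average interleaved L/R 16-bit little-endian samples into one channel."""
    samples = array.array("h")  # signed 16-bit, native byte order
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian
    mono = array.array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples) - 1, 2)),
    )
    if sys.byteorder == "big":
        mono.byteswap()  # emit little-endian again
    return mono.tobytes()
```

The output is half the size of the input and represents one channel of the same duration.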
Problem: If the WebSocket drops, creating a new session loses context.
Fix (SDK — recommended): The SDK handles reconnection automatically with configurable wsRetry. No action needed if using the SDK.
Fix (raw WebSocket): Reconnect to the same WebSocket URL to resume the session. Do NOT call /v2/live again.
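For the raw WebSocket case, the key point is that every retry targets the URL you already have. A minimal sketch: `connect` is a hypothetical callable that opens a WebSocket to a URL with your client library of choice, and the retry count and delays are illustrative values, not Gladia recommendations.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def reconnect_same_url(
    connect: Callable[[str], T],  # hypothetical: opens a WebSocket to the given URL
    ws_url: str,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry the existing session URL; never request a new session here."""
    for attempt in range(max_attempts):
        try:
            return connect(ws_url)
        except OSError:  # includes ConnectionError and its subclasses
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```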
Problem: Rapidly polling /v2/pre-recorded/:id wastes requests and may trigger rate limits.
Fix (SDK — recommended): The SDK handles polling automatically. Use transcribe() which includes built-in backoff, or configure poll() directly:
// JavaScript
const result = await client.preRecorded().transcribe(audio, options, {
  interval: 5000, // 5 seconds between polls
});

# Python
result = client.prerecorded().transcribe(
    "audio.mp3",
    options,
    {"interval": 5000},  # 5 seconds between polls
)
Fix (raw REST): Implement exponential backoff (start at 3s, max 30s), or use webhooks/callbacks instead.
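The backoff schedule for the raw REST case can be sketched as follows. `fetch_status` is a hypothetical callable that performs one GET on /v2/pre-recorded/:id and returns the parsed JSON; the terminal status values "done" and "error" are assumptions to check against the API reference.

```python
import time
from typing import Any, Callable, Dict, Iterator


def backoff_delays(start: float = 3.0, cap: float = 30.0, factor: float = 2.0) -> Iterator[float]:
    """Yield 3, 6, 12, 24, 30, 30, ... seconds."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def poll_until_finished(
    fetch_status: Callable[[], Dict[str, Any]],  # hypothetical: one status request
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 50,
) -> Dict[str, Any]:
    """Poll with exponential backoff until the job reaches a terminal status."""
    for _, delay in zip(range(max_polls), backoff_delays()):
        result = fetch_status()
        if result.get("status") in ("done", "error"):  # assumed terminal values
            return result
        sleep(delay)
    raise TimeoutError(f"job not finished after {max_polls} polls")
```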
Problem: Leaving a WebSocket open without sending stop_recording keeps the session hanging until the 3-hour timeout.
Fix: Always explicitly call session.stopRecording() (or session.stop_recording() in Python) when done. Implement cleanup in error handlers.
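In Python, a context manager is one way to make the cleanup unconditional. This sketch assumes only that the session object exposes `stop_recording()` as named above; `live_session` itself is a hypothetical wrapper.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def live_session(session: Any) -> Iterator[Any]:
    """Guarantee stop_recording() runs even if streaming raises."""
    try:
        yield session
    finally:
        session.stop_recording()
```

Wrapping the streaming loop in `with live_session(session) as s:` means an exception in the loop still closes the session instead of leaving it open until the timeout.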
Problem: Real-time results come only as final transcripts by default.
Fix: Enable partial transcripts in session config:
{
  "messages_config": {
    "receive_partial_transcripts": true
  }
}
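Once partials are enabled, the client has to treat them as provisional text that a final transcript later replaces. The sketch below assumes a message shape of `{"type": "transcript", "data": {"is_final": ..., "utterance": {"text": ...}}}`; treat those field names as assumptions and confirm them against the live API reference.

```python
from typing import Any, Dict, List


class TranscriptView:
    """Keeps committed lines separate from the in-progress partial."""

    def __init__(self) -> None:
        self.final_lines: List[str] = []
        self.partial: str = ""

    def handle(self, message: Dict[str, Any]) -> None:
        if message.get("type") != "transcript":  # assumed message type
            return
        data = message.get("data", {})
        text = data.get("utterance", {}).get("text", "")
        if data.get("is_final"):  # assumed flag name
            self.final_lines.append(text)
            self.partial = ""  # the final replaces any partial shown so far
        else:
            self.partial = text
```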
Problem: Speaker diarization is only available for pre-recorded transcription.
Fix: For live multi-speaker scenarios, use multi-channel audio with one speaker per channel and track by channel number.
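Tracking by channel amounts to mapping each channel number to a known speaker. A minimal sketch, assuming final transcript messages carry the channel as `data.utterance.channel`; that field path and the `group_by_channel` helper are assumptions for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_by_channel(
    messages: Iterable[Dict[str, Any]], speaker_names: Dict[int, str]
) -> Dict[str, List[str]]:
    """Collect final transcript text under the speaker assigned to each channel."""
    turns: Dict[str, List[str]] = defaultdict(list)
    for message in messages:
        data = message.get("data", {})
        if message.get("type") != "transcript" or not data.get("is_final"):
            continue  # skip partials and non-transcript messages
        utterance = data.get("utterance", {})
        channel = utterance.get("channel", 0)  # assumed field name
        speaker = speaker_names.get(channel, f"channel {channel}")
        turns[speaker].append(utterance.get("text", ""))
    return dict(turns)
```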
Problem: pii_redaction: true is silently ignored in live transcription.
Fix: PII redaction only works for pre-recorded. For live compliance needs, implement client-side redaction on the transcript text.
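Client-side redaction can start as simple pattern replacement on each transcript string. The patterns below are deliberately rough illustrations covering only email addresses and phone-like digit runs; real compliance work needs a broader, tested rule set.

```python
import re

# Illustrative patterns only; they will miss many PII forms.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


print(redact("Mail jane.doe@example.com or call +1 415-555-0100 today"))
# Mail [EMAIL] or call [PHONE] today
```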
- encoding, sample_rate, bit_depth, channels match your actual audio stream
- pre_processing.audio_enhancer: true (live)
- languages list
- code_switching with 3-5 expected languages
- callback_url is publicly reachable (not localhost)

Webhooks are powered by Svix. Verify using the Svix libraries:
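import { Webhook } from "svix";

// webhookSecret is the signing secret for this endpoint.
const wh = new Webhook(webhookSecret);
// payload must be the raw request body; verify() throws if the signature is invalid.
wh.verify(payload, headers);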
Before submitting transcription work:
- API key is set (GLADIA_API_KEY env var)
- languages list has 3-5 entries
- Live sessions are closed with stopRecording() / stop_recording()
- Error handling checks the response (status, error_message)
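Part of this checklist can be automated before each submission. A minimal sketch covering the API key and callback checks: `preflight` is a hypothetical helper, and it assumes `callback_url` sits at the top level of your options dict.

```python
import os
from typing import Any, Dict, List, Mapping, Optional
from urllib.parse import urlparse


def preflight(options: Dict[str, Any], env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    env = os.environ if env is None else env
    problems: List[str] = []
    if not env.get("GLADIA_API_KEY"):
        problems.append("GLADIA_API_KEY is not set")
    callback = options.get("callback_url")  # assumed top-level option
    if callback:
        host = urlparse(callback).hostname or ""
        if host in ("localhost", "127.0.0.1", "::1"):
            problems.append("callback_url points at localhost")
    return problems
```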