Coze Asr

Data & APIs

Automatic Speech Recognition (ASR) using Coze API. Use when you need to transcribe audio files to text. Supports Chinese audio transcription via Coze's speech-to-text API.

Install

openclaw skills install coze-asr

Coze Automatic Speech Recognition (ASR)

Transcribe audio files to text using Coze API.

Setup

1. Get your API Key: Get a key from Coze Platform

2. Set it in your environment:

export COZE_API_KEY="your-key-here"

Supported Audio Formats

  • MP3 - Recommended
  • WAV - Supported
  • OGG - Supported (包括 opus 编码)

Note: Coze API 原生支持 mp3、wav、ogg 格式,无需转换。

Usage

Basic Transcription

Transcribe an audio file:

bash scripts/speech_to_text.sh recording.mp3

Full Options

bash scripts/speech_to_text.sh <audio_file> [language]

Parameters:

  • audio_file (required): Path to audio file
  • language (optional): Language code (default: zh)

Output Format

The script outputs JSON with transcribed text.

Example output:

{
  "text": "你好,这是转录的文本内容"
}

Troubleshooting

File Size Issues:

  • Check Coze API documentation for file size limits
  • Reduce sample rate or bit depth if needed

Poor Accuracy:

  • Improve audio quality
  • Ensure clear speech and minimal noise
  • Use appropriate language code

Format Issues:

  • Ensure file is not corrupted
  • Verify audio can be played by standard players