Install
openclaw skills install aliyun-qwen-asr

Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

Category: provider
mkdir -p output/aliyun-qwen-asr
python -m py_compile skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py && echo "py_compile_ok" > output/aliyun-qwen-asr/validate.txt
Pass criteria: command exits 0 and output/aliyun-qwen-asr/validate.txt is generated.
Output directory: output/aliyun-qwen-asr/

Use Qwen ASR for recorded audio transcription (non-realtime), including short audio sync calls and long audio async jobs.
Use one of these exact model strings:
qwen3-asr-flash
qwen3-asr-flash-2026-02-10
qwen-audio-asr
qwen3-asr-flash-filetrans
qwen3-asr-flash-filetrans-2025-11-17

Selection guidance:
qwen3-asr-flash, qwen3-asr-flash-2026-02-10, or qwen-audio-asr for short/normal recordings (sync).
qwen3-asr-flash-filetrans or qwen3-asr-flash-filetrans-2025-11-17 for long-file transcription (async task workflow).

Create and activate a virtual environment:
python3 -m venv .venv
. .venv/bin/activate
Set DASHSCOPE_API_KEY in the environment, or add dashscope_api_key to ~/.alibabacloud/credentials.

Request fields:
audio (string, required): public URL or local file path.
model (string, optional): default qwen3-asr-flash.
language_hints (array<string>, optional): e.g. zh, en.
sample_rate (number, optional)
vocabulary_id (string, optional)
disfluency_removal_enabled (bool, optional)
timestamp_granularities (array<string>, optional): e.g. sentence.
async (bool, optional): default false for sync models, true for qwen3-asr-flash-filetrans.

Response fields:
text (string): normalized transcript text.
task_id (string, optional): present for async submission.
status (string): SUCCEEDED or submission status.
raw (object): original API response.

Sync transcription (OpenAI-compatible protocol):
curl -sS --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
  "model": "qwen3-asr-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_audio",
          "input_audio": {
            "data": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
          }
        }
      ]
    }
  ],
  "stream": false,
  "asr_options": {
    "enable_itn": false
  }
}'
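
The same sync request can be assembled in Python with only the standard library. This is a minimal sketch, not the bundled script's implementation: the helper names `build_sync_payload` and `build_sync_request` are hypothetical, and the endpoint and body fields are copied from the curl example above.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
CHAT_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"


def build_sync_payload(audio, model="qwen3-asr-flash", enable_itn=False):
    """Return the JSON body used by the sync curl example (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "input_audio", "input_audio": {"data": audio}}
                ],
            }
        ],
        "stream": False,
        "asr_options": {"enable_itn": enable_itn},
    }


def build_sync_request(audio, api_key, **kwargs):
    """Wrap the payload in a POST request; nothing is sent until urlopen is called."""
    body = json.dumps(build_sync_payload(audio, **kwargs)).encode("utf-8")
    return urllib.request.Request(
        CHAT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. The response layout is not described in this section, so the sketch leaves parsing to the caller.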
Async long-file transcription (DashScope protocol):
curl -sS --location 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'X-DashScope-Async: enable' \
--header 'Content-Type: application/json' \
--data '{
  "model": "qwen3-asr-flash-filetrans",
  "input": {
    "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
  }
}'
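
A Python equivalent of the async submission might look like the sketch below. The helper names are hypothetical, and `extract_task_id` assumes the submission response carries the task id at `output.task_id`, which this section does not confirm; check the `raw` response before relying on it.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
TRANSCRIPTION_URL = (
    "https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription"
)


def build_async_request(file_url, api_key, model="qwen3-asr-flash-filetrans"):
    """Build the async submission request (hypothetical helper, nothing is sent)."""
    body = json.dumps({"model": model, "input": {"file_url": file_url}})
    return urllib.request.Request(
        TRANSCRIPTION_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Marks the call as an async task submission, as in the curl example.
            "X-DashScope-Async": "enable",
            "Content-Type": "application/json",
        },
    )


def extract_task_id(response):
    """Assumption: the parsed submission response holds the id at output.task_id."""
    return (response.get("output") or {}).get("task_id")
```

Keeping request construction separate from sending makes the body easy to inspect before any quota is spent.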
Poll task result:
curl -sS --location "https://dashscope.aliyuncs.com/api/v1/tasks/<task_id>" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"
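
Polling is usually wrapped in a loop with an interval and a timeout. The sketch below is one way to do that, under stated assumptions: the status is read from `output.task_status`, and `FAILED` and `CANCELED` are treated as terminal alongside `SUCCEEDED` (only `SUCCEEDED` is named in this document). `poll_task` and `build_task_request` are hypothetical names; the fetch step is injected so the loop can be exercised without network access.

```python
import time
import urllib.request

# Base URL copied from the curl example above.
TASKS_URL = "https://dashscope.aliyuncs.com/api/v1/tasks"

# Assumption: only SUCCEEDED appears in this document; the others are guesses.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED"}


def build_task_request(task_id, api_key):
    """Build the GET request for one task lookup (nothing is sent here)."""
    return urllib.request.Request(
        f"{TASKS_URL}/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def poll_task(fetch_task, task_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_task(task_id) until a terminal status or the timeout is reached.

    fetch_task must return the parsed JSON response as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        # Assumed location of the status field in the task response.
        status = (task.get("output") or {}).get("task_status")
        if status in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

A real `fetch_task` would send `build_task_request(task_id, api_key)` with `urllib.request.urlopen` and decode the JSON body.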
Use the bundled script for URL/local-file input and optional async polling:
python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
--audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
--model qwen3-asr-flash \
--language-hints zh,en \
--print-response
Long-file mode:
python skills/ai/audio/aliyun-qwen-asr/scripts/transcribe_audio.py \
--audio "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3" \
--model qwen3-asr-flash-filetrans \
--async \
--wait
Use input_audio.data (data URI) when a direct URL is unavailable.
Keep language_hints minimal to reduce recognition ambiguity.

Transcript output directory: output/aliyun-qwen-asr/transcripts/
Related variable: OUTPUT_DIR

References:
references/api_reference.md
references/sources.md

Related skill: skills/ai/audio/aliyun-qwen-tts-realtime/
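
For the data URI tip on input_audio.data, a local file can be encoded roughly as follows. This is a sketch under assumptions: `audio_to_data_uri` is a hypothetical helper, the MIME type is guessed from the file extension, and whether the service accepts every guessed type or file size is not stated in this document.

```python
import base64
import mimetypes
from pathlib import Path


def audio_to_data_uri(path):
    """Encode a local audio file as a base64 data URI (hypothetical helper)."""
    path = Path(path)
    # Guess the MIME type from the extension; fall back to a generic type.
    mime, _ = mimetypes.guess_type(path.name)
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The resulting string would take the place of the URL in the input_audio.data field of the sync request.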