Install
openclaw skills install doubao-ata-subtitle
Generate subtitles with automatic time alignment using Volcengine's ATA (Automatic Time Alignment) API. Use when the user wants to: (1) add time-aligned subtitles to videos, (2) convert text + audio to SRT/ASS format, or (3) automate a subtitle creation workflow.
Set the following environment variables or create a config file:
export VOLC_ATA_APP_ID="your-app-id"
export VOLC_ATA_TOKEN="your-access-token"
export VOLC_ATA_API_BASE="https://openspeech.bytedance.com"
Create ~/.volcengine_ata.conf:
[credentials]
appid = your-app-id
access_token = your-access-token
secret_key = your-secret-key
[api]
base_url = https://openspeech.bytedance.com
submit_path = /api/v1/vc/ata/submit
query_path = /api/v1/vc/ata/query
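The two configuration routes above can be combined in a small loader. The following is a minimal sketch, not the tool's actual implementation: the function name load_ata_settings is hypothetical, and the rule that environment variables take precedence over the config file is an assumption. The variable names, section names, keys, and default paths are taken from the listing above.

```python
import configparser
import os
from pathlib import Path

DEFAULT_BASE = "https://openspeech.bytedance.com"


def load_ata_settings(conf_path=None, env=None):
    """Return ATA settings as a dict (hypothetical helper).

    Assumption: environment variables override values from the config file.
    """
    env = os.environ if env is None else env
    path = Path(conf_path) if conf_path else Path.home() / ".volcengine_ata.conf"

    # interpolation=None so that a '%' inside a token is read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is skipped without raising

    def pick(env_key, section, option, default=None):
        if env.get(env_key):
            return env[env_key]
        return parser.get(section, option, fallback=default)

    return {
        "appid": pick("VOLC_ATA_APP_ID", "credentials", "appid"),
        "access_token": pick("VOLC_ATA_TOKEN", "credentials", "access_token"),
        "base_url": pick("VOLC_ATA_API_BASE", "api", "base_url", DEFAULT_BASE),
        "submit_path": parser.get("api", "submit_path", fallback="/api/v1/vc/ata/submit"),
        "query_path": parser.get("api", "query_path", fallback="/api/v1/vc/ata/query"),
    }
```

Calling load_ata_settings() with no arguments reads ~/.volcengine_ata.conf and the current process environment; values that are set in neither place come back as None.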
A Python CLI tool is provided at ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py.
# Basic usage: audio + text → SRT subtitle
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
--audio storage/audio.wav \
--text storage/subtitle.txt \
--output storage/subtitles/final.srt
# Specify output format (srt or ass)
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
--audio storage/audio.wav \
--text storage/subtitle.txt \
--output storage/subtitles/final.ass \
--format ass
Audio input: WAV (16000 Hz, mono, pcm_s16le)
Extract from video:
ffmpeg -i input.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav
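Example text input (the file passed to --text):
The owner's alarm didn't go off and he overslept
The two of us took turns nudging his face with our noses
He thought it was an earthquake and ran off hugging his pillow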
1
00:00:00,000 --> 00:00:02,500
First subtitle line

2
00:00:02,500 --> 00:00:05,000
Second subtitle line
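SRT cues like the ones above follow a fixed layout: a 1-based index, a line with start and end times in HH:MM:SS,mmm, the text, and a blank line between cues. The sketch below shows one way to produce that layout from aligned timings. The helper names and the (start, end, text) tuple shape are illustrative assumptions, not the output format of the ATA API or the internals of volc_ata.py.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm (SRT uses a comma before the milliseconds)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render an iterable of (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Joining with a newline leaves exactly one blank line between cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 2.5, "First subtitle line"), (2.5, 5.0, "Second subtitle line")]))
```

Working in integer milliseconds before splitting into fields avoids rounding artifacts such as a 1000 ms component.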
[Script Info]
Title: ATA Subtitles
ScriptType: v4.00+
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.00,0:00:02.50,Default,,0,0,0,,First subtitle line
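ASS timestamps differ from SRT: they use a single-digit hour field, a period as the separator, and centiseconds rather than milliseconds (H:MM:SS.cc). The following sketch builds a Dialogue line matching the Format: line shown above. The function names are hypothetical and this is not necessarily how volc_ata.py writes its output.

```python
def ass_timestamp(seconds):
    """Format seconds as H:MM:SS.cc (centisecond precision, as used by ASS)."""
    total_cs = int(round(seconds * 100))
    hours, rem = divmod(total_cs, 360_000)
    minutes, rem = divmod(rem, 6_000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def ass_dialogue(start, end, text, style="Default"):
    """Build one Dialogue line.

    Field order: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text.
    """
    return (
        f"Dialogue: 0,{ass_timestamp(start)},{ass_timestamp(end)},"
        f"{style},,0,0,0,,{text}"
    )


if __name__ == "__main__":
    print(ass_dialogue(0.0, 2.5, "First subtitle line"))
```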
Error: Invalid sample rate, expected 16000Hz
Fix:
ffmpeg -i input.mp4 -ar 16000 -ac 1 audio.wav
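To catch this error before uploading, the WAV header can be inspected locally with Python's standard wave module. This is a minimal sketch: check_wav is a hypothetical helper, and the expected values (16000 Hz, 1 channel, 16-bit samples) are inferred from the ffmpeg flags in this document rather than from an official API specification.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1, expected_width=2):
    """Return a list of problems found in the WAV header; an empty list means none."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()  # bytes per sample; 2 corresponds to 16-bit PCM
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"channel count is {channels}, expected {expected_channels}")
    if width != expected_width:
        problems.append(f"sample width is {width} bytes, expected {expected_width}")
    return problems
```

If check_wav("audio.wav") returns a non-empty list, re-encoding with the ffmpeg command above should resolve the rate and channel mismatches.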
Error: Authorization failed
Fix: Check the token format. The Authorization value should be Bearer; {token} (note the semicolon after Bearer).
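Because the semicolon is easy to miss, it can help to build the value in one place. The sketch below assumes the token is sent in an HTTP header named Authorization; build_auth_header is a hypothetical helper, not part of volc_ata.py.

```python
def build_auth_header(token):
    """Return a headers dict using the 'Bearer; {token}' form described above."""
    token = token.strip()
    if not token:
        raise ValueError("access token is empty")
    # Note the semicolon after 'Bearer', unlike the usual 'Bearer <token>' form.
    return {"Authorization": f"Bearer; {token}"}
```

For example, build_auth_header("your-access-token") yields a dict whose single value is "Bearer; your-access-token".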