Douyin Transcriber

Transcribe speech from audio or video files, automatically extracting audio and converting to text using Docker Whisper ASR for Douyin/TikTok media.

Audits

Pass

ClawScanPass

Agentic behavior and permission review.

Static analysisPass

Pattern checks against bundled files.

VirusTotalPass

Multi-engine malware detections and file reputation.

Install

openclaw skills install douyin-transcriber

Douyin Transcriber

Transcribe audio/video files to text using local Docker Whisper ASR.

Quick Start

curl -X POST "http://localhost:PORT/asr" -F "audio_file=@/path/to/video.mp4"

The container has built-in ffmpeg for automatic audio extraction.

Prerequisites

Tool	Purpose	Install
Docker	Whisper ASR	Docker Desktop
ffmpeg	Audio extraction	`winget install Gyan.FFmpeg`

Deploy Whisper ASR:

docker run -d -p PORT:PORT -e ASR_MODEL=small -e ASR_ENGINE=faster_whisper --name whisper-asr onerahmet/openai-whisper-asr-webservice:latest

Workflow

Step 1: Extract Audio from Video

ffmpeg -i video.mp4 -ar 16000 -ac 1 -c:a pcm_s16le audio.wav -y

Parameters:

-ar 16000: 16kHz sample rate
-ac 1: Mono channel
-c:a pcm_s16le: 16-bit PCM

Step 2: Transcribe

curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav"

Optional: specify language

curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav" -F "language=zh"

Step 3: Parse Result

Response format:

{
  "text": "Transcribed content...",
  "segments": [
    {"start": 0.0, "end": 2.5, "text": "First sentence"},
    {"start": 2.5, "end": 5.0, "text": "Second sentence"}
  ],
  "language": "zh"
}

Model Selection

Model	Size	5-min video	Accuracy
tiny	75MB	~30s	Fair
base	142MB	~1min	Good
small	466MB	~3min	Better (recommended)
medium	1.5GB	~8min	Best

Change model via environment variable: -e ASR_MODEL=medium

Supported Formats

Video: mp4, mkv, avi, mov, flv, wmv, webm, m4v

Audio: wav, m4a, mp3, aac, ogg, flac, wma, opus

Troubleshooting

Issue	Solution
Docker not available	Install Docker Desktop
Container start fails	Check port availability
Transcription timeout	Use smaller model or split audio
ffmpeg not found	`winget install Gyan.FFmpeg`

Related Modules

douyin-fetcher - Video download
douyin-analyzer - Content analysis
douyin-orchestrator - Workflow coordination