Install
openclaw skills install video-to-text-2
Convert video URLs or local files to text transcripts. Videos are downloaded from Bilibili with bilibili-api and from other sites with yt-dlp, and the audio is then transcribed with faster-whisper.
python3 scripts/video_to_text.py <video_url_or_local_file> [options]
| Argument | Description | Default |
|---|---|---|
| url | Video URL or local file path (required) | - |
| -m, --model | Whisper model size | base |
| -l, --language | Specify language code | Auto-detect |
| -o, --output | Output file path | Print to terminal |
| --keep-files | Keep downloaded audio/video files | No |
| --sessdata | Bilibili SESSDATA | From config |
| --bili-jct | Bilibili bili_jct | From config |
| --buvid3 | Bilibili buvid3 | From config |
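
The argument table above can be mirrored in a small `argparse` parser. This is a minimal sketch rather than the script's actual code: the function name `build_parser` is hypothetical, and using `None` to represent "auto-detect", "print to terminal", and "from config" is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the argument table; not the script's own code."""
    parser = argparse.ArgumentParser(prog="video_to_text.py")
    parser.add_argument("url", help="Video URL or local file path")
    parser.add_argument("-m", "--model", default="base", help="Whisper model size")
    # None stands for automatic language detection (assumption).
    parser.add_argument("-l", "--language", default=None, help="Language code")
    # None stands for printing the transcript to the terminal (assumption).
    parser.add_argument("-o", "--output", default=None, help="Output file path")
    parser.add_argument("--keep-files", action="store_true",
                        help="Keep downloaded audio/video files")
    # None means "fall back to the credentials configured in the script".
    parser.add_argument("--sessdata", default=None, help="Bilibili SESSDATA")
    parser.add_argument("--bili-jct", default=None, help="Bilibili bili_jct")
    parser.add_argument("--buvid3", default=None, help="Bilibili buvid3")
    return parser


args = build_parser().parse_args(["/path/to/video.mp4", "-m", "small"])
print(args.model, args.language, args.keep_files)
```

Note that `argparse` exposes `--keep-files` and `--bili-jct` as the attributes `keep_files` and `bili_jct`.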

Whisper model sizes for -m, --model:

| Model | Size | Speed | Accuracy |
|---|---|---|---|
| tiny | ~75MB | Fastest | Lowest |
| base | ~150MB | Fast | Basic |
| small | ~500MB | Medium | Good |
| medium | ~1.5GB | Slow | Very Good |
| large | ~3GB | Slowest | Best |
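
As a rough illustration of how a model size from the table might be used, the sketch below loads a faster-whisper model and joins the resulting segments into a transcript. The helper names `transcribe` and `segments_to_text` are hypothetical, and running on CPU with `int8` quantisation is an assumption made for portability, not something the script is known to do.

```python
from typing import Iterable, Optional


def segments_to_text(segments: Iterable) -> str:
    """Join the .text of each segment into a transcript, one segment per line."""
    return "\n".join(seg.text.strip() for seg in segments)


def transcribe(audio_path: str, model_size: str = "base",
               language: Optional[str] = None) -> str:
    """Transcribe an audio or video file with faster-whisper (hypothetical helper)."""
    # Imported lazily so this module loads even when faster-whisper is absent.
    from faster_whisper import WhisperModel

    # device and compute_type are assumptions; adjust them for GPU use.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # language=None lets the model detect the spoken language itself.
    segments, _info = model.transcribe(audio_path, language=language)
    return segments_to_text(segments)
```

A call such as `transcribe("/path/to/video.mp4", model_size="small", language="zh")` would download the chosen model on first use, so larger sizes cost more disk space and time.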
# Bilibili video (requires auth)
python3 scripts/video_to_text.py "https://www.bilibili.com/video/BVxxx"
# Specify Chinese language
python3 scripts/video_to_text.py "https://www.bilibili.com/video/BVxxx" -l zh
# Local file
python3 scripts/video_to_text.py "/path/to/video.mp4" -m small
# Save to file
python3 scripts/video_to_text.py "https://www.bilibili.com/video/BVxxx" -o result.txt
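
The examples above cover three kinds of input: Bilibili URLs, URLs from other sites, and local files. One way the routing between bilibili-api, yt-dlp, and direct transcription could be decided is sketched below. The function `classify_source` and its return labels are hypothetical, and treating `b23.tv` as a Bilibili host is an assumption.

```python
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return 'bilibili', 'other_url', or 'local_file' for an input string."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        # b23.tv is handled as a Bilibili short-link host (assumption).
        if host == "bilibili.com" or host.endswith(".bilibili.com") or host == "b23.tv":
            return "bilibili"
        return "other_url"
    # Anything that is not an http(s) URL is treated as a local path.
    return "local_file"


for src in ("https://www.bilibili.com/video/BVxxx", "/path/to/video.mp4"):
    print(src, "->", classify_source(src))
```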
Edit the BILIBILI_CREDENTIALS dict in the script:
BILIBILI_CREDENTIALS = {
    "sessdata": "your_sessdata",
    "bili_jct": "your_bili_jct",
    "buvid3": "your_buvid3"
}
Alternatively, pass the credentials on the command line:
python3 scripts/video_to_text.py "https://www.bilibili.com/video/BVxxx" \
    --sessdata "xxx" \
    --bili-jct "xxx" \
    --buvid3 "xxx"
WARNING: These values are your login credentials. Do not share them with anyone.
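
Since the credential options default to "From config", command-line values presumably take precedence over the dict in the script. The sketch below shows one way such a merge could work; `resolve_credentials` is a hypothetical helper, and the precedence rule itself is an assumption.

```python
from typing import Dict, Optional

# Placeholder values, matching the configuration dict described above.
BILIBILI_CREDENTIALS = {
    "sessdata": "your_sessdata",
    "bili_jct": "your_bili_jct",
    "buvid3": "your_buvid3",
}


def resolve_credentials(cli: Dict[str, Optional[str]],
                        config: Dict[str, str]) -> Dict[str, str]:
    """Prefer a non-empty command-line value, otherwise use the configured one."""
    return {key: cli.get(key) or config[key] for key in config}


creds = resolve_credentials({"sessdata": "from_cli", "bili_jct": None},
                            BILIBILI_CREDENTIALS)
# creds["sessdata"] comes from the command line; the other two keep
# their configured values.
```

With bilibili-api-python, a dict like this could then be passed on as `Credential(**creds)`, assuming the installed version accepts these three keyword arguments.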
# Install dependencies
pip3 install bilibili-api-python yt-dlp faster-whisper aiohttp requests
# Ensure ffmpeg is installed
# Ubuntu/Debian: sudo apt install ffmpeg
# CentOS: sudo yum install ffmpeg (may require enabling an additional repository such as RPM Fusion)
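
After installing, a quick check that everything is in place can save a failed run. The following is a minimal sketch, not part of the skill itself: the helper names are hypothetical, and the module names are the usual import names for the pip packages listed above.

```python
import importlib.util
import shutil
from typing import List

# Import names corresponding to the pip packages listed above.
REQUIRED_MODULES = ["bilibili_api", "yt_dlp", "faster_whisper", "aiohttp", "requests"]


def missing_modules(names: List[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_available() -> bool:
    """Report whether an ffmpeg executable is on the PATH."""
    return shutil.which("ffmpeg") is not None


print("Missing Python packages:", missing_modules(REQUIRED_MODULES) or "none")
print("ffmpeg found:", ffmpeg_available())
```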