Install
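openclaw skills install douyin-transcribe-lz

Extracts the audio from a Douyin short-video link and transcribes it to Chinese text with Whisper. Use this skill when the user provides a Douyin short link (v.douyin.com/xxx) and asks to extract, convert, or transcribe the video's speech to text. The login wall is bypassed by capturing the video element with Playwright.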
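Users often paste a short link surrounded by other share text, so it may help to isolate the link before opening it. The following is a minimal sketch, not part of the skill itself: the helper name `extract_short_link` and the assumption that short-link IDs contain only letters, digits, `_`, and `-` are illustrative.

```python
import re

# Assumption: the short-link ID uses only letters, digits, "_" and "-".
SHORT_LINK_RE = re.compile(r"(?:https?://)?v\.douyin\.com/[A-Za-z0-9_-]+/?")


def extract_short_link(shared_text):
    """Return the first v.douyin.com link found in shared_text, or None.

    Hypothetical helper: a bare "v.douyin.com/..." match is normalized
    by prepending "https://".
    """
    match = SHORT_LINK_RE.search(shared_text)
    if match is None:
        return None
    link = match.group(0)
    if not link.startswith(("http://", "https://")):
        link = "https://" + link
    return link
```

The returned string can then be passed as `short_url` to the extraction step.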
Preferred method: extract the URL from the video element's src (most reliable; bypasses the login wall):
import asyncio
from playwright.async_api import async_playwright


async def get_douyin_video_url(short_url):
    """Open the short link in headless Chromium and return the video source URL, or None."""
    video_src = None
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
        )
        page = await context.new_page()
        await page.goto(short_url, wait_until="domcontentloaded", timeout=30000)
        await asyncio.sleep(8)  # wait for JS to render and populate the video src
        # Prefer a Douyin .mp4 src on the <video> itself, then fall back to any <source> child
        video_src = await page.evaluate("""
            () => {
                const videos = document.querySelectorAll('video');
                for (const v of videos) {
                    if (v.src && v.src.includes('douyin') && v.src.includes('.mp4')) return v.src;
                    const sources = v.querySelectorAll('source');
                    for (const s of sources) { if (s.src) return s.src; }
                }
                return null;
            }
        """)
        await browser.close()
    return video_src
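Why this method rather than network interception: video.src is already populated even when the login modal covers the video element, whereas network interception fails behind the login wall.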
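The fixed 8-second wait in the function above is simple but always costs the full delay. One possible alternative is to poll the DOM query until it returns a value. The sketch below is an assumption, not the skill's own code: `poll_until` is a hypothetical helper, and the timeout and interval defaults are illustrative.

```python
import asyncio
import time


async def poll_until(probe, timeout=15.0, interval=0.5):
    """Await probe() repeatedly until it returns a truthy value.

    Hypothetical helper: returns the first truthy result, or None once
    `timeout` seconds have elapsed without one.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = await probe()
        if value:
            return value
        if time.monotonic() >= deadline:
            return None
        await asyncio.sleep(interval)
```

Inside `get_douyin_video_url`, the sleep and the single `page.evaluate(...)` call could then be replaced by something like `video_src = await poll_until(lambda: page.evaluate(js))`, where `js` holds the same DOM query string.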
import requests


def download_douyin_video(url, output_path, referer):
    """Stream the video at url to output_path, sending the given Referer header."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
        "Referer": referer,
    }
    resp = requests.get(url, headers=headers, stream=True, timeout=60)
    resp.raise_for_status()  # fail early instead of saving an HTTP error body
    with open(output_path, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1024 * 1024):  # 1 MiB chunks
            if chunk:
                f.write(chunk)
    return output_path
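- referer = "https://www.douyin.com/" or the resolved video page URL
- Video URLs on v26-web.douyinvod.com are valid for a limited time, on the order of hours

import os
import shutil

import imageio_ffmpeg
import whisper

# Make sure ffmpeg is on PATH
ffmpeg_exe = imageio_ffmpeg.get_ffmpeg_exe()
ffmpeg_dir = os.path.dirname(ffmpeg_exe)
os.environ["PATH"] = ffmpeg_dir + os.pathsep + os.environ.get("PATH", "")
# The bundled binary has a versioned file name; copy it under the plain
# name so that the "ffmpeg" command Whisper invokes can be found
ffmpeg_name = "ffmpeg.exe" if os.name == "nt" else "ffmpeg"
ffmpeg_copy = os.path.join(ffmpeg_dir, ffmpeg_name)
if not os.path.exists(ffmpeg_copy):
    shutil.copy(ffmpeg_exe, ffmpeg_copy)

model = whisper.load_model("medium")  # ~1.4 GB, cached after the first run
# video_path is the file saved by download_douyin_video
result = model.transcribe(video_path, language="zh", verbose=True, task="transcribe")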
Save the output:
- transcript.txt: full text plus timestamped segments (for the user to read)
- transcript.json: raw Whisper output (for programmatic use)

Install once:
pip install playwright openai-whisper imageio[ffmpeg] requests
playwright install chromium
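The transcript.txt and transcript.json outputs listed earlier could be written roughly as follows. This is a sketch under stated assumptions: `save_transcript` and `format_timestamp` are hypothetical helper names, the `mm:ss` timestamp layout is an illustrative choice, and `result` is assumed to be the dictionary returned by `model.transcribe`, with a `"text"` string and a `"segments"` list whose items carry `"start"`, `"end"`, and `"text"`.

```python
import json
from pathlib import Path


def format_timestamp(seconds):
    """Render a time offset in seconds as mm:ss (fractions are truncated)."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def save_transcript(result, out_dir):
    """Write transcript.txt and transcript.json into out_dir; return both paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Human-readable file: full text first, then one line per segment.
    lines = [result["text"].strip(), ""]
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    txt_path = out / "transcript.txt"
    txt_path.write_text("\n".join(lines) + "\n", encoding="utf-8")

    # Machine-readable file: the result dictionary as-is.
    # ensure_ascii=False keeps Chinese characters readable in the file.
    json_path = out / "transcript.json"
    json_path.write_text(
        json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return txt_path, json_path
```

Calling `save_transcript(result, "output")` after transcription would leave both files in an `output` directory.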
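- video.src is populated by JS before the login modal appears. Network interception misses it; a direct DOM query catches it.
- Video URLs (v26-web.douyinvod.com/...?a=...) are valid for about 24 hours. Download immediately after capture.
- The model used is medium; base is faster but less accurate. The first run downloads the model (~1.4 GB), which is cached afterwards.
- ffmpeg is provided automatically by imageio[ffmpeg]; get its path via imageio_ffmpeg.get_ffmpeg_exe().
- The fetch_douyin_video.py script offers alternative URL extraction.

- scripts/fetch_douyin_video.py: full end-to-end script (capture → download → transcribe)
- references/whisper_usage.md: Whisper API options and Chinese-language tips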
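Because the video URLs expire, a download attempted too late may save an error page rather than video data. A cheap sanity check before transcription is to look for the `ftyp` box marker that MP4 files normally carry at byte offset 4. This is a heuristic sketch, not a full container validation, and `looks_like_mp4` is a hypothetical helper name.

```python
def looks_like_mp4(path):
    """Heuristic: True if bytes 4..8 of the file are the MP4 'ftyp' marker.

    Assumption: the file starts with an 'ftyp' box, as MP4 files usually
    do. An HTML or JSON error body will not match.
    """
    with open(path, "rb") as f:
        header = f.read(12)
    return len(header) >= 8 and header[4:8] == b"ftyp"
```

If the check fails, capturing a fresh URL and downloading again is probably the simplest recovery.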