Install
openclaw skills install videolink-to-article
Use when the user provides a Bilibili or YouTube URL and asks for a transcript, article, or subtitle extraction. Downloads platform subtitles via BBDown / yt-dlp and produces a clean Markdown transcript in the source language.
Extract subtitles from a Bilibili or YouTube video URL and process them into a clean, structured Markdown transcript. Handles tool installation, subtitle download, metadata extraction, and a deterministic-then-interpretive cleanup pipeline that preserves the speaker's original wording.
Trigger on Bilibili (bilibili.com/video/BV..., b23.tv/...) or YouTube (youtube.com/watch?v=..., youtu.be/...) URLs combined with requests for transcripts, articles, subtitle extraction, or restructuring spoken content into Markdown. Do not trigger for: download-only requests, audio-only ASR (no subtitles available; use a Whisper-based skill), or non-{Bilibili,YouTube} platforms.
| URL pattern | Tool |
|---|---|
| bilibili.com/video/BV..., b23.tv/... | BBDown |
| youtube.com/watch?v=..., youtu.be/... | yt-dlp |
| Other | Decline |
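The routing table above can be sketched in Python as follows. This is a minimal illustration, not part of the skill's scripts: the function name `pick_tool` is hypothetical, and treating the `www.` and `m.` host prefixes as equivalent is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def pick_tool(url: str) -> Optional[str]:
    """Return "BBDown", "yt-dlp", or None (decline) for a video URL."""
    # Accept URLs pasted without a scheme, e.g. "b23.tv/abc".
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").lower()
    # Assumption: "www." and "m." prefixes point at the same platform.
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
            break

    if host == "b23.tv" and len(parsed.path) > 1:
        return "BBDown"
    if host == "bilibili.com" and parsed.path.startswith("/video/BV"):
        return "BBDown"
    if host == "youtu.be" and len(parsed.path) > 1:
        return "yt-dlp"
    if host == "youtube.com" and parsed.path == "/watch" and "v" in parse_qs(parsed.query):
        return "yt-dlp"
    return None  # Any other platform or URL shape: decline.
```

Returning `None` rather than raising keeps the "Decline" row an ordinary outcome the caller can report to the user.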
| Placeholder | Purpose | Lifetime |
|---|---|---|
| <TOOLS_DIR> | Persistent binary cache (BBDown, yt-dlp, helper scripts). | Cross-session. |
| <WORK_DIR> | Per-task outputs (SRT, metadata.json, transcript). | Single task. |
Choosing <TOOLS_DIR> (apply in order, stop at first match):
1. An existing tool directory convention (e.g. binaries/, bin/, tools/ under per-user data), used as <that-dir>/videolink-to-article/.
2. Otherwise the OS default. Windows: ~/bin/videolink-to-article/; macOS: ~/Library/Application Support/videolink-to-article/bin/; Linux: ~/.local/bin/videolink-to-article/.

Register <TOOLS_DIR> on the user's persistent PATH (see references/install_tools.md § Persisting tools to PATH). Without this, every new session re-runs the install probe.

<WORK_DIR> is per-task. Reasonable picks: workspace-relative output/ or transcripts/, or a task-named subfolder under temp. Safe to clean up after delivery.
For Bilibili check BBDown.exe; for YouTube check yt-dlp.exe. Probe order: system PATH → <TOOLS_DIR>/<exe> → any path the user has explicitly mentioned. If missing, install per references/install_tools.md (use the standalone yt-dlp.exe, not a venv shim) and end the install with PATH registration.
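The probe order above (system PATH, then <TOOLS_DIR>, then user-mentioned paths) might look like this in Python. It is a sketch under assumptions: `find_tool` and its parameters are hypothetical names, and the skill's actual probe logic may differ.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional


def find_tool(exe_name: str, tools_dir: Path, user_paths: Iterable[Path] = ()) -> Optional[Path]:
    """Locate an executable using the documented probe order."""
    # 1. System PATH.
    on_path = shutil.which(exe_name)
    if on_path:
        return Path(on_path)
    # 2. <TOOLS_DIR>/<exe>.
    candidate = Path(tools_dir) / exe_name
    if candidate.is_file():
        return candidate
    # 3. Any path the user has explicitly mentioned.
    for path in user_paths:
        if Path(path).is_file():
            return Path(path)
    return None  # Missing: fall through to the install procedure.
```

A `None` result is the signal to install per references/install_tools.md.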
BBDown does a startup check for ffmpeg. If ffmpeg is missing, pass --ffmpeg-path "<any-existing-exe>" (e.g. <TOOLS_DIR>/yt-dlp.exe). Subtitle-only mode never actually invokes ffmpeg, but BBDown refuses to start without the path resolving.
YouTube from Mainland China usually requires --proxy http://127.0.0.1:<port>.
python scripts/fetch_metadata.py "<URL>" --bbdown "<bbdown.exe>" --ytdlp "<yt-dlp.exe>" --ffmpeg-path "<any-existing-exe>"
Save stdout JSON to <WORK_DIR>/<video_id>/metadata.json. Fields: platform, video_id, title, uploader, uploader_url, duration_seconds, publish_date, url. Notes:
- uploader may be null (BBDown's --only-show-info doesn't print the UP name); extract it from the title or ask the user.
- Exit code 3 = title parse failed → the upstream tool's output format changed.

Use <video_id> to name the per-task subdirectory under <WORK_DIR>.
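A consumer of metadata.json could check the field list before using it. The sketch below is illustrative only: `load_metadata` and `needs_uploader_followup` are hypothetical helpers, and it assumes every listed field is present as a key even when its value is null.

```python
import json

# Field names as listed for metadata.json.
REQUIRED_FIELDS = (
    "platform", "video_id", "title", "uploader",
    "uploader_url", "duration_seconds", "publish_date", "url",
)


def load_metadata(raw_json: str) -> dict:
    """Parse fetch_metadata.py output and fail loudly on missing keys."""
    data = json.loads(raw_json)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"metadata.json is missing fields: {', '.join(missing)}")
    return data


def needs_uploader_followup(meta: dict) -> bool:
    """True when uploader is null and must come from the title or the user."""
    return meta.get("uploader") is None
```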
Always list before downloading.
Bilibili:
BBDown.exe "<URL>" --only-show-info --sub-only --skip-ai false `
--ffmpeg-path "<any-exe>" *> bbdown_info.log 2>&1
Get-Content bbdown_info.log -Encoding UTF8
⚠️ --skip-ai false is required every time: BBDown's default silently skips AI subtitles. The double negative is intentional; verify the argument is present.
If the log shows "需要登录", run BBDown.exe login first (see references/auth.md).
YouTube:
yt-dlp --list-subs --skip-download "<URL>"
If the command fails with an authentication error (e.g. "Sign in to confirm you're not a bot", "Sign in to confirm your age", HTTP 403), follow this auth resolution flow in order, stopping at the first success:
1. Try --cookies-from-browser (fastest if it works):
   yt-dlp --cookies-from-browser edge --list-subs --skip-download "<URL>"
   If this fails with Could not copy Chrome cookie database → close all browser instances and retry. If it fails with Failed to decrypt with DPAPI (App-Bound Encryption on Chromium ≥ v127), skip to step 2; this error is unrecoverable.
2. Ask the user for a cookies.txt file. This is the most reliable method. Present the following guidance to the user:
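YouTube requires a logged-in session to fetch subtitles, but reading the browser cookies automatically failed.
Please export a cookies.txt file as follows:
- Install the Chrome extension Get cookies.txt LOCALLY (open source, runs locally, no telemetry).
- Open the target YouTube video page in the browser (make sure you are logged in).
- Click the extension icon → Export and save cookies.txt locally.
- Tell me the path to the cookies.txt file.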
Once the user provides the file path, pass it to yt-dlp:
yt-dlp --cookies "<cookies.txt_path>" --list-subs --skip-download "<URL>"
Save the cookies path for reuse in Step 4.
Three caption kinds, ranked by quality:
| Entry | Meaning | Quality |
|---|---|---|
| Available subtitles | Human-uploaded | Highest |
| Available automatic captions → <lang>-orig | Source-language ASR | High (no translation step) |
| Available automatic captions → <lang> | Auto-translation of source ASR | Lower (machine translation) |
Picking rule. Prefer human captions; otherwise prefer <lang>-orig; only fall back to translated auto-captions when neither exists. Trap: if you see e.g. zh-Hans and en-orig but no zh-Hans-orig, the speaker is not speaking Chinese — zh-Hans is a translation of the English ASR. Always download <lang>-orig and work from it; do not pre-emptively prefer a translated track because it matches the user's interface language.
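The picking rule can be expressed as a small function over the language codes reported by --list-subs. This is a sketch, not the skill's code: `pick_caption_track` is a hypothetical name, the inputs are assumed to be already-parsed lists of codes, and preferring the human track that matches the `-orig` language is an added assumption for videos with several human tracks.

```python
from typing import List, Optional, Tuple


def pick_caption_track(manual: List[str], automatic: List[str]) -> Optional[Tuple[str, str]]:
    """Return (kind, language_code) following the documented preference order."""
    orig = [code for code in automatic if code.endswith("-orig")]
    # The "-orig" track reveals the spoken language, e.g. "en-orig" -> "en".
    source_lang = orig[0][: -len("-orig")] if orig else None

    if manual:
        # Assumption: among human tracks, prefer the one in the spoken language.
        if source_lang in manual:
            return ("manual", source_lang)
        return ("manual", manual[0])
    if orig:
        return ("auto-orig", orig[0])
    if automatic:
        # Last resort: a machine translation of the source ASR.
        return ("auto-translated", automatic[0])
    return None
```

For the trap described above, `pick_caption_track([], ["zh-Hans", "en-orig", "en"])` selects `en-orig`, not `zh-Hans`.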
Layout: <WORK_DIR>/<video_id>/{metadata.json, *.srt, sentences.txt}.
Bilibili:
BBDown.exe "<URL>" --sub-only --skip-ai false `
--work-dir "<WORK_DIR>/<video_id>" --ffmpeg-path "<any-exe>"
Output: <videoTitle>.<lang>.srt (e.g. *.ai-zh.srt, *.ai-en.srt).
YouTube:
yt-dlp --write-subs --write-auto-subs \
--sub-langs "<source-lang>-orig,<source-lang>" \
--convert-subs srt --skip-download --ignore-no-formats-error \
-o "<WORK_DIR>/<video_id>/%(id)s.%(ext)s" "<URL>"
If Step 3 resolved auth via --cookies-from-browser, add the same flag here. If a cookies.txt file was provided, add --cookies "<cookies.txt_path>". Example with cookies:
yt-dlp --cookies "<cookies.txt_path>" \
--write-subs --write-auto-subs \
--sub-langs "<source-lang>-orig,<source-lang>" \
--convert-subs srt --skip-download --ignore-no-formats-error \
-o "<WORK_DIR>/<video_id>/%(id)s.%(ext)s" "<URL>"
Replace <source-lang> with the actual language code from Step 3 — do not hardcode a default. The -o template uses %(id)s (not %(title)s) on purpose: titles vary in punctuation and special characters, and yt-dlp's title sanitization differs from BBDown's; using id keeps file naming uniform across both platforms, and the human-readable title prefix is added later by Step 8 (rename_with_title.py). --ignore-no-formats-error is required: without it, yt-dlp aborts on Requested format is not available before writing queued subtitle files (common on bot-flagged sessions). Add --proxy http://127.0.0.1:<port> when blocked.
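When the download is driven from a script, the command can be assembled as an argument list so the optional auth and proxy flags are added consistently. A minimal sketch: `build_subtitle_command` and its parameters are hypothetical, and it uses only the yt-dlp flags already shown above.

```python
from typing import List, Optional


def build_subtitle_command(
    url: str,
    source_lang: str,
    out_dir: str,
    cookies_file: Optional[str] = None,
    cookies_browser: Optional[str] = None,
    proxy: Optional[str] = None,
) -> List[str]:
    """Build the yt-dlp subtitle-only command as an argv list."""
    cmd = ["yt-dlp"]
    # Reuse whichever auth method succeeded when listing subtitles.
    if cookies_file:
        cmd += ["--cookies", cookies_file]
    elif cookies_browser:
        cmd += ["--cookies-from-browser", cookies_browser]
    if proxy:
        cmd += ["--proxy", proxy]
    cmd += [
        "--write-subs", "--write-auto-subs",
        "--sub-langs", f"{source_lang}-orig,{source_lang}",
        "--convert-subs", "srt",
        "--skip-download", "--ignore-no-formats-error",
        # %(id)s keeps file names uniform across platforms.
        "-o", f"{out_dir}/%(id)s.%(ext)s",
        url,
    ]
    return cmd
```

The list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting differences between PowerShell and POSIX shells.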
Fallback for srv3. If yt-dlp writes *.srv3 instead of *.srt (its converter pipeline can't always handle srv3 without ffmpeg), run python scripts/srv3_to_text.py <input.srv3> <out_basename> to produce .srt and .txt directly. Stdlib-only.
For auth-required videos (member-only, age-restricted, "Sign in to confirm you're not a bot"), see references/auth.md.
Get-Content "<file>.srt" -Encoding UTF8 -TotalCount 30
| Symptom | Cause | Action |
|---|---|---|
| 0-byte file | Lang not actually available | Retry Step 3/4 with another lang code |
| Timestamps all 00:00:00,000 | Corrupt subtitle | Re-download |
| Garbled / mojibake | Wrong encoding | Re-read with UTF-8 |
| Body is [音乐] / [Music] only | No real captions | Stop; tell the user |
| Segment count ≪ video length | Truncated | Retry |
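Several rows of this table can be checked mechanically before reading the file by eye. The sketch below is an assumption-laden illustration: `diagnose_srt` is a hypothetical helper, the Unicode replacement character is used as a rough mojibake signal, and the segment-count row is left out because it needs the video duration.

```python
import re
from typing import List

TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")
MUSIC_MARKERS = {"[音乐]", "[music]"}


def diagnose_srt(text: str) -> List[str]:
    """Return a list of problems found in SRT text; empty means it looks sane."""
    if not text.strip():
        return ["empty file"]
    stamps = TIMESTAMP.findall(text)
    if not stamps:
        return ["no parseable timestamps"]

    problems = []
    if all(start == end == "00:00:00,000" for start, end in stamps):
        problems.append("all timestamps zero")
    # Caption body = lines that are neither cue indices nor timestamp lines.
    body = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().isdigit() and not TIMESTAMP.search(line)
    ]
    if body and all(line.lower() in MUSIC_MARKERS for line in body):
        problems.append("music-only")
    if "\ufffd" in text:
        problems.append("possible mojibake")
    return problems
```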
python scripts/srt_to_sentences.py <input.srt> <WORK_DIR>/<video_id>/sentences.txt [--target-len 60] [--max-len 120]
Defaults work for typical conversational video. Dense expository → raise --max-len (e.g. 180). Staccato delivery → lower --target-len (e.g. 40). Exit 2 = empty / no parseable segments → SRT is invalid; revisit Step 5.
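To see how the two length parameters interact, here is a greedy merge written for illustration. It is not the algorithm in srt_to_sentences.py, which is not shown here: `merge_segments`, the punctuation set, and the space-joining of segments are all assumptions (joining with a space suits English better than Chinese).

```python
from typing import Iterable, List

SENTENCE_END = ("。", "！", "？", ".", "!", "?")


def merge_segments(segments: Iterable[str], target_len: int = 60, max_len: int = 120) -> List[str]:
    """Greedily merge caption segments into sentence-like lines."""
    lines: List[str] = []
    current = ""
    for seg in segments:
        seg = seg.strip()
        if not seg:
            continue
        # Hard cap: start a new line rather than exceed max_len.
        if current and len(current) + 1 + len(seg) > max_len:
            lines.append(current)
            current = ""
        current = f"{current} {seg}" if current else seg
        # Soft target: break once long enough and at a sentence boundary.
        if len(current) >= target_len and current.endswith(SENTENCE_END):
            lines.append(current)
            current = ""
    if current:
        lines.append(current)
    return lines
```

Under this sketch, raising `max_len` lets long expository sentences stay on one line, while lowering `target_len` makes short, punctuated utterances break sooner.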
The transcript stays in the source language of the video. English video → English transcript; Japanese → Japanese; Mandarin → Mandarin. Do not silently translate to the user's interface language, the agent's response language, or any other locale.
Translate only when the user explicitly asks (e.g. "整理成中文文稿", meaning "turn this into a Chinese-language transcript", or "translate to English"). Even then, produce the source-language version first as the primary deliverable; offer translation as an additional output.
This rule overrides any default-language hint injected into the agent's runtime context.
- Follow the cleanup methodology in references/cleanup_guide.md.
- Add ## / ### headings only at obvious topic transitions the speaker explicitly signals ("第一个是…", "Let's talk about…", "Moving on to…"). Headings stay in the source language, short, descriptive. No structural signals → no headings. Output continuous paragraphs.
- Use metadata.json from Step 2 to populate the header. Header label set follows the source language (Chinese labels for Chinese videos, English for English, etc.). Templates: references/cleanup_guide.md § Output Skeleton Template.
- Do not add footnotes (e.g. [^1]: 出自...) or [?...] placeholders.

When a fragment cannot be cleaned with confidence, trim it rather than guess or annotate. Full rule list: references/cleanup_guide.md § What NOT To Do.
After Step 7 has written transcript.md (and any transcript_*.md translations), run:
python scripts/rename_with_title.py <WORK_DIR>/<video_id>
This renames the final transcript file(s) to include the sanitized video title as a prefix, while keeping <video_id> as a stable suffix. Result format: <sanitized_title>__<video_id>.md. The script is idempotent — safe to re-run.
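The naming scheme can be illustrated with a short function. The sanitization rules here (replacing characters that Windows forbids in file names, collapsing whitespace, capping the title length) are assumptions for the sketch; rename_with_title.py may apply different rules, and `final_transcript_name` is a hypothetical name.

```python
import re


def final_transcript_name(title: str, video_id: str, max_title_len: int = 80) -> str:
    """Build <sanitized_title>__<video_id>.md from a raw video title."""
    # Assumption: replace characters that are invalid in Windows file names.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", title)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    cleaned = cleaned[:max_title_len].rstrip(" .") or "untitled"
    return f"{cleaned}__{video_id}.md"


def is_already_renamed(filename: str, video_id: str) -> bool:
    """Idempotency check: a renamed file already ends with __<video_id>.md."""
    return filename.endswith(f"__{video_id}.md")
```

Keeping `<video_id>` as the suffix is what makes the second check possible, so a re-run can skip files that already carry the final name.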
Clean up intermediate files. After renaming, delete all intermediate artifacts (.srt, .srv3, sentences.txt, .info.json, metadata.json, etc.); only the final renamed transcript(s) should remain in the output directory. The agent should deliver only the final transcript file to the user.
# Example cleanup (Windows PowerShell) — keep only the renamed transcript(s)
Get-ChildItem "<WORK_DIR>/<video_id>" -File | Where-Object {
$_.Name -notlike "*__*.md"
} | Remove-Item -Force
Why rename only the transcript. Since intermediate files (SRT, sentences.txt, metadata.json) are cleaned up and not delivered, the rename logic only needs to handle the final transcript.md and any translation variants (transcript_*.md). This keeps the output directory clean with just the deliverable(s).
references/troubleshooting.md is a recovery matrix organized by failure category (tools & environment, network & download, authentication, subtitle availability, encoding & display). Always retry once before reporting transient network issues.
scripts/
- fetch_metadata.py: unified Bilibili/YouTube metadata fetcher (JSON to stdout). Exit 1 (URL invalid) / 2 (tool missing) / 3 (parse failed).
- srt_to_sentences.py: deterministic SRT preprocessor; merges short segments into sentence-like lines. Exit 2 on empty/invalid input.
- srv3_to_text.py: fallback converter when yt-dlp's --convert-subs srt leaves you with raw srv3 XML. Stdlib-only. Exit 2 on empty input.
- rename_with_title.py: Step 8 helper; renames the final transcript file(s) with a sanitized video title prefix. Stdlib-only, idempotent. Exit 2 (metadata.json missing) / 3 (metadata.json missing fields).

references/
- install_tools.md: install procedures for BBDown / yt-dlp (Windows + POSIX), GitHub mirrors, ffmpeg workaround, PATH registration.
- auth.md: authentication flows: BBDown QR-code login, yt-dlp --cookies-from-browser, manual cookies.txt export. App-Bound Encryption pitfalls.
- cleanup_guide.md: methodology for Step 7: ASR error identification, Words to KEEP vs filler, sectioning heuristics, output skeleton.
- worked_example.md: one full walkthrough on a real video. Read once when learning; skip when you just need rules.
- troubleshooting.md: recovery matrix.