YouTube Transcript (yt-dlp captions)

v1.0.5

Extract YouTube video transcripts from existing captions (manual or auto-generated) using yt-dlp, with optional timestamps and local SQLite caching. Use when...

by Subhadip Sarkar (@itzsubhadip)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for itzsubhadip/youtube-transcript-yt-dlp.

Install the skill "YouTube Transcript (yt-dlp captions)" (itzsubhadip/youtube-transcript-yt-dlp) from ClawHub.
Skill page: https://clawhub.ai/itzsubhadip/youtube-transcript-yt-dlp
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required binaries: python3, yt-dlp
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install youtube-transcript-yt-dlp

ClawHub CLI

npx clawhub@latest install youtube-transcript-yt-dlp

Security Scan

VirusTotal: Suspicious
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (YouTube transcript via yt-dlp) matches the actual requirements and behavior: the skill requires python3 and yt-dlp, validates inputs as YouTube URLs/IDs, prefers manual captions then auto captions, and caches results. No unrelated binaries, env vars, or credentials are requested.
Instruction Scope
SKILL.md instructions stay within scope: they only instruct running the provided script, optionally supplying a cookies file or env var for authenticated YouTube access, and explicitly warn against forwarding arbitrary flags or publishing cookies. The runtime instructions and the script focus on YouTube and local caching; they do not ask to read unrelated files or send data to third-party transcript providers.
Install Mechanism
There is no install spec (instruction-only with an included script). That reduces supply-chain risk. The script relies on system yt-dlp and python3 rather than fetching arbitrary remote code during install.
Credentials
No required environment variables or credentials are declared. The only optional secret is cookies (YT_TRANSCRIPT_COOKIES), which is appropriate for accessing authenticated YouTube sessions. Cookies are handled as secrets and the code filters cookie entries to YouTube/Google domains.
Persistence & Privilege
The skill is not always-enabled (always:false) and does not declare elevated privileges. It writes a local SQLite cache and supports storing cookies under ~/.config/yt-transcript/ or {baseDir}/cache/, which is reasonable for a caching/transcript tool. Note: writing cache files into {baseDir}/cache can cause artifacts inside the skill directory if the agent's baseDir is a location that gets published—SKILL.md warns about this.
Assessment
This skill appears to do what it claims: extract YouTube captions using yt-dlp and an optional YouTube transcript-panel fallback. Before installing, ensure you have yt-dlp on PATH and understand that the script will contact YouTube (network access) and write a local cache (default {baseDir}/cache/transcripts.sqlite or ~/.config/yt-transcript/). If you need authenticated access on restricted IPs, provide a cookies.txt file in Netscape format and keep it outside the skill folder (e.g., ~/.config/yt-transcript/) because cookies are secrets and can be mispublished. Finally, review and approve running subprocesses (yt-dlp) in your environment since the script invokes yt-dlp and performs HTTP requests to YouTube.


Runtime requirements

OS: Linux · macOS · Windows
Bins: python3, yt-dlp
latest: vk977s7zyvkpe8cz35z07fvp2bd8155qb
Downloads: 2.1k
Stars: 0
Versions: 6
Updated 6h ago
License: MIT-0

YouTube Transcript (Captions-Only)

This skill extracts transcripts from existing YouTube captions.

Primary behavior

  • Prefer manual subtitles when available.
  • Fall back to auto-generated captions.
  • Output either:
    • JSON segments (default) or
    • plain text (--text)
  • Cache results locally in SQLite for speed.
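
The preference order above (manual subtitles first, auto-generated captions second) can be sketched as a small selection function. This is an illustration, not the skill's actual code: it assumes a metadata dictionary of the kind yt-dlp emits, with `subtitles` and `automatic_captions` keys mapping language codes to track lists, and `pick_caption_source` is a hypothetical name.

```python
from typing import Optional, Tuple


def pick_caption_source(info: dict, lang: str = "en") -> Optional[Tuple[str, list]]:
    """Return ("manual" | "auto", tracks) for `lang`, preferring manual subtitles.

    `info` is assumed to look like yt-dlp's video metadata, where both
    "subtitles" and "automatic_captions" map language codes to track lists.
    """
    manual = info.get("subtitles") or {}
    auto = info.get("automatic_captions") or {}
    if manual.get(lang):
        return "manual", manual[lang]
    if auto.get(lang):
        return "auto", auto[lang]
    return None  # no captions for this language; a fallback would go here


if __name__ == "__main__":
    sample = {
        "subtitles": {"en": [{"ext": "vtt"}]},
        "automatic_captions": {"en": [{"ext": "json3"}], "de": [{"ext": "vtt"}]},
    }
    print(pick_caption_source(sample, "en")[0])
    print(pick_caption_source(sample, "de")[0])
```

Returning the source label together with the tracks keeps the `source` field of the output easy to populate.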

Reliability behavior

  • If YouTube blocks anonymous access (bot-check), provide cookies.txt.
  • If yt-dlp reports no captions for a video, the script falls back to YouTube’s transcript panel (youtubei get_transcript) when it is accessible.

Privacy note: This published version contacts only YouTube, directly (via yt-dlp and the transcript panel fallback). It intentionally does not call third-party transcript providers, so video IDs and URLs are never sent to them.

Cookies: Cookies are treated as secrets.

  • The script supports --cookies / YT_TRANSCRIPT_COOKIES, but does not auto-load cookies from inside the skill directory.
  • Store cookies under ~/.config/yt-transcript/.

Path safety: This skill restricts --cookies and --cache paths to approved directories.

  • cookies allowed under: ~/.config/yt-transcript/
  • cache allowed under: {baseDir}/cache/ and ~/.config/yt-transcript/
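
One way such a path restriction could work is to resolve the candidate path and check that it sits inside an approved directory. The sketch below is an assumption about the technique, not the script's real implementation; `is_under` is a hypothetical helper and the example directories are placeholders.

```python
from pathlib import Path
from typing import Iterable


def is_under(path: str, allowed_dirs: Iterable[str]) -> bool:
    """True if `path` resolves to a location inside one of `allowed_dirs`.

    Both sides are expanded and resolved first, so "~", ".." segments,
    and symlinks cannot be used to step outside the approved directories.
    """
    resolved = Path(path).expanduser().resolve()
    for base in allowed_dirs:
        base_resolved = Path(base).expanduser().resolve()
        try:
            resolved.relative_to(base_resolved)  # raises ValueError if outside
            return True
        except ValueError:
            continue
    return False


if __name__ == "__main__":
    cookie_dirs = ["~/.config/yt-transcript"]
    print(is_under("~/.config/yt-transcript/youtube-cookies.txt", cookie_dirs))
    print(is_under("~/.config/yt-transcript/../../.ssh/id_rsa", cookie_dirs))
```

The comparison is component-based, so a sibling directory such as `cache-evil` does not pass as being under `cache`.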

How to run

Script path:

  • {baseDir}/scripts/yt_transcript.py

Typical usage:

  • python3 {baseDir}/scripts/yt_transcript.py <youtube_url_or_id>
  • python3 {baseDir}/scripts/yt_transcript.py <url> --lang en
  • python3 {baseDir}/scripts/yt_transcript.py <url> --text
  • python3 {baseDir}/scripts/yt_transcript.py <url> --no-ts

Cookies (optional, but often required on VPS IPs):

  • python3 {baseDir}/scripts/yt_transcript.py <url> --cookies /path/to/youtube-cookies.txt
  • or set env var: YT_TRANSCRIPT_COOKIES=/path/to/youtube-cookies.txt

Publishing safety note: Cookies are optional, so YT_TRANSCRIPT_COOKIES is intentionally not required by skill metadata. Only set it if you need authenticated access.

Best practice: store cookies outside the skill folder (so you never accidentally publish them), e.g. ~/.config/yt-transcript/youtube-cookies.txt, and point to it via --cookies or YT_TRANSCRIPT_COOKIES.
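
When the script is driven from another Python program, the command can be assembled from a fixed set of options rather than from free-form strings. The following is a minimal sketch under that assumption: `build_command` and its parameters are hypothetical, and only the flags listed above (`--lang`, `--text`, `--no-ts`, `--cookies`) are used.

```python
import os
from typing import List, Optional


def build_command(
    base_dir: str,
    target: str,
    lang: Optional[str] = None,
    text: bool = False,
    no_ts: bool = False,
    cookies: Optional[str] = None,
) -> List[str]:
    """Build an argv list for yt_transcript.py from a closed set of options."""
    cmd = ["python3", os.path.join(base_dir, "scripts", "yt_transcript.py"), target]
    if lang:
        cmd += ["--lang", lang]
    if text:
        cmd.append("--text")
    if no_ts:
        cmd.append("--no-ts")
    if cookies:
        cmd += ["--cookies", cookies]
    return cmd


if __name__ == "__main__":
    # "/opt/skill" is a placeholder for {baseDir}.
    print(build_command("/opt/skill", "dQw4w9WgXcQ", lang="en", text=True))
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Because it is a list and no shell is involved, the target string is never interpreted as shell syntax.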

What the script returns

JSON mode (default)

A JSON object:

  • video_id: 11-char id
  • lang: chosen language
  • source: manual | auto | panel
  • segments: list of { start, duration, text } (or text-only when --no-ts)

Text mode (--text)

A newline-separated transcript.

  • By default timestamps are included as [12.34s].
  • Use --no-ts to output only the text lines.
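
To illustrate how the JSON segments relate to the text output, here is a sketch that renders a segment list as lines. The `[12.34s]` timestamp form comes from the description above. Placing it as a prefix followed by a space is an assumption, and `render_text` is a hypothetical name.

```python
from typing import Dict, List


def render_text(segments: List[Dict], include_ts: bool = True) -> str:
    """Render segments as newline-separated lines, optionally with timestamps."""
    lines = []
    for seg in segments:
        text = seg["text"].strip()
        if include_ts:
            # Two decimal places, matching the [12.34s] form.
            lines.append(f"[{seg['start']:.2f}s] {text}")
        else:
            lines.append(text)
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        {"start": 12.34, "duration": 2.0, "text": "hello"},
        {"start": 15.0, "duration": 1.5, "text": "world"},
    ]
    print(render_text(demo))
    print(render_text(demo, include_ts=False))
```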

Caching

Default cache DB:

  • {baseDir}/cache/transcripts.sqlite

Cache key includes:

  • video_id, lang, source, include_timestamp, format
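
A composite cache key like this maps naturally onto a SQLite primary key. The sketch below shows one possible layout. The table name, column types, and helper functions are assumptions for illustration and may differ from the schema the script actually uses.

```python
import json
import sqlite3
from typing import Optional, Tuple

SCHEMA = """
CREATE TABLE IF NOT EXISTS transcripts (
    video_id          TEXT    NOT NULL,
    lang              TEXT    NOT NULL,
    source            TEXT    NOT NULL,
    include_timestamp INTEGER NOT NULL,
    format            TEXT    NOT NULL,
    payload           TEXT    NOT NULL,
    PRIMARY KEY (video_id, lang, source, include_timestamp, format)
)
"""

Key = Tuple[str, str, str, int, str]


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the cache database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def cache_put(conn: sqlite3.Connection, key: Key, payload: dict) -> None:
    """Store a result, replacing any earlier entry with the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO transcripts VALUES (?, ?, ?, ?, ?, ?)",
        (*key, json.dumps(payload)),
    )
    conn.commit()


def cache_get(conn: sqlite3.Connection, key: Key) -> Optional[dict]:
    """Return the cached result for `key`, or None on a miss."""
    row = conn.execute(
        "SELECT payload FROM transcripts WHERE video_id=? AND lang=? "
        "AND source=? AND include_timestamp=? AND format=?",
        key,
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Including `include_timestamp` and `format` in the key means a `--text --no-ts` request never returns a cached JSON result with timestamps, and vice versa.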

Cookie handling (important)

  • Cookies must be in Netscape cookies.txt format.
  • Treat cookies as secrets.
  • Never commit / publish cookies to ClawHub.
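
The Netscape cookies.txt format stores one cookie per line as seven tab-separated fields, with the domain first. A cookie file can therefore be trimmed to the domains a YouTube-only tool needs before it is used. This is a sketch of that idea: the allowed suffixes and the `filter_cookie_lines` helper are assumptions, not the script's actual filter.

```python
from typing import Iterable, List

# Assumed allow-list for this sketch; adjust to what you actually need.
ALLOWED_SUFFIXES = ("youtube.com", "google.com")


def filter_cookie_lines(lines: Iterable[str]) -> List[str]:
    """Keep comments plus cookie rows whose domain is in ALLOWED_SUFFIXES."""
    kept = []
    for line in lines:
        raw = line.rstrip("\n")
        if not raw.strip():
            continue
        entry = raw
        if entry.startswith("#HttpOnly_"):
            # HttpOnly cookies are real entries carrying this prefix.
            entry = entry[len("#HttpOnly_"):]
        elif entry.startswith("#"):
            kept.append(raw)  # file header or comment
            continue
        fields = entry.split("\t")
        if len(fields) != 7:
            continue  # malformed row
        domain = fields[0].lstrip(".").lower()
        if any(domain == s or domain.endswith("." + s) for s in ALLOWED_SUFFIXES):
            kept.append(raw)
    return kept
```

Matching on the exact domain or a dot-prefixed suffix avoids accepting look-alike hosts such as `notyoutube.com`.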

Recommended local path (outside the skill folder, so it is never published):

  • ~/.config/yt-transcript/youtube-cookies.txt (chmod 600)

Notes (safety + reliability)

  • Only accept a YouTube URL or an 11-character video ID.
  • Do not forward arbitrary user-provided flags into the command.
  • If yt-dlp is missing, instruct the user to install it. The recommended route is pipx:
    • install pipx
    • pipx install yt-dlp
    • ensure yt-dlp is on PATH
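
The input and dependency checks in these notes could look roughly like the sketch below. It is a simplified illustration with hypothetical names (`extract_video_id`, `yt_dlp_available`). It handles only `watch?v=` and `youtu.be/` URL shapes, which is an assumption and may be narrower than what the real script accepts.

```python
import re
import shutil
from typing import Optional
from urllib.parse import parse_qs, urlparse

VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")
# Assumed host allow-list for this sketch.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def extract_video_id(value: str) -> Optional[str]:
    """Return the 11-character video ID, or None if the input is not acceptable."""
    value = value.strip()
    if VIDEO_ID_RE.fullmatch(value):
        return value
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or parsed.hostname not in YOUTUBE_HOSTS:
        return None
    if parsed.hostname == "youtu.be":
        candidate = parsed.path.lstrip("/")
    else:
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    return candidate if VIDEO_ID_RE.fullmatch(candidate) else None


def yt_dlp_available() -> bool:
    """True if a yt-dlp executable can be found on PATH."""
    return shutil.which("yt-dlp") is not None


if __name__ == "__main__":
    print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))
    if not yt_dlp_available():
        print("yt-dlp not found on PATH; try: pipx install yt-dlp")
```

Passing only the extracted ID onward, rather than the original string, keeps anything else the user typed out of the command line.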
