Install
openclaw skills install youtube-video-transcript
Fetch, summarize, and save YouTube transcripts with timestamp navigation, chapter detection, and searchable content.
Most YouTube transcript tools require paid APIs, route requests through suspicious proxies, or dump raw text without structure. This skill extracts transcripts locally using yt-dlp, preserves timestamps for navigation, detects chapters automatically, and exports to Markdown, SRT, plain text, or JSON.
User shares a YouTube link and wants to read instead of watch. User asks what someone says about a topic at a specific moment. User needs to extract quotes with timestamps for research or content creation. User wants to summarize a video or search within its content.
   ┌──────────────────────────────────────────────┐
   │           YOUTUBE TRANSCRIPT FLOW            │
   └──────────────────────────────────────────────┘
                          │
     ┌────────────────────┼────────────────────┐
     ▼                    ▼                    ▼
┌─────────┐          ┌──────────┐         ┌─────────┐
│  VIDEO  │          │ METADATA │         │SUBTITLES│
│   URL   │          │  FETCH   │         │  CHECK  │
└────┬────┘          └────┬─────┘         └────┬────┘
     │                    │                    │
     │ youtube.com/       │ Title, duration,   │ Manual first,
     │ watch?v=...        │ chapters, lang     │ auto fallback
     │                    │                    │
     └────────────────────┴────────────────────┘
                          │
                          ▼
                 ┌─────────────────┐
                 │ EXTRACT + CLEAN │
                 │ VTT → Markdown  │
                 │ with timestamps │
                 └────────┬────────┘
                          │
          ┌───────────────┼───────────────┐
          ▼               ▼               ▼
    ┌──────────┐    ┌───────────┐   ┌──────────┐
    │ CHAPTERS │    │  SEARCH   │   │  EXPORT  │
    │ detected │    │ by topic  │   │ MD/SRT/  │
    │ or smart │    │ timestamp │   │ TXT/JSON │
    └──────────┘    └───────────┘   └──────────┘
Always fetch video info before extracting subtitles:
yt-dlp -j "VIDEO_URL"
This gives you title, duration, official chapters, and available languages. Use it to confirm the right video and check what subtitles exist.
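As a rough illustration of what to look for in that output, the sketch below reads the JSON printed by `yt-dlp -j` and keeps only the fields this workflow uses. The function name `summarize_metadata` and the shape of the returned dictionary are hypothetical; the keys read from the input (`title`, `duration`, `chapters`, `subtitles`, `automatic_captions`) are fields of yt-dlp's info JSON.

```python
import json


def summarize_metadata(raw_json: str) -> dict:
    """Keep only the metadata fields needed before extracting subtitles."""
    info = json.loads(raw_json)
    return {
        "title": info.get("title"),
        "duration_seconds": info.get("duration"),
        # "chapters" may be missing or null when the video has none.
        "chapter_count": len(info.get("chapters") or []),
        "manual_languages": sorted(info.get("subtitles") or {}),
        "auto_languages": sorted(info.get("automatic_captions") or {}),
    }


# Hand-written stand-in for real `yt-dlp -j` output.
sample = json.dumps({
    "title": "Intro to ML",
    "duration": 754,
    "chapters": [{"start_time": 0, "end_time": 300, "title": "Basics"}],
    "subtitles": {"en": []},
    "automatic_captions": {"en": [], "es": []},
})
print(summarize_metadata(sample))
```

In practice the input string would come from running the command above and capturing its standard output.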
Manual (uploaded) subtitles are higher quality than auto-generated:
# Try manual first
yt-dlp --write-sub --sub-lang en --skip-download "VIDEO_URL"
# Fall back to auto-generated if manual unavailable
yt-dlp --write-auto-sub --sub-lang en --skip-download "VIDEO_URL"
Auto-generated transcripts often have errors, missing punctuation, and wrong word boundaries. Manual subtitles are human-verified.
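The manual-first rule can be decided from the metadata before any download. The following is a minimal sketch, assuming the info JSON has already been parsed into a dictionary; `pick_subtitle_source` and `ytdlp_args` are illustrative names, and the match is an exact language code (a variant such as `en-US` would need extra handling).

```python
from typing import Optional


def pick_subtitle_source(info: dict, lang: str = "en") -> Optional[str]:
    """Return "manual", "auto", or None for the requested language."""
    if lang in (info.get("subtitles") or {}):
        return "manual"
    if lang in (info.get("automatic_captions") or {}):
        return "auto"
    return None


def ytdlp_args(source: str, lang: str, url: str) -> list:
    """Build the matching command line, using the flags shown above."""
    flag = "--write-sub" if source == "manual" else "--write-auto-sub"
    return ["yt-dlp", flag, "--sub-lang", lang, "--skip-download", url]


info = {"subtitles": {"en": []}, "automatic_captions": {"en": [], "de": []}}
print(pick_subtitle_source(info, "en"))  # manual subtitles win when both exist
print(ytdlp_args("auto", "de", "VIDEO_URL"))
```

Returning `None` rather than raising keeps the "no subtitles found" case explicit, so the caller can tell the user instead of failing.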
Every segment must include timestamps. Format: [HH:MM:SS] or [MM:SS] for videos under 1 hour.
Why this matters: Users need to jump to specific moments. "Take me to where they discuss pricing" requires knowing the timestamp.
Output format:
[00:00] Welcome to this video about machine learning
[00:15] Today we'll cover three main topics
[00:30] First, let's talk about neural networks
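A minimal sketch of how a downloaded VTT file could be turned into that output format follows. It is not the skill's actual converter: the function names are hypothetical, the parser handles only plain cue blocks, and it assumes the start time of the last cue is a good enough proxy for "video is one hour or longer" when choosing between `[MM:SS]` and `[HH:MM:SS]`.

```python
import re

# Start time of a cue line such as "00:00:15.000 --> 00:00:30.000".
CUE = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})\.\d{3}\s+-->")
TAG = re.compile(r"<[^>]+>")  # inline markup such as <c> or <00:00:01.000>


def stamp(seconds: int, long_form: bool) -> str:
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"[{h:02d}:{m:02d}:{s:02d}]" if long_form else f"[{m:02d}:{s:02d}]"


def vtt_to_lines(vtt: str) -> list:
    segments = []  # (start_seconds, text)
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n")):
        lines = [line.strip() for line in block.strip().splitlines()]
        for i, line in enumerate(lines):
            match = CUE.match(line)
            if not match:
                continue  # header, NOTE, or cue identifier
            h, m, s = (int(g or 0) for g in match.groups())
            # Join multi-line cue text, drop tags, collapse whitespace.
            text = " ".join(TAG.sub("", " ".join(lines[i + 1:])).split())
            # Skip empty cues and exact repeats of the previous cue.
            if text and (not segments or segments[-1][1] != text):
                segments.append((h * 3600 + m * 60 + s, text))
            break
    long_form = bool(segments) and segments[-1][0] >= 3600
    return [f"{stamp(t, long_form)} {text}" for t, text in segments]


sample = """WEBVTT
Kind: captions
Language: en

00:00:00.000 --> 00:00:15.000
Welcome to this video about <c>machine learning</c>

00:00:15.000 --> 00:00:30.000
Today we'll cover
three main topics
"""
print("\n".join(vtt_to_lines(sample)))
```

Auto-generated captions often repeat text across overlapping cues; the exact-repeat check above removes only the simplest case.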
Many videos have chapter markers embedded. Extract from metadata:
yt-dlp -j "VIDEO_URL" | jq '.chapters'
When video lacks chapters, detect natural breaks from transcript:
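One possible heuristic, offered here as an assumption rather than the skill's documented method, is to treat a long pause between cues as a section boundary. The function name `detect_breaks` and the 8-second default threshold are hypothetical and would need tuning per video.

```python
def detect_breaks(segments, min_gap=8.0):
    """Return the start times that open a new section.

    segments: ordered list of (start_seconds, end_seconds, text).
    A new section starts wherever the silence since the previous
    cue ended is at least min_gap seconds.
    """
    if not segments:
        return []
    breaks = [segments[0][0]]  # the first cue always opens a section
    for prev, cur in zip(segments, segments[1:]):
        if cur[0] - prev[1] >= min_gap:
            breaks.append(cur[0])
    return breaks


cues = [(0, 5, "intro"), (5, 10, "more intro"), (25, 30, "new topic")]
print(detect_breaks(cues))
```

Pauses alone miss topic changes in continuous speech, so this works best as a first pass before any text-based check.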
When user asks "where do they talk about X":
Response format:
Found 3 mentions of "machine learning":
[05:23] "...this is where machine learning really shines..."
Context: Discussing data processing approaches
[12:45] "...traditional methods vs machine learning..."
Context: Comparison section
Generate clickable links: https://youtube.com/watch?v=VIDEO_ID&t=323
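The search and link steps can be sketched as follows, assuming the transcript is already held as `(start_seconds, text)` pairs. The function names and the result layout are illustrative, matching is a plain case-insensitive substring test, and the timestamp switches to `[HH:MM:SS]` per hit rather than per video, which is a simplification.

```python
def stamp(seconds: int) -> str:
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"[{h:02d}:{m:02d}:{s:02d}]" if h else f"[{m:02d}:{s:02d}]"


def search_transcript(segments, query: str, video_id: str) -> list:
    """Return every segment mentioning the query, with a deep link."""
    needle = query.lower()
    hits = []
    for start, text in segments:
        if needle in text.lower():
            hits.append({
                "timestamp": stamp(start),
                "quote": text,
                # The t parameter is the offset in whole seconds.
                "link": f"https://youtube.com/watch?v={video_id}&t={start}",
            })
    return hits


segments = [
    (323, "this is where machine learning really shines"),
    (400, "let's look at the dataset"),
    (765, "traditional methods vs Machine Learning"),
]
for hit in search_transcript(segments, "machine learning", "VIDEO_ID"):
    print(hit["timestamp"], hit["link"])
```

The `t=323` in the link above is simply 5 minutes 23 seconds expressed in seconds (5 × 60 + 23).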
Memory lives in ~/youtube-video-transcript/. See memory-template.md for structure.
~/youtube-video-transcript/
├── memory.md # Preferences + recent videos
├── videos/ # Cached transcripts (with consent)
│ └── {video_id}.md # Individual video data
└── exports/ # Exported files
| Topic | File |
|---|---|
| Setup process | setup.md |
| Memory template | memory-template.md |
| Advanced patterns | patterns.md |
Always run yt-dlp -j URL first. This confirms the video, shows available languages, and reveals official chapters. Never extract blind.
| Subtitle Type | Quality | When to Use |
|---|---|---|
| Manual | High | Always try first |
| Auto-generated | Medium | Fallback only |
Check with yt-dlp --list-subs URL for unfamiliar channels.
Never strip timestamps during any operation. They enable navigation, citation, and deep linking into the video.
| User Response | Action |
|---|---|
| "Yes, save it" | Cache to ~/youtube-video-transcript/videos/ |
| "No thanks" | Don't cache, show once |
| Not asked yet | Ask after first extraction |
Always tell user where files are saved and offer to show or delete them.
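The consent rule can be enforced in code by making the write impossible without an explicit yes. This is a minimal sketch: `cache_transcript` is a hypothetical helper, and the example writes into a temporary directory instead of the real `~/youtube-video-transcript/` folder.

```python
import tempfile
from pathlib import Path
from typing import Optional

DEFAULT_BASE = Path.home() / "youtube-video-transcript"


def cache_transcript(video_id: str, markdown: str, consent: bool,
                     base_dir: Path = DEFAULT_BASE) -> Optional[Path]:
    """Write videos/{video_id}.md only when the user agreed; return its path."""
    if not consent:
        return None  # show once, leave nothing on disk
    target = base_dir / "videos" / f"{video_id}.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(markdown, encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as tmp:
    declined = cache_transcript("abc123", "[00:00] hi", consent=False, base_dir=Path(tmp))
    saved = cache_transcript("abc123", "[00:00] hi", consent=True, base_dir=Path(tmp))
    saved_text = saved.read_text(encoding="utf-8")
    print(f"Saved to {saved}")  # tell the user where the file went
```

Returning the path makes it easy to report the location to the user and to offer deletion later.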
If user doesn't specify:
yt-dlp --list-subs "VIDEO_URL"
When extracting quotes for research:
| Subtitle Type | Tell User |
|---|---|
| Manual | "Using official subtitles" |
| Auto-generated | "Using auto-generated (may have errors)" |
| None available | "No subtitles found for this video" |
| Format | Use Case | Command |
|---|---|---|
| Markdown | Reading, notes | Default |
| SRT | Video editors | --convert-subs srt |
| Plain text | Search, grep | Strip timestamps in the exported copy only |
| JSON | Programmatic | --write-info-json |
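When the transcript is already in memory, the non-default formats can also be produced directly from the segments. The sketch below assumes `(start_seconds, end_seconds, text)` tuples; the function names and the JSON layout are hypothetical and are not what `--write-info-json` produces, which is video metadata.

```python
import json


def srt_time(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments) -> str:
    blocks = []
    for index, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)  # blank line between numbered cues


def to_json(segments) -> str:
    rows = [{"start": s, "end": e, "text": t} for s, e, t in segments]
    return json.dumps(rows, indent=2)


def to_plain_text(segments) -> str:
    return "\n".join(text for _, _, text in segments)


segments = [(0.0, 15.0, "Welcome"), (15.0, 30.5, "Today")]
print(to_srt(segments))
```

The plain-text export drops timestamps by design, so it should be written as a separate file while the timestamped transcript stays intact.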
| Trap | Consequence | Prevention |
|---|---|---|
| Not checking subtitles first | Wasted time on unavailable video | Always --list-subs first |
| Ignoring auto-generated quality | Garbage text with errors | Prefer manual, warn about auto |
| Losing timestamps | Can't navigate video | Never strip in any operation |
| Extracting without metadata | Missing title, chapters | Always fetch -j first |
| Caching without consent | Privacy violation | Ask before saving |
| User Says | Action |
|---|---|
| "Transcribe this video" | Extract + display |
| "What do they say about X?" | Search + timestamps |
| "Save this transcript" | Cache with confirmation |
| "Export as SRT" | Convert format |
| "Show saved videos" | List ~/youtube-video-transcript/videos/ |
| "Delete video X" | Remove from cache |
Data that stays local (with your consent):
Transparency guarantees:
This skill does NOT:
Install with clawhub install <slug> if user confirms:
summarizer: create summaries from any content
video-captions: generate and edit video subtitles
ffmpeg: advanced video and audio processing
clawhub star youtube-video-transcript
clawhub sync