TikTok Crawling (yt-dlp)
Use for TikTok crawling, content retrieval, and analysis
MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description claim TikTok crawling via yt-dlp and the SKILL.md contains only yt-dlp usage patterns, filters, metadata exports, and scheduling examples — everything requested is coherent with a scraping/downloading tool guide. No unrelated credentials, binaries, or installs are demanded by the skill itself.
Instruction Scope
Instructions are focused on scraping and metadata extraction, but they explicitly recommend using --cookies-from-browser (or exporting cookies to a file), cron scheduling, and VPN/geo-bypass techniques. Those steps expand operational scope (accessing browser cookie stores, running scheduled background jobs, altering network identity) and carry privacy, legal, and operational implications even though they are relevant to accessing private/restricted content.
Install Mechanism
This is an instruction-only skill with no install spec or bundled code. The doc suggests standard, well-known install methods (brew, pip) for yt-dlp/ffmpeg but does not perform downloads itself — low install risk.
Credentials
The skill requests no environment variables or credentials, which matches its instruction-only nature. However, the runtime instructions advise accessing browser cookie stores and storing cookies files (sensitive local data) without declaring this as a required config; accessing browser cookies is sensitive and should be treated as such if you follow these instructions.
Persistence & Privilege
Skill flags are default (not always:true). The guide shows how to set up cron jobs or scripts for ongoing scraping, but the skill itself does not request persistent privileges or modify other skills/config — persistence is an operational choice the user would make when deploying the commands.
Assessment
This guide appears internally consistent for a yt-dlp–based TikTok scraper, but pay attention to privacy, legal, and operational risks before using it. Don't hand over browser cookies or cookie files unless you trust the environment (cookies can grant account access). Run scraping in an isolated account, container, or VM to limit exposure, and avoid running scheduled jobs as root. Respect TikTok's terms of service and copyright laws; rate-limit your requests and monitor storage (downloads can be large). If you need to access private/restricted content, prefer using dedicated, minimal credentials or ephemeral cookies and delete them when done. If you’re unsure about legality or data sensitivity, consult legal/privacy resources before proceeding.
Current version: v1.0.0
SKILL.md
TikTok Scraping with yt-dlp
yt-dlp is a CLI for downloading video/audio from TikTok and many other sites.
Setup
# macOS
brew install yt-dlp ffmpeg
# pip (any platform)
pip install yt-dlp
# Also install ffmpeg separately for merging/post-processing
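Before a long scrape it can help to confirm that both tools are actually on PATH. The sketch below is one way to do that; missing_tools is a hypothetical helper name, not part of yt-dlp.

```bash
# Print the name of each listed tool that is not found on PATH.
# Prints nothing when every tool is installed.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage: warn before starting a scrape.
# missing="$(missing_tools yt-dlp ffmpeg)"
# [ -z "$missing" ] || echo "Install first: $missing" >&2
```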
Download Patterns
Single Video
yt-dlp "https://www.tiktok.com/@handle/video/1234567890"
Entire Profile
yt-dlp "https://www.tiktok.com/@handle" \
-P "./tiktok/data" \
-o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
--write-info-json
Creates:
tiktok/data/
handle/
20260220-7331234567890/
video.mp4
video.info.json
Multiple Profiles
for handle in handle1 handle2 handle3; do
yt-dlp "https://www.tiktok.com/@$handle" \
-P "./tiktok/data" \
-o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
--write-info-json \
--download-archive "./tiktok/downloaded.txt"
done
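For longer creator lists, the handles can live in a text file instead of the loop header. This is a minimal sketch under assumptions: handles.txt and read_handles are hypothetical names, and the file is assumed to hold one handle per line with optional # comments.

```bash
# Print one clean handle per line from a list file.
# Strips '#' comments (whole-line or trailing), surrounding whitespace,
# a leading '@', and drops lines that end up empty.
read_handles() {
  sed -e 's/#.*$//' \
      -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e 's/^@//' \
      -e '/^$/d' "$1"
}

# Usage: feed the cleaned handles into the same yt-dlp call as above.
# read_handles handles.txt | while IFS= read -r handle; do
#   yt-dlp "https://www.tiktok.com/@$handle" \
#     -P "./tiktok/data" \
#     --download-archive "./tiktok/downloaded.txt"
# done
```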
Search, Hashtags & Sounds
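# Search by keyword (assumes your yt-dlp build provides a tiktoksearch: prefix;
# confirm with: yt-dlp --list-extractors | grep -i tiktok)
yt-dlp "tiktoksearch:cooking recipes" --playlist-end 20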
# Hashtag page
yt-dlp "https://www.tiktok.com/tag/booktok" --playlist-end 50
# Videos using a specific sound
yt-dlp "https://www.tiktok.com/music/original-sound-1234567890" --playlist-end 30
Format Selection
# List available formats
yt-dlp -F "https://www.tiktok.com/@handle/video/1234567890"
# Download a specific format: replace best with a format ID from the -F listing
# (some listings include a variant without a watermark)
yt-dlp -f "best" "https://www.tiktok.com/@handle/video/1234567890"
Filtering
By Date
# On or after a date
--dateafter 20260215
# On or before a date
--datebefore 20260220
# Exact date
--date 20260215
# Date range
--dateafter 20260210 --datebefore 20260220
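# Relative dates (built into yt-dlp, any platform)
--dateafter today-7days # last 7 days
# Relative dates via the shell (macOS / Linux)
--dateafter "$(date -u -v-7d +%Y%m%d)" # macOS: last 7 days
--dateafter "$(date -u -d '7 days ago' +%Y%m%d)" # Linux: last 7 days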
By Metrics & Content
# 100k+ views
--match-filters "view_count >= 100000"
# Duration between 30-60 seconds
--match-filters "duration >= 30 & duration <= 60"
# Title contains "recipe" (case-insensitive)
--match-filters "title ~= (?i)recipe"
# Combine: 50k+ views from Feb 2026
yt-dlp "https://www.tiktok.com/@handle" \
--match-filters "view_count >= 50000" \
--dateafter 20260201 \
--datebefore 20260228
Metadata Only (No Download)
Preview What Would Download
yt-dlp "https://www.tiktok.com/@handle" \
--simulate \
--print "%(upload_date)s | %(view_count)s views | %(title)s"
Export to JSON
# Single JSON document (one playlist object with an "entries" array)
yt-dlp "https://www.tiktok.com/@handle" --simulate --dump-single-json > handle_videos.json
# JSONL (one object per line, better for large datasets); -j is short for --dump-json
yt-dlp "https://www.tiktok.com/@handle" --simulate -j > handle_videos.jsonl
Export to CSV
yt-dlp "https://www.tiktok.com/@handle" \
--simulate \
--print-to-file "%(uploader)s,%(id)s,%(upload_date)s,%(view_count)s,%(like_count)s,%(webpage_url)s" \
"./tiktok/analysis/metadata.csv"
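The export above writes data rows only, and yt-dlp's documentation describes --print-to-file as appending to the file. A header row can therefore be written once, before the first run. The sketch below assumes the column order of the template above; init_csv is a hypothetical helper name.

```bash
# Create the CSV with a header row unless the file already has content.
# Column names mirror the fields in the --print-to-file template.
init_csv() {
  csv="$1"
  mkdir -p "$(dirname "$csv")"
  [ -s "$csv" ] || printf '%s\n' \
    'uploader,id,upload_date,view_count,like_count,webpage_url' > "$csv"
}

# Usage: run once before the yt-dlp export.
# init_csv "./tiktok/analysis/metadata.csv"
```

Calling it again on a file that already has rows leaves the file unchanged, so it is safe to keep in a scheduled script.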
Analyze with jq
# Top 10 videos by views from downloaded .info.json files
# (files sit three levels down: tiktok/data/<uploader>/<date>-<id>/video.info.json)
jq -s 'sort_by(.view_count) | reverse | .[:10] | .[] | {title, view_count, url: .webpage_url}' \
tiktok/data/*/*/*.info.json
# Total views across all videos
jq -s 'map(.view_count) | add' tiktok/data/*/*/*.info.json
# Videos grouped by upload date
jq -s 'group_by(.upload_date) | map({date: .[0].upload_date, count: length})' \
tiktok/data/*/*/*.info.json
Tip: For deeper analysis and visualization, load JSONL/CSV exports into Python with pandas. Useful for engagement scatter plots, posting frequency charts, or comparing metrics across creators.
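For quick checks on the CSV export, standard shell tools are often enough. The sketch below assumes the column order from the Export to CSV template (view_count in the fourth column) and that no field contains a comma; top_by_views and total_views are hypothetical helper names.

```bash
# Print the N rows with the highest view_count (4th column), highest first.
# Rows whose view_count is not a plain number (a header row, or NA) are ignored.
top_by_views() {
  csv="$1"
  n="${2:-10}"
  awk -F, '$4 ~ /^[0-9]+$/' "$csv" | sort -t, -k4,4nr | head -n "$n"
}

# Sum the view_count column over all rows with a numeric value.
total_views() {
  awk -F, '$4 ~ /^[0-9]+$/ { sum += $4 } END { printf "%.0f\n", sum }' "$1"
}

# Usage:
# top_by_views ./tiktok/analysis/metadata.csv 5
# total_views ./tiktok/analysis/metadata.csv
```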
Ongoing Scraping
Archive (Skip Already Downloaded)
The --download-archive flag tracks downloaded videos, enabling incremental updates:
yt-dlp "https://www.tiktok.com/@handle" \
-P "./tiktok/data" \
-o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
--write-info-json \
--download-archive "./tiktok/downloaded.txt"
Run the same command later—it skips videos already in downloaded.txt.
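The archive is a plain text file, so it can be inspected directly. The sketch below assumes entries of the form "tiktok <video id>", one per line; check your own downloaded.txt to confirm the layout. is_archived and archive_count are hypothetical helper names.

```bash
# Return 0 if the given video ID is recorded in the archive file.
# Assumption: each entry is the literal line "tiktok <id>".
is_archived() {
  grep -Fxq "tiktok $1" "$2" 2>/dev/null
}

# Print the number of archive entries (0 if the file does not exist yet).
archive_count() {
  [ -f "$1" ] || { echo 0; return 0; }
  wc -l < "$1" | tr -d ' '
}

# Usage:
# is_archived 7331234567890 ./tiktok/downloaded.txt && echo "already downloaded"
# echo "Archived videos: $(archive_count ./tiktok/downloaded.txt)"
```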
Authentication (Private/Restricted Content)
# Use cookies from browser (recommended)
yt-dlp --cookies-from-browser chrome "https://www.tiktok.com/@handle"
# Or export cookies to a Netscape-format cookies.txt file first
yt-dlp --cookies tiktok_cookies.txt "https://www.tiktok.com/@handle"
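An exported cookies file can grant access to the account, so it is worth restricting who can read it. The sketch below is one way to do that on macOS or Linux; lock_cookie_file is a hypothetical helper name.

```bash
# Restrict a cookies file to the current user (mode 600).
# Returns non-zero if the file does not exist.
lock_cookie_file() {
  [ -f "$1" ] || { echo "No such cookies file: $1" >&2; return 1; }
  chmod 600 "$1"
}

# Usage: lock the file before scraping, delete it when done.
# lock_cookie_file tiktok_cookies.txt
# yt-dlp --cookies tiktok_cookies.txt "https://www.tiktok.com/@handle"
# rm -f tiktok_cookies.txt
```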
Scheduled Scraping (Cron)
# crontab -e
# Run daily at 2 AM, log output
0 2 * * * cd /path/to/project && ./scripts/scrape-tiktok.sh >> ./tiktok/logs/cron.log 2>&1
Example scripts/scrape-tiktok.sh:
#!/bin/bash
set -e
HANDLES="handle1 handle2 handle3"
DATA_DIR="./tiktok/data"
ARCHIVE="./tiktok/downloaded.txt"
for handle in $HANDLES; do
echo "[$(date)] Scraping @$handle"
# today-7days is yt-dlp's built-in relative date, so this runs on macOS and Linux.
# If yt-dlp fails for one handle, log a warning and continue with the next.
yt-dlp "https://www.tiktok.com/@$handle" \
-P "$DATA_DIR" \
-o "%(uploader)s/%(upload_date)s-%(id)s/video.%(ext)s" \
--write-info-json \
--download-archive "$ARCHIVE" \
--cookies-from-browser chrome \
--dateafter today-7days \
--sleep-interval 2 \
--max-sleep-interval 5 \
|| echo "[$(date)] WARNING: yt-dlp failed for @$handle" >&2
done
echo "[$(date)] Done"
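If a run takes longer than the cron interval, two copies of the script could write to the same archive at once. One common guard is a lock directory, sketched below under assumptions: the lock path and the acquire_lock / release_lock names are hypothetical, and the parent directory of the lock is assumed to exist.

```bash
# Lock directory; mkdir is atomic, so only one process can create it.
LOCK_DIR="${LOCK_DIR:-./tiktok/scrape.lock}"   # hypothetical path

# Return 0 if this process obtained the lock, non-zero otherwise.
acquire_lock() {
  mkdir "$LOCK_DIR" 2>/dev/null
}

# Remove the lock; harmless if it is already gone.
release_lock() {
  rmdir "$LOCK_DIR" 2>/dev/null || true
}

# Usage near the top of scripts/scrape-tiktok.sh:
# acquire_lock || { echo "[$(date)] Previous run still active, exiting"; exit 0; }
# trap release_lock EXIT
```

If the script is killed without running its EXIT trap, the directory remains and has to be removed by hand before the next run can start.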
Troubleshooting
| Problem | Solution |
|---|---|
| Empty results / no videos found | Add --cookies-from-browser chrome — TikTok rate-limits anonymous requests |
| 403 Forbidden errors | Rate limited. Wait 10-15 min, or use cookies/different IP |
| "Video unavailable" | Region-locked. Try --geo-bypass or a VPN |
| Watermarked videos | Check -F for alternative formats; some may lack watermark |
| Slow downloads | Add --concurrent-fragments 4; this only helps when the selected format is fragmented (HLS/DASH), not for single-file downloads |
| Profile shows fewer videos than expected | TikTok API limits. Use --playlist-end N explicitly, try with cookies |
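The "wait, then try again" advice for 403 errors can be automated with a small retry wrapper. This is a sketch, not a yt-dlp feature: retry and RETRY_DELAY are hypothetical names, and the 600-second default simply mirrors the 10-minute wait suggested in the table.

```bash
# Run a command up to MAX times, sleeping between failed attempts.
# The delay starts at RETRY_DELAY seconds (default 600) and doubles each time.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  max="$1"; shift
  delay="${RETRY_DELAY:-600}"
  attempt=1
  while ! "$@"; do
    [ "$attempt" -ge "$max" ] && return 1
    echo "Attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# Usage: up to 3 attempts, waiting 10 and then 20 minutes between them.
# retry 3 yt-dlp "https://www.tiktok.com/@handle" --download-archive "./tiktok/downloaded.txt"
```

Combined with --download-archive, a retried run skips whatever the earlier attempt already finished.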
Debug Mode
# Verbose output to diagnose issues
yt-dlp -v "https://www.tiktok.com/@handle" 2>&1 | tee debug.log
Reference
Key Options
| Option | Description |
|---|---|
| -o TEMPLATE | Output filename template |
| -P PATH | Base download directory |
| --dateafter DATE | Videos on/after date (YYYYMMDD) |
| --datebefore DATE | Videos on/before date (YYYYMMDD) |
| --playlist-end N | Stop after N videos |
| --match-filters EXPR | Filter by metadata (views, duration, title) |
| --write-info-json | Save metadata JSON per video |
| --download-archive FILE | Track downloads, skip duplicates |
| --simulate / -s | Dry run, no download |
| -j / --dump-json | Output metadata as JSON, one object per line |
| --cookies-from-browser NAME | Use cookies from browser |
| --sleep-interval SEC | Wait between downloads (avoid rate limits) |
Output Template Variables
| Variable | Example Output |
|---|---|
| %(id)s | 7331234567890 |
| %(uploader)s | handle |
| %(upload_date)s | 20260215 |
| %(title).50s | First 50 chars of title |
| %(view_count)s | 1500000 |
| %(like_count)s | 250000 |
| %(ext)s | mp4 |
