Install
openclaw skills install reddit-archive
Download and archive Reddit posts, including images, GIFs, and videos, from specified users or subreddits, with filtering and sorting options.
This script automatically checks for its dependencies (requests and yt-dlp) on first run.
If any are missing, it attempts to install them via pip install --user. You can also install them manually:
pip3 install requests yt-dlp
To use a yt-dlp binary at a custom location, set YTDLP_PATH:
export YTDLP_PATH=/your/custom/path/yt-dlp
As of mid-2026, downloading v.redd.it videos requires an authenticated
Reddit session — yt-dlp's Reddit extractor reads cookies from your
browser to satisfy this. Stay logged into Reddit in Safari (or
another browser, see below) and the script handles it automatically.
The default browser is safari (macOS default). To use a different one, run export REDDIT_COOKIES_BROWSER=chrome (or firefox,
brave, edge, vivaldi). Set it to none to skip cookie loading
if you don't need Reddit videos. Without a logged-in browser session, v.redd.it posts will fail with an
"Account authentication is required" error.
Use this skill when you want to archive content from Reddit, either from a specific user (u/username) or a subreddit (r/subname).
python3 ~/path/to/reddit_archive.py [options]
| Flag | Description | Default |
|---|---|---|
| -u, --user | Reddit username (either this OR --subreddit required) | none |
| -s, --subreddit | Subreddit name (either this OR --user required) | none |
| -o, --output | Output directory | ~/temp/.reddit_<target> |
| --sort | Sort order: hot, new, rising, top, controversial | hot |
| --time | Time filter for top/controversial: hour, day, week, month, year, all | none |
| --after | Start date (YYYY-MM-DD) | No filter |
| --before | End date (YYYY-MM-DD) | No filter |
| --limit | Max posts to fetch (0 = unlimited) | 0 |
| --images | Download images (jpg, png, webp) | ✓ |
| --gifs | Download GIFs/videos (gfycat, redgifs, imgur) | ✓ |
| --skip-existing | Skip already-downloaded files | ✓ |
| --workers | Parallel download workers | 4 |
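For readers who want to see how the flag table might map onto a parser, here is a minimal argparse sketch. It is an illustration, not the script's real parser: build_parser and parse_args are hypothetical names, and it assumes that passing neither --images nor --gifs enables both. --skip-existing is left out because the table does not say how it is turned off.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="reddit_archive.py")

    # Exactly one of --user / --subreddit is required.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("-u", "--user", help="Reddit username")
    target.add_argument("-s", "--subreddit", help="Subreddit name")

    parser.add_argument("-o", "--output", type=Path, default=None)
    parser.add_argument(
        "--sort",
        choices=["hot", "new", "rising", "top", "controversial"],
        default="hot",
    )
    parser.add_argument(
        "--time",
        choices=["hour", "day", "week", "month", "year", "all"],
        default=None,
    )
    parser.add_argument("--after", help="Start date (YYYY-MM-DD)")
    parser.add_argument("--before", help="End date (YYYY-MM-DD)")
    parser.add_argument("--limit", type=int, default=0)
    parser.add_argument("--images", action="store_true")
    parser.add_argument("--gifs", action="store_true")
    parser.add_argument("--workers", type=int, default=4)
    return parser


def parse_args(argv=None):
    args = build_parser().parse_args(argv)

    # Assumption: with neither media flag given, download both kinds.
    if not (args.images or args.gifs):
        args.images = args.gifs = True

    # Documented default output directory: ~/temp/.reddit_<target>
    if args.output is None:
        target = args.user or args.subreddit
        args.output = Path.home() / "temp" / f".reddit_{target}"
    return args
```

Under these assumptions, parse_args(["-u", "someguy", "--gifs"]) yields gifs enabled and images disabled, matching the "GIFs only" example.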
# All posts from a user
python3 reddit_archive.py -u someuser
# Subreddit with date range
python3 reddit_archive.py -s orlando --after 2025-01-01 --before 2025-12-31
# Top 10 most upvoted posts of all time from a subreddit
python3 reddit_archive.py -s funny --sort top --time all --limit 10
# New posts only
python3 reddit_archive.py -s orlando --sort new
# GIFs only, specific user
python3 reddit_archive.py -u someguy --gifs
# Custom output dir
python3 reddit_archive.py -u someuser -o ~/Downloads/reddit_archive
Downloads are saved to the output directory with the following structure:
output_directory/
├── Pictures/
│   ├── {target}_{post_id}.jpg
│   ├── {target}_{post_id}.png
│   └── ...
└── Videos/
    ├── {target}_{post_id}.mp4
    └── ...
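The layout above can be expressed as a small path-building helper. This is a sketch under stated assumptions: output_path is a hypothetical name, .jpeg is treated like .jpg, and any extension not shown in the tree or the flag table raises an error, since the text does not say where such files go.

```python
from pathlib import Path

# Extensions named in the documentation (.jpeg added as an assumption).
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}
VIDEO_EXTS = {".mp4"}


def output_path(output_dir, target, post_id, ext):
    """Return <output_dir>/<Pictures|Videos>/<target>_<post_id><ext>."""
    ext = ext.lower()
    if not ext.startswith("."):
        ext = "." + ext

    if ext in IMAGE_EXTS:
        folder = "Pictures"
    elif ext in VIDEO_EXTS:
        folder = "Videos"
    else:
        # The documented tree does not cover other extensions.
        raise ValueError(f"no documented folder for extension {ext!r}")

    return Path(output_dir) / folder / f"{target}_{post_id}{ext}"
```

Calling output_path("out", "funny", "abc123", "jpg") builds a path under out/Pictures/ named funny_abc123.jpg.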
The skill is organized as:
reddit-archive/
├── SKILL.md                  ← This file
└── scripts/
    ├── reddit_archive.py     ← Main downloader script
    └── requirements.txt      ← Python dependencies
How the script works:
- Sends an over18 cookie so NSFW subreddits don't return an interstitial.
- Scrapes old.reddit.com HTML listings (old.reddit.com/r/<name>/<sort>/ or old.reddit.com/user/<name>/submitted/). Reddit's anonymous JSON API started returning 403 plus an anti-bot HTML page in mid-2026, and the self-serve OAuth flow is gated behind a Responsible Builder Policy approval. old.reddit's server-rendered listings still work and embed the same metadata in <div class="thing" data-*> attributes (schema stable since ~2010).
- Paginates with the after=t3_<id> cursor extracted from the page's next › button rather than a JSON after field.
- For galleries, the listing HTML includes signed preview.redd.it/<id>.<ext> URLs for each gallery item inline. Each image is also available unsigned at i.redd.it/<id>.<ext> (full resolution, no expiry), which is what we download.
- Downloads v.redd.it videos through yt-dlp with --cookies-from-browser (HTML scraping doesn't expose the DASH manifest URL the way the old JSON API did, and yt-dlp's Reddit extractor in 2026 needs an authenticated session to fetch the manifest itself).
- Hosted videos and GIFs go through yt-dlp (redgifs, gfycat, v.redd.it); direct images and direct mp4/gif URLs are streamed via requests.
- Date filters (--after, --before) are applied to each post's creation time (created_utc, which we derive from data-timestamp).
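To make the HTML-scraping approach concrete, here is a minimal sketch using only the standard library. It is not the script's actual parser. The names ThingParser, parse_listing, and unsigned_image_url are hypothetical, and the sketch assumes that data-timestamp holds milliseconds since the epoch.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ThingParser(HTMLParser):
    """Collect data-* attributes from <div class="thing"> elements."""

    def __init__(self):
        super().__init__()
        self.posts = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if "thing" not in classes:
            return

        # Strip the "data-" prefix from each attribute name.
        data = {k[5:]: v for k, v in attr.items() if k.startswith("data-")}

        # Assumption: data-timestamp is in milliseconds since the epoch.
        if data.get("timestamp"):
            data["created_utc"] = int(data["timestamp"]) // 1000
        self.posts.append(data)


def parse_listing(html):
    """Return one dict of data-* values per post found in the listing HTML."""
    parser = ThingParser()
    parser.feed(html)
    return parser.posts


def unsigned_image_url(preview_url):
    """Map preview.redd.it/<id>.<ext>?... to i.redd.it/<id>.<ext>."""
    parts = urlsplit(preview_url)
    if parts.netloc != "preview.redd.it":
        return preview_url
    # Dropping the query string removes the signature and size parameters.
    return f"https://i.redd.it{parts.path}"
```

Feeding a listing page's HTML to parse_listing gives one dictionary per post, and the created_utc value can then be compared against the --after and --before dates.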