Letterboxd Watchlist
v0.1.2
Scrape a public Letterboxd user's watchlist into a CSV/JSONL list of titles and film URLs without logging in. Use when a user asks to export, scrape, or mirror a Letterboxd watchlist, or to build watch-next queues.
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
The name/description match the included code: the script scrapes a public Letterboxd watchlist and outputs CSV/JSONL. One minor inconsistency: SKILL.md shows the usage as `uv run scripts/scrape_watchlist.py` but the skill metadata lists no required binaries. The script requires a Python 3 runtime (and the agent environment needs either a way to run Python scripts or the 'uv' runner referenced).
Instruction Scope
SKILL.md instructions stay on task: they instruct the agent to ask for the username, run the bundled scraper, and produce CSV/JSONL output. The docs explicitly limit scope and say not to read local folders or perform unrelated follow-ups. The scraper only fetches Letterboxd pages and writes a local output file.
Install Mechanism
No install spec is provided (instruction-only plus a script file). Nothing is downloaded at install time and no archives or external installers are referenced, which is low-risk.
Credentials
The skill requests no environment variables, credentials, or config paths. The script performs unauthenticated HTTP GETs to letterboxd.com only and writes a local file, which is proportionate to the stated purpose.
Persistence & Privilege
The skill does not request permanent presence (always: false) and does not modify other skills or system/global config. It runs only when invoked.
Assessment
This skill appears to do exactly what it says: a small Python scraper that fetches public Letterboxd watchlist pages and writes CSV/JSONL. Before installing or running it, consider:
(1) Ensure your agent environment can run the script (Python 3 is required; SKILL.md references `uv run`, which is not declared as a required binary).
(2) Review the script yourself: it performs network requests and writes files locally but does not exfiltrate data to third parties.
(3) Be mindful of scraping etiquette and of Letterboxd's terms of service and rate limits: the default max-pages=500 could generate many requests (the script has a 250 ms default delay and retry logic, but you may want to lower max-pages or increase the delay).
(4) The HTML parsing uses a regex and may break if Letterboxd changes its markup; that is a reliability concern, not a security concern.
If you are uncomfortable running arbitrary scripts, run it in an isolated environment or review/modify the code first.
Like a lobster shell, security has layers: review code before you run it.
latest
Letterboxd Watchlist Scraper
Use the bundled script to scrape a public Letterboxd watchlist (no auth). Always ask the user for the Letterboxd username if they did not provide one.
Script
scripts/scrape_watchlist.py
Basic usage
uv run scripts/scrape_watchlist.py <username> --out watchlist.csv
Robust mode (recommended)
uv run scripts/scrape_watchlist.py <username> --out watchlist.jsonl --delay-ms 300 --timeout 30 --retries 2
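To illustrate what the --timeout and --retries flags are for, here is a minimal sketch of a fetch helper with a per-request timeout and a bounded number of retries. It is not the bundled script's code: the function name `fetch_with_retries`, the `backoff_s` parameter, and the injectable `opener` argument are all hypothetical, chosen only to make the idea concrete and testable offline.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_with_retries(
    url: str,
    timeout: float = 30.0,
    retries: int = 2,
    backoff_s: float = 0.5,  # hypothetical knob, not a flag of the real script
    opener: Callable = urllib.request.urlopen,
) -> str:
    """Fetch a URL, retrying up to `retries` extra times on network errors."""
    last_error: Optional[Exception] = None
    for attempt in range(retries + 1):
        try:
            # `timeout` bounds how long a single attempt may block.
            with opener(url, timeout=timeout) as response:
                return response.read().decode("utf-8", errors="replace")
        except (urllib.error.URLError, TimeoutError) as exc:
            last_error = exc
            if attempt < retries:
                # Wait a little longer before each subsequent attempt.
                time.sleep(backoff_s * (attempt + 1))
    raise RuntimeError(
        f"giving up on {url} after {retries + 1} attempts"
    ) from last_error
```

With `retries=2` a request is attempted at most three times. Note that this sketch retries every `URLError`, including HTTP errors such as 404; a real scraper might prefer to fail fast on those.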
Output formats
- --out *.csv → title,link
- --out *.jsonl → one JSON object per line: { "title": "…", "link": "…" }
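For downstream use (for example, building a watch-next queue), either output file can be loaded back into a list of records. The sketch below is an illustration, not part of the skill: `load_watchlist` is a hypothetical helper, and it assumes the CSV file starts with a `title,link` header row, which the description above suggests but does not state outright.

```python
import csv
import json
from pathlib import Path


def load_watchlist(path) -> list:
    """Load a watchlist export into a list of {"title", "link"} dicts.

    Assumption: CSV files begin with a `title,link` header row.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".jsonl":
        with path.open(encoding="utf-8") as fh:
            # One JSON object per non-empty line.
            return [json.loads(line) for line in fh if line.strip()]
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as fh:
            return [dict(row) for row in csv.DictReader(fh)]
    raise ValueError(f"unsupported watchlist format: {path.suffix!r}")
```

Usage would look like `films = load_watchlist("watchlist.jsonl")`, after which each entry exposes `entry["title"]` and `entry["link"]`.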
Notes / gotchas
- Letterboxd usernames are case-insensitive, but must otherwise match exactly.
- The script scrapes paginated pages: /watchlist/page/<n>/.
- Stop condition: first page with no data-target-link="/film/..." poster entries.
- The scraper validates username format ([A-Za-z0-9_-]+) and uses retries + timeout.
- Default crawl delay is 250ms/page to be polite and reduce transient failures.
- This is best-effort HTML scraping; if Letterboxd changes markup, adjust the regex in the script.
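The notes above can be sketched as a short crawl loop. This is a rough illustration under stated assumptions, not the contents of scripts/scrape_watchlist.py: the names `page_url`, `extract_film_links`, and `crawl` are hypothetical, the exact regex is a guess built from the `data-target-link="/film/..."` attribute mentioned above, and the page fetcher is passed in as a function so the loop can be exercised without network access.

```python
import re
import time
from typing import Callable, Iterator, List

BASE_URL = "https://letterboxd.com"
# Username format as given in the notes.
USERNAME_RE = re.compile(r"[A-Za-z0-9_-]+")
# Assumed pattern for poster entries; adjust if Letterboxd changes its markup.
FILM_LINK_RE = re.compile(r'data-target-link="(/film/[^"]+)"')


def page_url(username: str, page: int) -> str:
    """Build the URL of one watchlist page after validating the username."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError(f"invalid Letterboxd username: {username!r}")
    return f"{BASE_URL}/{username}/watchlist/page/{page}/"


def extract_film_links(html: str) -> List[str]:
    """Return absolute film URLs for every poster entry found in the HTML."""
    return [BASE_URL + path for path in FILM_LINK_RE.findall(html)]


def crawl(
    username: str,
    fetch: Callable[[str], str],
    max_pages: int = 500,
    delay_ms: int = 250,
) -> Iterator[str]:
    """Yield film URLs page by page until a page has no poster entries."""
    for page in range(1, max_pages + 1):
        links = extract_film_links(fetch(page_url(username, page)))
        if not links:
            return  # stop condition: first page without film links
        yield from links
        time.sleep(delay_ms / 1000)  # polite pause before the next page
```

Passing `fetch` as a parameter keeps the pagination logic separate from the HTTP details (timeout, retries), so either part can be adjusted when the markup or the network behaviour changes.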
Scope boundary
- This skill only scrapes a public Letterboxd watchlist and writes CSV/JSONL output.
- Do not read local folders, scan libraries, or perform unrelated follow-up actions unless explicitly requested by the user.
