Install
openclaw skills install gmail-link-archiver
Archive web content from your email links. This skill connects to Gmail via IMAP, filters emails in a specified mailbox by a subject prefix keyword, crawls the links found in those emails using Playwright (headless Chromium, to bypass bot detection), converts the pages to Markdown, and saves them to your OpenClaw workspace. Use it when you want to archive web content from email links, save newsletter links as Markdown, or crawl URLs from filtered emails.
bash references/setup.sh
This automatically installs:
- playwright (Python) + Chromium browser binary
- html2text for HTML→Markdown conversion

python3 references/gmail_link_archiver.py
The first run will prompt you for:
| Setting | Description | Default |
|---|---|---|
| IMAP server | Gmail IMAP host | imap.gmail.com |
| IMAP port | SSL port | 993 |
| Gmail address | Your full email address | — |
| App password | Gmail App Password (NOT your regular password) | — |
| Default mailbox | IMAP folder to search | INBOX |
| Subject prefix | Filter emails whose subject starts with this | — |
| Workspace path | Where to save Markdown files | ~/openclaw-workspace/mail-archive |
Credentials are saved locally to ~/.config/gmail-link-archiver/config.json with 0600 permissions. They are sent only to the configured IMAP server over SSL to log in, and are never logged or sent anywhere else.
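As a rough illustration of what saving a config file with owner-only permissions can look like, here is a minimal sketch. The function name `save_config`, the config keys in the test data, and the 0700 mode on the parent directory are assumptions for illustration, not taken from the script itself.

```python
import json
import os
from pathlib import Path


def save_config(config: dict, path: Path) -> None:
    """Write config as JSON, readable and writable by the owner only (hypothetical helper)."""
    # Create the parent directory as 0700 if it does not exist yet (assumed mode).
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # Create the file as 0600 from the start so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
    # Tighten the mode in case the file already existed with looser permissions.
    os.chmod(path, 0o600)
```

Opening the file with the restrictive mode up front, instead of writing first and running chmod afterwards, avoids a short window in which other local users could read the app password.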
Gmail App Password: Generate an App Password at https://myaccount.google.com/apppasswords (requires 2-Step Verification to be enabled on the account).
After the first setup, subsequent runs will read credentials from the saved config:
# Use saved config defaults
python3 references/gmail_link_archiver.py
# Override mailbox and prefix on the fly
python3 references/gmail_link_archiver.py --mailbox "INBOX" --subject-prefix "[Newsletter]"
# Save to a different workspace
python3 references/gmail_link_archiver.py --workspace ~/my-archive
# Limit number of links to crawl
python3 references/gmail_link_archiver.py --max-links 10
# Re-run the setup interview
python3 references/gmail_link_archiver.py --reconfigure
Gmail IMAP ──► Filter by Subject ──► Extract Links
                                           │
                                           ▼
                           Playwright + Chromium (headless)
                                           │
                                           ▼
                              HTML → Markdown (html2text)
                                           │
                                           ▼
                              Save to OpenClaw Workspace
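The first two stages of the flow above (subject filtering and link extraction) can be sketched with the standard library alone. This is an illustration under assumptions, not the script's actual code: the names `subject_matches` and `extract_links` are hypothetical, and it assumes links are taken from `<a href>` tags in the HTML body. The prefix check is done client-side because an IMAP `SEARCH SUBJECT` query matches substrings anywhere in the subject, not only at the start.

```python
from email.header import decode_header, make_header
from html.parser import HTMLParser


def subject_matches(raw_subject: str, prefix: str) -> bool:
    """True if the decoded Subject header starts with prefix (hypothetical helper)."""
    # Subjects may be RFC 2047 encoded ("=?utf-8?b?...?="), so decode before comparing.
    decoded = str(make_header(decode_header(raw_subject)))
    return decoded.strip().startswith(prefix)


class _LinkCollector(HTMLParser):
    """Collects href values from <a> tags in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value.strip())


def extract_links(html_body: str, max_links: int = 50) -> list:
    """Return unique http(s) links from an HTML body, capped at max_links."""
    collector = _LinkCollector()
    collector.feed(html_body)
    seen = set()
    unique = []
    for url in collector.links:
        # Skip mailto:, relative, and already-seen links.
        if url.startswith(("http://", "https://")) and url not in seen:
            seen.add(url)
            unique.append(url)
    return unique[:max_links]
```

Deduplicating before applying the cap means repeated links in a newsletter footer do not use up the `--max-links` budget.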
usage: gmail_link_archiver.py [-h] [--mailbox MAILBOX]
[--subject-prefix PREFIX]
[--workspace PATH]
[--max-links N]
[--reconfigure]
Options:
--mailbox, -m IMAP mailbox to search (default: from config)
--subject-prefix, -s Subject prefix to filter emails
--workspace, -w Directory to save Markdown files
--max-links Max number of links to crawl (default: 50)
--reconfigure Re-run the setup interview
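For readers who want to see how an interface like this maps onto `argparse`, here is a minimal sketch that mirrors the options listed above. The function names `build_parser` and `merge_with_config` and the config key names are hypothetical, and the rule that command-line values override saved config is inferred from the usage examples, not confirmed from the script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="gmail_link_archiver.py")
    parser.add_argument("--mailbox", "-m", metavar="MAILBOX", default=None,
                        help="IMAP mailbox to search (default: from config)")
    parser.add_argument("--subject-prefix", "-s", metavar="PREFIX", default=None,
                        help="Subject prefix to filter emails")
    parser.add_argument("--workspace", "-w", metavar="PATH", default=None,
                        help="Directory to save Markdown files")
    parser.add_argument("--max-links", metavar="N", type=int, default=50,
                        help="Max number of links to crawl (default: 50)")
    parser.add_argument("--reconfigure", action="store_true",
                        help="Re-run the setup interview")
    return parser


def merge_with_config(args: argparse.Namespace, config: dict) -> dict:
    """Command-line values win; options left unset fall back to the saved config."""
    overrides = {key: value for key, value in vars(args).items()
                 if value is not None and key != "reconfigure"}
    return {**config, **overrides}
```

Leaving the defaults of `--mailbox`, `--subject-prefix`, and `--workspace` as `None` is what lets the merge step tell "not given on the command line" apart from an explicit value.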
Each crawled page is saved as a Markdown file with YAML frontmatter:
---
source: https://example.com/article
crawled_at: 2026-03-27T12:00:00Z
---
# Article Title
Article content converted to clean Markdown...
Files are named using a sanitized version of the URL plus a short hash for uniqueness.
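One way to produce such a file name and frontmatter is sketched below. The exact sanitizing rule, the choice of SHA-256, the 8-character hash length, and the names `filename_for` and `render_page` are all assumptions for illustration; the script may use a different scheme.

```python
import hashlib
import re
from datetime import datetime, timezone
from urllib.parse import urlparse


def filename_for(url: str, max_len: int = 80) -> str:
    """Sanitized host and path plus a short hash of the full URL (assumed scheme)."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc}{parsed.path}"
    # Collapse anything that is not a letter or digit into single hyphens.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower()
    slug = slug[:max_len].rstrip("-") or "page"
    # Hash the whole URL so pages differing only by query string get distinct names.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{digest}.md"


def render_page(url: str, markdown_body: str, crawled_at: datetime) -> str:
    """Prepend YAML frontmatter in the layout shown above; crawled_at must be timezone-aware."""
    stamp = crawled_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"---\nsource: {url}\ncrawled_at: {stamp}\n---\n{markdown_body}\n"
```

The readable slug keeps files easy to find by eye, while the hash suffix prevents two different URLs from overwriting each other after sanitizing.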
Ask Claude to run the archiver:
"Run the Gmail Link Archiver to crawl links from my emails with subject starting with '[ReadLater]'"
Claude will execute:
python3 references/gmail_link_archiver.py --subject-prefix "[ReadLater]"
Or to set up fresh:
"Set up the Gmail Link Archiver with my credentials"
python3 references/gmail_link_archiver.py --reconfigure
"App password" rejected?
Playwright/Chromium issues?
# Reinstall Chromium
python3 -m playwright install chromium
# Install system dependencies (Linux)
sudo python3 -m playwright install-deps chromium
No emails found?
INBOX, [Gmail]/All Mail, etc.)
Permission denied on config file?
chmod 600 ~/.config/gmail-link-archiver/config.json
- ~/.config/gmail-link-archiver/config.json
- 0600 (owner read/write only)
- 0700 permissions