Install
openclaw skills install url2md
The url2md skill converts HTML web pages at HTTP/HTTPS URLs to clean, readable Markdown files, with optional batch processing and formatting features.
Convert a single URL (prints to stdout):
python3 scripts/url2md.py https://example.com/article
Output to a file:
python3 scripts/url2md.py https://example.com/article -o article.md
Create a file with URLs (one per line):
https://example.com/article-1
https://example.com/article-2
https://example.com/article-3
Convert all and save to a directory:
python3 scripts/url2md.py -f urls.txt -d ./markdown_files/
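The batch workflow above can be sketched in a few lines of Python. This is only an illustration, not the code in scripts/url2md.py: the helper names `read_url_list` and `filename_for_url`, the skipping of `#` comment lines, and the filename scheme are all assumptions made for the example.

```python
import re
from urllib.parse import urlparse


def read_url_list(text):
    """Return the URLs in a one-per-line list, ignoring blank lines.

    Skipping lines that start with '#' is an assumption of this sketch.
    """
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def filename_for_url(url):
    """Derive a filesystem-safe .md filename from a URL (hypothetical scheme)."""
    parsed = urlparse(url)
    raw = (parsed.netloc + parsed.path).strip("/")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower()
    return (slug or "index") + ".md"
```

Each URL returned by `read_url_list` would then be converted and written to the output directory under the name produced by `filename_for_url`.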
Features:
Uses standard-library modules (urllib, html.parser)
Skips script/style/noscript/template, then prefers the first <article> or <main> (else <body>) so output focuses on primary content
Title taken from og:title / Twitter title when present, otherwise <title>, added as H1 when enabled
Metadata from <meta> tags and Schema.org JSON-LD for knowledge-base workflows
Template variables ({{title}}, {{content}}, {{author}}, {{published}}, {{date}}, etc.)
scripts/url2md.py
Usage:
url2md.py [url] [options]
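The content-focus behaviour described above (skip script/style/noscript/template, prefer the first <article> or <main>, fall back to the wider page) can be illustrated with html.parser. This is a minimal sketch, not the skill's actual implementation: the class `TextCollector` and function `extract_text` are hypothetical, the sketch returns plain text rather than Markdown, and it also skips <head> so that the fallback approximates <body>.

```python
from html.parser import HTMLParser

# Elements whose text is never emitted. "head" is an assumption of this
# sketch, added so the fallback text approximates the <body> content.
SKIPPED = {"script", "style", "noscript", "template", "head"}
PREFERRED = ("article", "main")


class TextCollector(HTMLParser):
    """Collect visible text, tracking the first <article>/<main> separately."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0       # >0 while inside a skipped element
        self.focus_tag = None     # tag name of the first preferred container
        self.focus_depth = 0      # >0 while inside that container
        self.focus_done = False   # True once the first container has closed
        self.focus_text = []
        self.body_text = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
        elif tag in PREFERRED and not self.focus_done:
            if self.focus_tag is None:
                self.focus_tag = tag
            if tag == self.focus_tag:
                self.focus_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.skip_depth:
            self.skip_depth -= 1
        elif tag == self.focus_tag and self.focus_depth:
            self.focus_depth -= 1
            if self.focus_depth == 0:
                self.focus_done = True

    def handle_data(self, data):
        text = data.strip()
        if not text or self.skip_depth:
            return
        self.body_text.append(text)
        if self.focus_depth:
            self.focus_text.append(text)


def extract_text(html, full_page=False):
    """Return focused text, or all visible text when full_page is set."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    if full_page or not parser.focus_text:
        return " ".join(parser.body_text)
    return " ".join(parser.focus_text)
```

The `full_page` flag mirrors the idea behind --full-page: it bypasses the <article>/<main> preference and keeps navigation, footers, and other page chrome.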
Options:
| Option | Description |
|---|---|
| url | Single URL to convert |
| -o, --output | Output file (default: stdout) |
| -f, --file | File containing URLs to convert |
| -d, --dir | Output directory for batch conversion |
| --no-title | Skip adding page title as H1 |
| --full-page | Parse full <body> instead of <article>/<main> first (more chrome, wider coverage) |
| --timeout | Request timeout in seconds (default: 30) |
| --frontmatter | Add YAML frontmatter with extracted metadata |
| -t, --template | Path to a template file for customizing output |
| --filename-template | Batch mode filename pattern (e.g. {{date}}-{{title}}.md) |
| --download-images | Download remote images to a local folder (e.g. assets) |
| -v, --version | Show version |
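
To make the --frontmatter option more concrete, here is a rough sketch of how extracted metadata could be rendered as a YAML frontmatter block. The function name `build_frontmatter`, the field order, and the choice to omit empty fields are assumptions for illustration; the exact output of url2md may differ.

```python
import json


def build_frontmatter(metadata):
    """Render a dict of metadata as a YAML frontmatter block."""
    lines = ["---"]
    for key, value in metadata.items():
        if value is None or value == "":
            continue  # omit fields that were not found on the page
        # json.dumps produces a double-quoted, escaped string, which is
        # also a valid YAML double-quoted scalar.
        lines.append(f"{key}: {json.dumps(str(value), ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n"
```

Quoting every value avoids surprises with titles that contain colons or quotation marks, at the cost of slightly noisier frontmatter.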
Examples:
# Single URL to stdout
python3 scripts/url2md.py https://docs.python.org/3
# Save to file
python3 scripts/url2md.py https://docs.python.org/3 -o python-docs.md
# Batch with custom timeout
python3 scripts/url2md.py -f urls.txt -d ./output/ --timeout 60
# Skip title
python3 scripts/url2md.py https://example.com --no-title
# Whole body (no article/main focus)
python3 scripts/url2md.py https://example.com/sitemap --full-page -o sitemap.md
# YAML frontmatter (great for Obsidian / PKM)
python3 scripts/url2md.py https://example.com/article --frontmatter -o article.md
# Custom template
python3 scripts/url2md.py https://example.com/article -t article.tpl -o article.md
# Batch with smart filenames
python3 scripts/url2md.py -f urls.txt -d ./output/ --filename-template "{{date}}-{{title}}.md"
# Download images locally
python3 scripts/url2md.py https://example.com/article -o article.md --download-images assets
python3 scripts/url2md.py -f urls.txt -d ./output/ --download-images assets
Template variables: {{title}}, {{content}}, {{url}}, {{source}}, {{author}}, {{published}}, {{description}}, {{category}}, {{site_name}}, {{date}}, {{datetime}}
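As a rough illustration of how {{name}} placeholders like these can be filled in, the sketch below substitutes values into a template string and then cleans the result for use as a filename. The helpers `render_template` and `safe_filename` are hypothetical, and replacing unknown variables with an empty string is an assumption rather than documented url2md behaviour.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template, values):
    """Replace {{name}} placeholders; unknown names become empty strings."""
    return PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), "")), template)


def safe_filename(name):
    """Replace whitespace and characters unsafe in filenames with hyphens."""
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "-", name).strip("-")
    return cleaned or "untitled"
```

The same substitution step serves both a full output template (for example `# {{title}}` followed by `{{content}}`) and a batch filename pattern such as `{{date}}-{{title}}.md`; only the filename needs the extra cleaning pass.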