Sitemap Generator

v1.0.0

Generate XML sitemaps by crawling a website or scanning local files. Auto-discovers pages via link extraction. Supports local HTML/MD file scanning with last...

License: MIT-0

Install

openclaw skills install cm-sitemap-generator

Generate XML sitemaps by crawling a live website or by scanning a local directory of HTML, HTM, MD, and PHP files.

Crawl a Website

python3 scripts/sitemap_gen.py https://example.com
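
The crawl described here is a breadth-first, same-domain link walk. The following is a minimal sketch of that idea, not the code in scripts/sitemap_gen.py. The names `LinkParser` and `crawl`, and the injectable `fetch` callable, are assumptions made for illustration so the example can run without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=500):
    """Breadth-first crawl limited to the start URL's host.

    `fetch` is a hypothetical callable: it takes a URL and returns HTML
    text, or None if the page could not be retrieved.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}           # deduplication set
    queue = deque([start_url])   # FIFO queue gives breadth-first order
    pages = []
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

In a real run, `fetch` could wrap `urllib.request.urlopen` with the request timeout; a dictionary's `get` method is enough for offline experiments.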

Scan Local Files

python3 scripts/sitemap_gen.py --local ./public --base-url https://example.com
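
Local mode pairs each page file with a URL under the base URL and takes lastmod from the file's modification time. A minimal sketch of that mapping follows. The function name `scan_local`, the extension set, and the choice to keep file names unchanged in the URL are assumptions; the actual script may map paths differently (for example, index files).

```python
import os
from datetime import datetime, timezone

# Extensions treated as pages (assumed from the feature list).
PAGE_EXTENSIONS = {".html", ".htm", ".md", ".php"}


def scan_local(root, base_url):
    """Return sorted (url, lastmod) pairs for page files under `root`."""
    entries = []
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in PAGE_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            # Relative path with forward slashes becomes the URL path.
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            mtime = os.path.getmtime(path)
            # lastmod as a UTC date in YYYY-MM-DD form.
            lastmod = datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m-%d")
            entries.append((base_url.rstrip("/") + "/" + rel, lastmod))
    return sorted(entries)
```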

Save to File

# Save sitemap.xml
python3 scripts/sitemap_gen.py https://example.com --output sitemap.xml

# Save sitemap.xml + robots.txt
python3 scripts/sitemap_gen.py https://example.com --output sitemap.xml --robots
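
For orientation, a robots.txt with a User-agent line, an Allow rule, and a Sitemap reference could be produced as sketched below. The function name `build_robots_txt` and the exact directives are assumptions; the file the script writes may differ in detail.

```python
def build_robots_txt(sitemap_url):
    """Return robots.txt text that allows all crawlers and points at the sitemap."""
    lines = [
        "User-agent: *",
        "Allow: /",
        "",
        "Sitemap: " + sitemap_url,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt("https://example.com/sitemap.xml"), end="")
```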

Output Formats

# XML (default — valid sitemap.xml)
python3 scripts/sitemap_gen.py https://example.com

# Text (human-readable summary + XML)
python3 scripts/sitemap_gen.py https://example.com --format text

# JSON (pages list + XML string)
python3 scripts/sitemap_gen.py https://example.com --format json
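
The default XML output follows the sitemaps.org schema. The sketch below shows one way such a document can be assembled with proper escaping, using only the standard library. `build_sitemap` and its (url, lastmod) input shape are assumptions for illustration, not the script's internal API.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, lastmod) pairs; lastmod may be None."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="%s">' % SITEMAP_NS,
    ]
    for url, lastmod in entries:
        lines.append("  <url>")
        # escape() replaces &, < and > so query strings stay valid XML.
        lines.append("    <loc>%s</loc>" % escape(url))
        if lastmod:
            lines.append("    <lastmod>%s</lastmod>" % escape(lastmod))
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"
```

Escaping matters most for URLs with query strings: an unescaped `&` makes the whole file invalid XML.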

Options

| Flag | Default | Description |
| --- | --- | --- |
| --max-pages | 500 | Maximum pages to crawl |
| --timeout | 10 | Request timeout in seconds |
| --output / -o | stdout | Save sitemap.xml to file |
| --robots | off | Also generate robots.txt |
| --local | off | Scan local directory instead of crawling |
| --base-url | | Base URL for local mode (required) |
| --verbose / -v | off | Show crawl progress |
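
To make the defaults concrete, here is a hedged sketch of how these options could be declared with argparse. It is an illustration, not the parser in scripts/sitemap_gen.py: `build_parser` and `parse_args` are hypothetical names, the `--format` choices are taken from the Output Formats section, and the validation rules are assumptions.

```python
import argparse


def build_parser():
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(prog="sitemap_gen.py")
    parser.add_argument("url", nargs="?", help="site to crawl (omit with --local)")
    parser.add_argument("--max-pages", type=int, default=500)
    parser.add_argument("--timeout", type=int, default=10)
    parser.add_argument("--output", "-o", default=None)  # None means stdout
    parser.add_argument("--robots", action="store_true")
    parser.add_argument("--local", metavar="DIR", default=None)
    parser.add_argument("--base-url", default=None)
    parser.add_argument("--format", choices=["xml", "text", "json"], default="xml")
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser


def parse_args(argv):
    """Parse argv and enforce the assumed mode rules."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.local and not args.base_url:
        parser.error("--base-url is required with --local")
    if not args.local and not args.url:
        parser.error("a URL is required unless --local is given")
    return args
```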

Features

  • Crawl mode: BFS link discovery, same-domain only, deduplication
  • Local mode: Scan HTML/HTM/MD/PHP files, auto-detect lastmod from file mtime
  • Smart filtering: Skips images, CSS, JS, PDFs, archives, media files
  • URL normalization: Removes fragments, normalizes trailing slashes
  • robots.txt generation: User-agent + Allow + Sitemap reference
  • Valid XML: Proper XML escaping, sitemaps.org schema
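
The filtering and normalization features can be pictured with the short sketch below. It rests on assumptions: the extension list is a sample rather than the script's actual list, and stripping the trailing slash is one possible normalization (the script may instead add one). `normalize_url` and `is_page_url` are hypothetical helper names.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Sample of non-page extensions to skip (assumed, not exhaustive).
SKIP_EXTENSIONS = {
    ".png", ".jpg", ".jpeg", ".gif", ".svg", ".css", ".js",
    ".pdf", ".zip", ".gz", ".mp3", ".mp4",
}


def normalize_url(url):
    """Drop the fragment and any trailing slash; the root path stays "/"."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def is_page_url(url):
    """Return False for URLs whose path ends in a skipped extension."""
    ext = posixpath.splitext(urlsplit(url).path)[1].lower()
    return ext not in SKIP_EXTENSIONS
```

Normalizing before the deduplication check means that `/docs/`, `/docs`, and `/docs#intro` count as one page.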

Requirements

  • Python 3.6+
  • No external dependencies (stdlib only)

Version tags

latest: vk973ws1d0wyf1gsbnr308b4j3n85w6fv