Sitemap Generator

Generate XML sitemaps by crawling a website. Use when a user needs to create a sitemap.xml for SEO, audit site structure, discover all pages on a domain, or...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
by John Wang (@Johnnywang2001)
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the included script and SKILL.md. The Python crawler uses requests and BeautifulSoup as declared, only fetches same-domain HTML pages, skips binary resources, and outputs a sitemap — all expected for this purpose.
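The same-domain and binary-skip behaviour described above could look roughly like the following sketch. It is not taken from the script: the function names and the extension list are hypothetical, and the real crawler may compare hosts or detect binaries differently (for example, by checking the Content-Type header).

```python
from urllib.parse import urldefrag, urljoin, urlparse

# Hypothetical extension list; the actual script's list may differ.
SKIP_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".css", ".js",
                   ".pdf", ".woff", ".woff2", ".ttf")


def normalize_link(base_url: str, href: str) -> str:
    """Resolve href against the page it was found on and drop any #fragment."""
    absolute, _fragment = urldefrag(urljoin(base_url, href))
    return absolute


def should_crawl(start_url: str, candidate: str) -> bool:
    """True if candidate is http(s), on the start URL's host, and not an obvious binary."""
    start, cand = urlparse(start_url), urlparse(candidate)
    if cand.scheme not in ("http", "https"):
        return False
    if cand.netloc.lower() != start.netloc.lower():
        return False
    return not cand.path.lower().endswith(SKIP_EXTENSIONS)
```

Comparing the full host means www.example.com and example.com count as different sites in this sketch; whether the script treats them as one domain is not stated.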
Instruction Scope
Instructions are limited to running the provided script with options. The script issues HTTP requests to the target URL(s) and writes an output file; it does not read other system files or environment variables. Notes: it does not check robots.txt (so may crawl pages a site disallows), and it will crawl whatever URL you provide (including internal/private addresses if you pass them), so exercise caution about targets and permissions.
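Because the script crawls whatever URL it is given, one possible precaution is to reject targets that resolve to private or loopback addresses before starting. The sketch below is an assumption about how such a guard might be written with the standard library; the function names are hypothetical and nothing like this is part of the skill.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_ip(address: str) -> bool:
    """True only for globally routable addresses (not loopback, private, link-local)."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_global


def resolves_to_public_host(url: str) -> bool:
    """True if every address the URL's host resolves to is publicly routable."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False  # unresolvable hosts are treated as unsafe in this sketch
    return bool(infos) and all(is_public_ip(info[4][0]) for info in infos)
```

A check like this only covers the start URL at the moment it is resolved; redirects and later DNS changes would need separate handling.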
Install Mechanism
No install spec; the skill is instruction+script only. Dependencies are standard pip packages (requests, beautifulsoup4) and are declared in SKILL.md. Nothing is downloaded from arbitrary URLs or installed silently.
Credentials
The skill requests no environment variables or credentials. The script operates with only network access to the user-specified target and local filesystem write access for the output file.
Persistence & Privilege
No special persistence is requested (always:false). The skill does not modify other skills or system config. It runs only when invoked.
Assessment
This skill appears to be what it claims: a local Python crawler that generates sitemap.xml. Before using it, ensure you have permission to crawl the target site (and respect robots.txt even though the script doesn't), avoid pointing it at internal/private URLs you don't want probed, and be careful with the output path (it will overwrite files). Install the declared pip dependencies in a controlled environment. If you need robots.txt compliance or more aggressive rate-limiting/URL canonicalization, review or modify the script before running.
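If robots.txt compliance is needed, one way to add it is Python's standard urllib.robotparser. The sketch below parses robots.txt text that has already been fetched; the helper name and the "sitemap-gen" user agent string are hypothetical, and wiring the check into the crawl loop is left to whoever modifies the script.

```python
from urllib.robotparser import RobotFileParser


def build_robot_rules(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file into a reusable rule set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Example rule set: everything under /private/ is off limits to all agents.
rules = build_robot_rules("User-agent: *\nDisallow: /private/\n")
allowed = rules.can_fetch("sitemap-gen", "https://example.com/about")
blocked = rules.can_fetch("sitemap-gen", "https://example.com/private/page")
```

In a crawler, can_fetch would be called before each request and any URL it rejects would be skipped.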


Current version: v1.0.0
latestvk97146y762488t0hy501k7xa7d82qppv

SKILL.md

Sitemap Generator

Crawl any website and produce a standards-compliant XML sitemap ready for search engine submission.

Quick Start

python3 scripts/sitemap_gen.py https://example.com

Output: sitemap.xml in the current directory.

Commands

# Basic — crawl and write sitemap.xml
python3 scripts/sitemap_gen.py https://example.com

# Custom output path
python3 scripts/sitemap_gen.py https://example.com -o /tmp/sitemap.xml

# Limit crawl scope
python3 scripts/sitemap_gen.py https://example.com --max-pages 500 --max-depth 3

# Polite crawling with delay
python3 scripts/sitemap_gen.py https://example.com --delay 1.0

# Set SEO hints
python3 scripts/sitemap_gen.py https://example.com --changefreq daily --priority 0.8

# Verbose progress
python3 scripts/sitemap_gen.py https://example.com -v

# Pipe to stdout
python3 scripts/sitemap_gen.py https://example.com -o -

Options

Flag            Default       Description
--output, -o    sitemap.xml   Output file path (use - for stdout)
--max-pages     200           Maximum pages to crawl
--max-depth     5             Maximum link depth from start URL
--delay         0.2           Seconds between requests
--timeout       10            Request timeout in seconds
--changefreq    weekly        Sitemap changefreq hint
--priority      0.5           Sitemap priority hint (0.0–1.0)
--verbose, -v   off           Print crawl progress to stderr
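
For readers who want to see how a command line with these options could be declared, here is a hypothetical argparse sketch. The flag names and defaults come from the table above; the build_parser name, the help strings, and the changefreq choices (the values allowed by the sitemaps.org protocol) are assumptions, and scripts/sitemap_gen.py may define its parser differently.

```python
import argparse

# changefreq values permitted by the sitemaps.org protocol.
CHANGEFREQ_VALUES = ["always", "hourly", "daily", "weekly", "monthly", "yearly", "never"]


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(description="Crawl a site and write an XML sitemap.")
    parser.add_argument("url", help="Start URL to crawl")
    parser.add_argument("--output", "-o", default="sitemap.xml",
                        help="Output file path (use - for stdout)")
    parser.add_argument("--max-pages", type=int, default=200)
    parser.add_argument("--max-depth", type=int, default=5)
    parser.add_argument("--delay", type=float, default=0.2)
    parser.add_argument("--timeout", type=float, default=10)
    parser.add_argument("--changefreq", choices=CHANGEFREQ_VALUES, default="weekly")
    parser.add_argument("--priority", type=float, default=0.5)
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser


args = build_parser().parse_args(["https://example.com", "--max-pages", "500", "-o", "-"])
```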

Dependencies

pip install requests beautifulsoup4

Notes

  • Only crawls same-domain pages (no external links)
  • Skips binary files (images, CSS, JS, PDFs, fonts)
  • Respects the delay setting to avoid overwhelming servers
  • Output conforms to the sitemaps.org 0.9 protocol
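
To make the output format concrete, the following sketch builds a sitemaps.org 0.9 document with the standard library. It is an illustration only: build_sitemap is a hypothetical name, the script's own serialization code is not shown in this listing, and details such as sorting, de-duplication, and whether the script applies one changefreq and priority to every URL are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace required on <urlset> by the sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, changefreq="weekly", priority=0.5) -> bytes:
    """Return a UTF-8 sitemap with one <url> entry per unique URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text.
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "changefreq").text = changefreq
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

Calling build_sitemap(["https://example.com/", "https://example.com/about"]) returns bytes that can be written directly to sitemap.xml or to stdout.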
