Sitemap Generator

Generate XML sitemaps by crawling a website. Use when a user needs to create a sitemap.xml for SEO, audit site structure, discover all pages on a domain, or...

MIT-0 · Free to use, modify, and redistribute. No attribution required.


by John Wang (@Johnnywang2001)

Security Scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

✓

Purpose & Capability

Name/description match the included script and SKILL.md. The Python crawler uses requests and BeautifulSoup as declared, only fetches same-domain HTML pages, skips binary resources, and outputs a sitemap — all expected for this purpose.
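The same-domain and binary-skip behavior described above can be illustrated with a small filter. This is a minimal sketch, not the script's actual code: the function name `should_crawl` and the `SKIP_EXTENSIONS` set are hypothetical, and the real script may use a different extension list or also check the response Content-Type.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical extension list; the actual script's list may differ.
SKIP_EXTENSIONS = {
    ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico",
    ".css", ".js", ".pdf", ".woff", ".woff2", ".ttf",
}


def should_crawl(url: str, start_url: str) -> bool:
    """Return True for http(s) URLs on the start URL's host that do not
    look like binary or static assets."""
    target = urlparse(url)
    origin = urlparse(start_url)
    if target.scheme not in ("http", "https"):
        return False  # skips mailto:, javascript:, tel:, and similar links
    if target.netloc.lower() != origin.netloc.lower():
        return False  # external link
    extension = posixpath.splitext(target.path)[1].lower()
    return extension not in SKIP_EXTENSIONS
```

Comparing `netloc` values treats `www.example.com` and `example.com` as different hosts, which is a conservative choice for a sketch like this.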

ℹ

Instruction Scope

Instructions are limited to running the provided script with options. The script issues HTTP requests to the target URL(s) and writes an output file; it does not read other system files or environment variables. Notes: it does not check robots.txt (so may crawl pages a site disallows), and it will crawl whatever URL you provide (including internal/private addresses if you pass them), so exercise caution about targets and permissions.
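Because the script checks neither robots.txt nor the target address, a caller who wants those guards would need to add them. The following is a rough sketch using only the Python standard library. The helper names `allowed_by_robots` and `resolves_to_private_address` are hypothetical and are not part of the skill.

```python
import ipaddress
import socket
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against the text of a robots.txt file (already fetched)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def resolves_to_private_address(url: str) -> bool:
    """Return True if the URL's host is, or resolves to, a private,
    loopback, or link-local address. Unresolvable hosts count as unsafe."""
    host = urlparse(url).hostname
    if host is None:
        return True
    try:
        addresses = [ipaddress.ip_address(host)]  # host is an IP literal
    except ValueError:
        try:
            infos = socket.getaddrinfo(host, None)  # DNS lookup
        except socket.gaierror:
            return True
        # Drop any IPv6 scope suffix such as "%eth0" before parsing.
        addresses = [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]
    return any(a.is_private or a.is_loopback or a.is_link_local for a in addresses)
```

A wrapper could call both helpers before handing a start URL to the script, fetching `/robots.txt` from the target host first. Note that a DNS check made before the crawl does not protect against the name resolving differently later.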

✓

Install Mechanism

No install spec; the skill is instruction+script only. Dependencies are standard pip packages (requests, beautifulsoup4) and are declared in SKILL.md. Nothing is downloaded from arbitrary URLs or installed silently.

✓

Credentials

The skill requests no environment variables or credentials. The script operates with only network access to the user-specified target and local filesystem write access for the output file.

✓

Persistence & Privilege

No special persistence is requested (always:false). The skill does not modify other skills or system config. It runs only when invoked.

Assessment

This skill appears to be what it claims: a local Python crawler that generates sitemap.xml. Before using it, ensure you have permission to crawl the target site (and respect robots.txt even though the script doesn't), avoid pointing it at internal/private URLs you don't want probed, and be careful with the output path (it will overwrite files). Install the declared pip dependencies in a controlled environment. If you need robots.txt compliance or more aggressive rate-limiting/URL canonicalization, review or modify the script before running.
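As one illustration of the URL canonicalization mentioned above, the sketch below normalizes URLs so that trivially different forms of the same page collapse to one entry. It is an assumption about what such a step could look like, not a description of the script; the function name `canonicalize` is hypothetical, and it deliberately drops fragments and any userinfo.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url: str) -> str:
    """Lowercase scheme and host, drop default ports and fragments,
    and use "/" for an empty path."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if ":" in host:
        host = f"[{host}]"  # restore brackets around IPv6 literals
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

Query strings are kept as-is because reordering or stripping parameters can change which page a URL refers to.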


Current version: v1.0.0

latest: vk97146y762488t0hy501k7xa7d82qppv

License

MIT-0 · Terms: https://spdx.org/licenses/MIT-0.html

SKILL.md

Sitemap Generator

Crawl any website and produce a standards-compliant XML sitemap ready for search engine submission.

Quick Start

python3 scripts/sitemap_gen.py https://example.com

Output: sitemap.xml in the current directory.

Commands

# Basic — crawl and write sitemap.xml
python3 scripts/sitemap_gen.py https://example.com

# Custom output path
python3 scripts/sitemap_gen.py https://example.com -o /tmp/sitemap.xml

# Limit crawl scope
python3 scripts/sitemap_gen.py https://example.com --max-pages 500 --max-depth 3

# Polite crawling with delay
python3 scripts/sitemap_gen.py https://example.com --delay 1.0

# Set SEO hints
python3 scripts/sitemap_gen.py https://example.com --changefreq daily --priority 0.8

# Verbose progress
python3 scripts/sitemap_gen.py https://example.com -v

# Write the sitemap to stdout instead of a file
python3 scripts/sitemap_gen.py https://example.com -o -

Options

| Flag | Default | Description |
| --- | --- | --- |
| `--output`, `-o` | `sitemap.xml` | Output file path (use `-` for stdout) |
| `--max-pages` | `200` | Maximum pages to crawl |
| `--max-depth` | `5` | Maximum link depth from start URL |
| `--delay` | `0.2` | Seconds between requests |
| `--timeout` | `10` | Request timeout in seconds |
| `--changefreq` | `weekly` | Sitemap changefreq hint |
| `--priority` | `0.5` | Sitemap priority hint (0.0–1.0) |
| `--verbose`, `-v` | off | Print crawl progress to stderr |
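For readers who want to see how these options map onto a command-line parser, here is a sketch using `argparse` with the flags and defaults from the table. It is an assumption about how such a parser could be declared, not a copy of `scripts/sitemap_gen.py`; the function name `build_parser` and the `choices` restriction on `--changefreq` (taken from the values the sitemaps.org protocol allows) are illustrative.

```python
import argparse

# Values permitted for <changefreq> by the sitemaps.org protocol.
CHANGEFREQ_VALUES = ["always", "hourly", "daily", "weekly", "monthly", "yearly", "never"]


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags and defaults listed in the options table."""
    parser = argparse.ArgumentParser(description="Crawl a site and write an XML sitemap.")
    parser.add_argument("url", help="Start URL to crawl")
    parser.add_argument("--output", "-o", default="sitemap.xml",
                        help="Output file path (use - for stdout)")
    parser.add_argument("--max-pages", type=int, default=200)
    parser.add_argument("--max-depth", type=int, default=5)
    parser.add_argument("--delay", type=float, default=0.2,
                        help="Seconds between requests")
    parser.add_argument("--timeout", type=float, default=10,
                        help="Request timeout in seconds")
    parser.add_argument("--changefreq", default="weekly", choices=CHANGEFREQ_VALUES)
    parser.add_argument("--priority", type=float, default=0.5)
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser
```

With this sketch, `build_parser().parse_args(["https://example.com", "-o", "-"])` yields `output == "-"`, which a caller could treat as a request to write to stdout.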

Dependencies

pip install requests beautifulsoup4

Notes

Only crawls same-domain pages (no external links)
Skips binary files (images, CSS, JS, PDFs, fonts)
Respects the delay setting to avoid overwhelming servers
Output conforms to the sitemaps.org 0.9 protocol
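To show what output in the sitemaps.org 0.9 format looks like, the sketch below builds a sitemap document from a list of URLs with `xml.etree.ElementTree`. The function name `build_sitemap` is hypothetical, and the actual script may emit additional elements (for example `lastmod`) or format the file differently.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, changefreq: str = "weekly", priority: float = 0.5) -> str:
    """Return a sitemap XML document listing each unique URL once, sorted."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # ElementTree escapes & and <
        ET.SubElement(entry, "changefreq").text = changefreq
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Letting the XML library handle escaping matters here, since URLs with query strings often contain `&`, which must appear as `&amp;` in the file.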
