Sitemap Generator
Generate XML sitemaps by crawling a website. Use when a user needs to create a sitemap.xml for SEO, audit site structure, discover all pages on a domain, or...
MIT-0 · Free to use, modify, and redistribute. No attribution required.
by John Wang (@Johnnywang2001)
Security Scan (OpenClaw): Benign
Purpose & Capability (high confidence)
Name/description match the included script and SKILL.md. The Python crawler uses requests and BeautifulSoup as declared, only fetches same-domain HTML pages, skips binary resources, and outputs a sitemap — all expected for this purpose.
Instruction Scope
Instructions are limited to running the provided script with options. The script issues HTTP requests to the target URL(s) and writes an output file; it does not read other system files or environment variables. Notes: it does not check robots.txt (so may crawl pages a site disallows), and it will crawl whatever URL you provide (including internal/private addresses if you pass them), so exercise caution about targets and permissions.
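The scan notes the script skips robots.txt entirely. If you want a pre-flight check before crawling, Python's standard library ships `urllib.robotparser`; the sketch below uses a hypothetical `is_allowed` helper that is not part of this skill:

```python
from urllib import robotparser

def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Hypothetical helper: parse robots.txt text and test one URL.

    Not part of the skill's script; shown only as a possible pre-check.
    """
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

rules = """User-agent: *
Disallow: /private/
"""
print(is_allowed(rules, "https://example.com/private/page"))  # False
print(is_allowed(rules, "https://example.com/public"))        # True
```

In practice you would fetch `https://<host>/robots.txt` yourself and feed its body to the helper before starting the crawl.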
Install Mechanism
No install spec; the skill is instruction+script only. Dependencies are standard pip packages (requests, beautifulsoup4) and are declared in SKILL.md. Nothing is downloaded from arbitrary URLs or installed silently.
Credentials
The skill requests no environment variables or credentials. The script operates with only network access to the user-specified target and local filesystem write access for the output file.
Persistence & Privilege
No special persistence is requested (always:false). The skill does not modify other skills or system config. It runs only when invoked.
Assessment
This skill appears to be what it claims: a local Python crawler that generates sitemap.xml. Before using it, ensure you have permission to crawl the target site (and respect robots.txt even though the script doesn't), avoid pointing it at internal/private URLs you don't want probed, and be careful with the output path (it will overwrite files). Install the declared pip dependencies in a controlled environment. If you need robots.txt compliance or more aggressive rate-limiting/URL canonicalization, review or modify the script before running.
Current version: v1.0.0
SKILL.md
Sitemap Generator
Crawl any website and produce a standards-compliant XML sitemap ready for search engine submission.
Quick Start
```bash
python3 scripts/sitemap_gen.py https://example.com
```
Output: sitemap.xml in the current directory.
Commands
```bash
# Basic: crawl and write sitemap.xml
python3 scripts/sitemap_gen.py https://example.com

# Custom output path
python3 scripts/sitemap_gen.py https://example.com -o /tmp/sitemap.xml

# Limit crawl scope
python3 scripts/sitemap_gen.py https://example.com --max-pages 500 --max-depth 3

# Polite crawling with delay
python3 scripts/sitemap_gen.py https://example.com --delay 1.0

# Set SEO hints
python3 scripts/sitemap_gen.py https://example.com --changefreq daily --priority 0.8

# Verbose progress
python3 scripts/sitemap_gen.py https://example.com -v

# Pipe to stdout
python3 scripts/sitemap_gen.py https://example.com -o -
```
Options
| Flag | Default | Description |
|---|---|---|
| `--output`, `-o` | `sitemap.xml` | Output file path (use `-` for stdout) |
| `--max-pages` | `200` | Maximum pages to crawl |
| `--max-depth` | `5` | Maximum link depth from the start URL |
| `--delay` | `0.2` | Seconds between requests |
| `--timeout` | `10` | Request timeout in seconds |
| `--changefreq` | `weekly` | Sitemap changefreq hint |
| `--priority` | `0.5` | Sitemap priority hint (0.0–1.0) |
| `--verbose`, `-v` | off | Print crawl progress to stderr |
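The flag table above maps naturally onto `argparse`. The following is a sketch of how such an interface could be declared under these documented defaults, not the script's actual source:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Sketch only: mirrors the documented flags and defaults, not the real script.
    p = argparse.ArgumentParser(description="Crawl a site and emit an XML sitemap")
    p.add_argument("url", help="Start URL to crawl")
    p.add_argument("--output", "-o", default="sitemap.xml",
                   help="Output path ('-' for stdout)")
    p.add_argument("--max-pages", type=int, default=200)
    p.add_argument("--max-depth", type=int, default=5)
    p.add_argument("--delay", type=float, default=0.2)
    p.add_argument("--timeout", type=float, default=10.0)
    p.add_argument("--changefreq", default="weekly",
                   choices=["always", "hourly", "daily", "weekly",
                            "monthly", "yearly", "never"])
    p.add_argument("--priority", type=float, default=0.5)
    p.add_argument("--verbose", "-v", action="store_true")
    return p

args = build_parser().parse_args(["https://example.com", "--max-pages", "50"])
print(args.max_pages)  # 50
```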
Dependencies
```bash
pip install requests beautifulsoup4
```
Notes
- Only crawls same-domain pages (no external links)
- Skips binary files (images, CSS, JS, PDFs, fonts)
- Respects the delay setting to avoid overwhelming servers
- Output conforms to the sitemaps.org 0.9 protocol
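The sitemaps.org 0.9 format the notes reference is simple enough to emit with the standard library alone. A minimal sketch (`build_sitemap` is illustrative, not the skill's actual function):

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls, changefreq="weekly", priority="0.5"):
    """Illustrative only: serialize URLs as a sitemaps.org 0.9 <urlset>."""
    ET.register_namespace("", NS)  # emit a default xmlns, not ns0: prefixes
    urlset = ET.Element(f"{{{NS}}}urlset")
    for loc in urls:
        url = ET.SubElement(urlset, f"{{{NS}}}url")
        ET.SubElement(url, f"{{{NS}}}loc").text = loc
        ET.SubElement(url, f"{{{NS}}}changefreq").text = changefreq
        ET.SubElement(url, f"{{{NS}}}priority").text = priority
    return ET.tostring(urlset, encoding="unicode", xml_declaration=True)

print(build_sitemap(["https://example.com/", "https://example.com/about"]))
```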
Files
2 total