# Sitemap Generator

Generate XML sitemaps by crawling a live website or by scanning local HTML files.
## Crawl a Website

```bash
python3 scripts/sitemap_gen.py https://example.com
```
## Scan Local Files

```bash
python3 scripts/sitemap_gen.py --local ./public --base-url https://example.com
```
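In local mode, each scanned file's path relative to the scanned directory is joined onto `--base-url` to form the page URL. The script's internals aren't shown here, so this is only a minimal sketch of that mapping under assumed behavior; the function name `files_to_urls` and the exact extension set are hypothetical:

```python
from pathlib import Path

# Hypothetical sketch: map local page files under a root directory
# to absolute URLs rooted at base_url.
def files_to_urls(root, base_url, exts=(".html", ".htm", ".md", ".php")):
    root = Path(root)
    urls = []
    for path in sorted(root.rglob("*")):
        # Keep only page-like files; assets (CSS, images, ...) are skipped.
        if path.suffix.lower() in exts:
            rel = path.relative_to(root).as_posix()
            urls.append(base_url.rstrip("/") + "/" + rel)
    return urls
```

For example, `./public/about.htm` would become `https://example.com/about.htm`.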
## Save to File

```bash
# Save sitemap.xml
python3 scripts/sitemap_gen.py https://example.com --output sitemap.xml

# Save sitemap.xml + robots.txt
python3 scripts/sitemap_gen.py https://example.com --output sitemap.xml --robots
```
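The Features section notes that `--robots` emits a robots.txt with a User-agent line, an Allow rule, and a Sitemap reference. A minimal sketch of such a file, assuming a permissive allow-all policy (the helper name `make_robots` is hypothetical):

```python
# Hypothetical sketch of the robots.txt that --robots might generate:
# allow all crawlers everywhere, and point them at the sitemap.
def make_robots(sitemap_url):
    return "User-agent: *\nAllow: /\nSitemap: {}\n".format(sitemap_url)
```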
## Output Formats

```bash
# XML (default; a valid sitemap.xml)
python3 scripts/sitemap_gen.py https://example.com

# Text (human-readable summary + XML)
python3 scripts/sitemap_gen.py https://example.com --format text

# JSON (pages list + XML string)
python3 scripts/sitemap_gen.py https://example.com --format json
```
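The default XML output follows the sitemaps.org schema with proper escaping (see Features). A minimal sketch of how one `<url>` entry could be built; the helper name `url_entry` is hypothetical, but the element names and the escaping requirement come from the sitemaps.org protocol:

```python
from xml.sax.saxutils import escape

# Hypothetical sketch of one sitemap <url> entry.
# escape() handles &, <, > so URLs with query strings stay valid XML.
def url_entry(loc, lastmod=None):
    lines = ["  <url>", "    <loc>{}</loc>".format(escape(loc))]
    if lastmod:
        lines.append("    <lastmod>{}</lastmod>".format(lastmod))
    lines.append("  </url>")
    return "\n".join(lines)
```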
## Options

| Flag | Default | Description |
|---|---|---|
| `--max-pages` | 500 | Maximum pages to crawl |
| `--timeout` | 10 | Request timeout in seconds |
| `--output` / `-o` | stdout | Save sitemap.xml to a file |
| `--robots` | off | Also generate robots.txt |
| `--local` | off | Scan a local directory instead of crawling |
| `--base-url` | — | Base URL for local mode (required with `--local`) |
| `--verbose` / `-v` | off | Show crawl progress |
## Features

- Crawl mode: BFS link discovery, restricted to the same domain, with deduplication
- Local mode: scans HTML/HTM/MD/PHP files and auto-detects lastmod from each file's mtime
- Smart filtering: Skips images, CSS, JS, PDFs, archives, media files
- URL normalization: Removes fragments, normalizes trailing slashes
- robots.txt generation: User-agent + Allow + Sitemap reference
- Valid XML: Proper XML escaping, sitemaps.org schema
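The URL normalization described above (dropping fragments, normalizing trailing slashes) can be sketched with the stdlib; the function name `normalize` and the exact slash policy are assumptions, not the script's verified behavior:

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical sketch of URL normalization:
# drop the #fragment and strip a trailing slash (except for the root path).
def normalize(url):
    parts = urlsplit(url)
    path = parts.path or "/"
    if path != "/" and path.endswith("/"):
        path = path.rstrip("/")
    # Empty final component removes the fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
```

This makes `https://example.com/a/#top` and `https://example.com/a` deduplicate to the same entry.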
## Requirements
- Python 3.6+
- No external dependencies (stdlib only)