Install
openclaw skills install robots-txt-gen

Generate, validate, and analyze robots.txt files for websites. Use when creating robots.txt from scratch, validating existing robots.txt syntax, checking if a URL is allowed/blocked by robots.txt rules, or generating robots.txt for common platforms (WordPress, Next.js, Django, Rails). Also use when auditing crawl directives or debugging search engine indexing issues.
Generate, validate, and test robots.txt files from the command line.
# Generate a robots.txt for a platform
python3 scripts/robots_txt_gen.py generate --preset nextjs --sitemap https://example.com/sitemap.xml
# Validate an existing robots.txt
python3 scripts/robots_txt_gen.py validate --file robots.txt
# Validate a remote robots.txt
python3 scripts/robots_txt_gen.py validate --url https://example.com/robots.txt
# Test if a URL is allowed for a user-agent
python3 scripts/robots_txt_gen.py test --file robots.txt --url /admin/dashboard --agent Googlebot
# Generate with custom rules
python3 scripts/robots_txt_gen.py generate --allow "/" --disallow "/admin" --disallow "/api" --disallow "/private" --sitemap https://example.com/sitemap.xml --agent "*"
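To make the allow/block check performed by the test command above more concrete, here is a minimal sketch using Python's standard-library urllib.robotparser. It is not the script's own implementation, which is not shown here; the helper name is_allowed and the sample rules are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample rules, similar to the custom-rules example above (illustrative only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /api
Disallow: /private

Sitemap: https://example.com/sitemap.xml
"""


def is_allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given rules.

    Hypothetical helper: it delegates to the standard-library parser,
    which may not match how robots_txt_gen.py evaluates rules.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/admin/dashboard", "/blog/post"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", path) else "blocked"
        print(f"{path}: {verdict}")
```

One caveat with this approach: urllib.robotparser applies rules in file order (first match wins), while Google and RFC 9309 use the longest matching rule. A file that puts Allow: / before more specific Disallow lines can therefore give a different answer here than it would for Googlebot.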
generate
Create a robots.txt file with custom rules or platform presets.
Options:
- --preset <name>: Use a platform preset (wordpress, nextjs, django, rails, laravel, static, spa, or ecommerce).
- --agent <name>: User-agent (default: *). Repeat for multiple agents.
- --allow <path>: Allow path. Repeatable.
- --disallow <path>: Disallow path. Repeatable.
- --sitemap <url>: Sitemap URL. Repeatable.
- --crawl-delay <seconds>: Crawl-delay directive, in seconds.
- --block-ai: Add rules to block common AI crawlers (GPTBot, ChatGPT-User, CCBot, Google-Extended, anthropic-ai, etc.).
- --output <file>: Write to file instead of stdout.

validate
Check a robots.txt file for syntax errors and best-practice warnings.
Options:
- --file <path>: Local file to validate.
- --url <url>: Remote robots.txt URL to fetch and validate.

test
Test whether a specific URL path is allowed or disallowed for a given user-agent.
Options:
- --file <path>: robots.txt file to test against.
- --url <path>: URL path to test (e.g., /admin/login).
- --agent <name>: User-agent to test as (default: Googlebot).

| Preset | What it blocks | Notes |
|---|---|---|
| wordpress | /wp-admin/, /wp-includes/, query params | Allows /wp-admin/admin-ajax.php |
| nextjs | /_next/static/, /api/, /.next/ | Standard Next.js paths |
| django | /admin/, /static/admin/, /media/private/ | Django admin and private media |
| rails | /admin/, /assets/, /tmp/ | Rails conventions |
| laravel | /admin/, /storage/, /vendor/ | Laravel conventions |
| static | Nothing blocked | Simple allow-all with sitemap |
| spa | /api/, /assets/ | Single-page app pattern |
| ecommerce | /cart/, /checkout/, /account/, /search? | Prevents crawling user sessions |
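
As a rough illustration of how a preset from the table could expand into robots.txt text, the sketch below maps three of the preset names to their documented blocked paths and renders them. The PRESETS dictionary and the render_preset function are hypothetical names, and the exact output of the real script may differ (for example in ordering, comments, or extra Allow lines).

```python
# Hypothetical preset table copied from the documented "What it blocks" column.
# Only three presets are shown; the real script may structure this differently.
PRESETS = {
    "nextjs": ["/_next/static/", "/api/", "/.next/"],
    "django": ["/admin/", "/static/admin/", "/media/private/"],
    "static": [],  # nothing blocked: simple allow-all
}


def render_preset(name, sitemap=None, agent="*"):
    """Render a preset as robots.txt text (illustrative sketch only)."""
    disallowed = PRESETS[name]
    lines = [f"User-agent: {agent}"]
    if disallowed:
        lines.extend(f"Disallow: {path}" for path in disallowed)
    else:
        # An allow-all file: every path is permitted for this agent.
        lines.append("Allow: /")
    if sitemap:
        # Sitemap lines are independent of any user-agent group.
        lines.extend(["", f"Sitemap: {sitemap}"])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_preset("nextjs", sitemap="https://example.com/sitemap.xml"))
```

Keeping presets as plain data makes it easy to compare them against the table and to add a new platform without touching the rendering logic.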
The --block-ai flag adds disallow rules for known AI training crawlers: