Install
openclaw skills install @antonia-sz/web-scraper-firecrawl
Web scraping and content extraction using the Firecrawl API. Use when users need to crawl websites, extract structured data, convert web pages to markdown, scrape multiple URLs, or build knowledge bases from web content. Supports single-page extraction, site-wide crawling, batch processing, and structured data extraction with CSS selectors.
Powerful web scraping powered by Firecrawl - turn websites into LLM-ready markdown.
Firecrawl provides APIs for:
requests
Set the environment variable:
export FIRECRAWL_API_KEY="fc-your-api-key"
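A script that talks to the API directly can read the key from the same variable. The following is a minimal sketch: `auth_headers` is a hypothetical helper, and the Bearer scheme is an assumption based on Firecrawl's public REST API, not something this skill documents.

```python
import os


def auth_headers(env=os.environ):
    """Build HTTP headers from FIRECRAWL_API_KEY (hypothetical helper)."""
    key = env.get("FIRECRAWL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FIRECRAWL_API_KEY is not set")
    # Assumption: the API expects a Bearer token in the Authorization header.
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing early with a clear error is usually easier to debug than an authentication failure from the remote API.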
# Basic scrape
firecrawl scrape https://example.com
# With specific options
firecrawl scrape https://example.com --formats markdown,html --only-main-content
# Wait for JS rendering
firecrawl scrape https://spa-app.com --wait-for 2000
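The scrape flags above plausibly translate into a JSON request body. The sketch below shows one such mapping. The endpoint URL and the field names (`formats`, `onlyMainContent`, `waitFor`) are assumptions based on Firecrawl's v1 REST API, and `build_scrape_request` and `scrape` are hypothetical helpers rather than functions from this skill. It uses only the standard library; `requests` would work equally well.

```python
import json
import urllib.request

# Assumed endpoint; check the Firecrawl API reference for the current version.
SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"


def build_scrape_request(url, formats=("markdown",), only_main_content=False, wait_for=None):
    """Translate CLI-style options into a request body (field names assumed)."""
    body = {"url": url, "formats": list(formats)}
    if only_main_content:
        body["onlyMainContent"] = True
    if wait_for is not None:
        body["waitFor"] = int(wait_for)  # assumed to be milliseconds
    return body


def scrape(url, api_key, **options):
    """POST the request and return the decoded JSON response."""
    payload = json.dumps(build_scrape_request(url, **options)).encode("utf-8")
    req = urllib.request.Request(
        SCRAPE_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

Keeping the body construction separate from the network call makes the option mapping testable without an API key.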
# Crawl entire site (up to limit)
firecrawl crawl https://docs.example.com --limit 50
# With depth control
firecrawl crawl https://blog.example.com --max-depth 2 --limit 100
# Include/exclude patterns
firecrawl crawl https://site.com --include "/blog/*" --exclude "/admin/*"
# Custom formats
firecrawl crawl https://docs.example.com --formats markdown,links
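To make the include/exclude behavior concrete, the sketch below filters URLs by treating each pattern as a shell-style glob matched against the URL path. This is an assumption made for illustration: the CLI or the Firecrawl API may interpret patterns differently (for example as regular expressions), and `is_allowed` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def is_allowed(url, include=(), exclude=()):
    """Return True if the URL path passes the include/exclude globs."""
    path = urlparse(url).path or "/"
    # Exclusions win over inclusions.
    if any(fnmatchcase(path, pat) for pat in exclude):
        return False
    # An empty include list means "everything not excluded".
    return not include or any(fnmatchcase(path, pat) for pat in include)
```

Under these assumed semantics, `/blog/*` matches `/blog/post-1` but not `/blog` itself, because the pattern requires the trailing slash.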
# Discover all URLs from a site
firecrawl map https://example.com
# With search term
firecrawl map https://docs.python.org --search "tutorial"
# Scrape multiple URLs
firecrawl batch urls.txt --output ./scraped/
# From JSON list
firecrawl batch urls.json --formats markdown --concurrency 5
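Batch processing with a concurrency limit can be sketched with a thread pool. The input shapes are assumptions: a text file with one URL per line, or a JSON file containing a flat array of URL strings. `load_urls` and `run_batch` are hypothetical helpers, and `fetch` stands in for whatever function scrapes a single URL.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_urls(path):
    """Read URLs from a .json array or a plain-text file (one URL per line)."""
    text = Path(path).read_text(encoding="utf-8")
    if str(path).endswith(".json"):
        return list(json.loads(text))  # assumed shape: ["https://...", ...]
    # Skip blank lines and lines starting with '#'.
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def run_batch(urls, fetch, concurrency=5):
    """Call fetch(url) for every URL using at most `concurrency` threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Executor.map preserves input order, so results line up with urls.
        return dict(zip(urls, pool.map(fetch, urls)))
```

Threads suit this workload because each task spends most of its time waiting on the network.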
# Extract specific data using CSS selectors
firecrawl extract https://example.com/products \
  --schema '{"name": ".product-title", "price": ".price", "description": ".desc"}'
# Extract to JSON
firecrawl extract https://news.example.com/article --schema article-schema.json
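The two examples above pass `--schema` once as inline JSON and once as a file path. A loader that accepts both forms might look like the sketch below. `load_schema` is a hypothetical helper, and the validation rule (a JSON object mapping field names to CSS selector strings) is inferred from the inline example, not from documented behavior.

```python
import json
from pathlib import Path


def load_schema(arg):
    """Parse a schema given as inline JSON or as a path to a JSON file."""
    text = arg.strip()
    if not text.startswith("{"):
        # Not inline JSON, so treat the argument as a file path.
        text = Path(arg).read_text(encoding="utf-8")
    schema = json.loads(text)
    if not isinstance(schema, dict) or not all(
        isinstance(value, str) for value in schema.values()
    ):
        raise ValueError("schema must map field names to CSS selector strings")
    return schema
```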
Clean, LLM-ready markdown with:
Raw or cleaned HTML
Extracted link lists for further crawling
Page screenshot (if requested)
# Crawl documentation site
firecrawl crawl https://docs.framework.com --limit 200 -o ./kb/
# Merge into single file for RAG
cat ./kb/*.md > knowledge-base.md
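Plain `cat` drops the file boundaries, which can make retrieved chunks hard to attribute. A small Python alternative, sketched below, merges the files in sorted order and prefixes each with a source marker. `merge_markdown` is a hypothetical helper and the HTML-comment marker format is an arbitrary choice.

```python
from pathlib import Path


def merge_markdown(src_dir, out_file):
    """Concatenate *.md files from src_dir into out_file; return the file count."""
    out_path = Path(out_file).resolve()
    # Skip the output file itself in case it lives inside src_dir.
    files = sorted(
        p for p in Path(src_dir).glob("*.md") if p.resolve() != out_path
    )
    parts = []
    for p in files:
        body = p.read_text(encoding="utf-8").strip()
        parts.append(f"<!-- source: {p.name} -->\n{body}\n")
    out_path.write_text("\n".join(parts), encoding="utf-8")
    return len(files)
```

For example, `merge_markdown("./kb", "knowledge-base.md")` would play the role of the `cat` command above.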
# Scrape competitor pricing
firecrawl batch competitors.txt --extract pricing-schema.json
# Monitor blog updates
firecrawl map https://blog.company.com --since 2024-01-01
# Export old CMS content
firecrawl crawl https://old-site.com --formats markdown,html -o ./export/
All functionality via scripts/firecrawl.py:
Works well with:
markdown-sync-pro - Sync scraped content to Notion/GitHub
arxiv-paper - Combine with academic paper downloads
maybe-finance - Scrape financial data for analysis