Web Scraper

Data & APIs

Extract and export structured web data like text, tables, and images in JSON, CSV, Markdown, or SQL formats while respecting ethical and legal scraping pract...

Install

openclaw skills install rupert-web-scraper

Web Scraper Skill

Overview

Extract data from websites efficiently and ethically.

Capabilities

1. Data Extraction

  • Extract text content
  • Pull structured data
  • Capture tables
  • Get images/media

2. Formats

  • JSON output
  • CSV export
  • Markdown
  • SQL inserts

3. Features

  • Rate limiting
  • Caching
  • Retry logic
  • Error handling
  • Proxy support

4. Ethical Scraping

  • Respect robots.txt
  • Rate limits
  • User agent rotation
  • Legal compliance

Usage

Commands

  • scrape [URL] for [data]
  • extract [element] from [URL]
  • get table from [URL]
  • crawl [website] depth [n]
  • export [URL] to [format]

Examples

Input: "scrape example.com for product names and prices" Output:

{
  "products": [
    {"name": "Product A", "price": "$19.99"},
    {"name": "Product B", "price": "$29.99"}
  ]
}

Configuration

Rate Limits

  • Default: 1 request/second
  • Configurable: 0.1-10 req/s
  • Respect site limits

Output Options

  • JSON (default)
  • CSV
  • Markdown
  • SQL
  • Custom template

Best Practices

  1. Always identify yourself
  2. Cache responses
  3. Handle errors gracefully
  4. Stay within legal bounds
  5. Don't overwhelm servers