Install
openclaw skills install smart-scraper-web
Extract structured data from websites: tables, lists, prices, articles, metadata. HTML parsing with caching. Zero external dependencies.
Stop copying data by hand. Start extracting it automatically.
Web content is everywhere but inaccessible to agents. web_fetch gets raw HTML, but you need structure — tables, prices, lists, article text — to make it useful.
Smart Scraper turns raw HTML into structured data with one command.
node skills/smart-scraper/smart-scraper.js --extract https://example.com
Returns title, headings, paragraphs, links, tables, lists, prices, images, and metadata.
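To make "raw HTML into structured data" concrete, here is a minimal sketch of dependency-free extraction for two of the fields listed above (title and headings). It is not the skill's actual parser; `textOf` and `extractOutline` are hypothetical names, and the output shape is an assumption made for illustration.

```javascript
// Illustrative only: not the skill's actual parser.
// Strips tags and collapses whitespace in an HTML fragment.
function textOf(fragment) {
  return fragment.replace(/<[^>]{0,1000}>/g, "").replace(/\s+/g, " ").trim();
}

// Hypothetical helper: pulls the title and h1-h3 headings out of an HTML string.
function extractOutline(html) {
  const title = /<title[^>]{0,200}>([\s\S]{0,500}?)<\/title>/i.exec(html);
  const headings = [];
  // \1 requires the closing tag to match the opening heading level.
  const re = /<h([1-3])[^>]{0,200}>([\s\S]{0,500}?)<\/h\1>/gi;
  for (const m of html.matchAll(re)) {
    headings.push({ level: Number(m[1]), text: textOf(m[2]) });
  }
  return { title: title ? textOf(title[1]) : null, headings };
}

console.log(extractOutline("<title>Pricing</title><h1>Plans</h1>"));
```

Regex-based extraction like this is brittle on malformed markup, which is the usual trade-off for avoiding a parser dependency.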
node skills/smart-scraper/smart-scraper.js --extract --table https://example.com/pricing
node skills/smart-scraper/smart-scraper.js --extract --list https://example.com/blog
node skills/smart-scraper/smart-scraper.js --extract --price https://example.com/products
node skills/smart-scraper/smart-scraper.js --extract --article https://example.com/blog/post
node skills/smart-scraper/smart-scraper.js --parse "<html>...</html>"
node skills/smart-scraper/smart-scraper.js --status
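If you call the scraper from another Node script rather than a shell, the commands above can be wrapped roughly as follows. This is a sketch under assumptions: `buildArgs` and `runScraper` are hypothetical helpers, the script path is copied from the commands above, and the format of what the CLI prints to stdout is not documented here, so the wrapper returns it as a plain string.

```javascript
const { execFile } = require("node:child_process");

// Path taken from the commands above; adjust if the skill lives elsewhere.
const SCRAPER = "skills/smart-scraper/smart-scraper.js";

// Hypothetical helper: maps a mode name to the flags documented above.
function buildArgs(url, mode) {
  const focused = ["table", "list", "price", "article"];
  if (mode !== undefined && !focused.includes(mode)) {
    throw new Error(`Unknown mode: ${mode}`);
  }
  const args = [SCRAPER, "--extract"];
  if (mode) args.push(`--${mode}`);
  args.push(url);
  return args;
}

// Hypothetical wrapper: runs the CLI and resolves with whatever it prints.
function runScraper(url, mode) {
  return new Promise((resolve, reject) => {
    execFile("node", buildArgs(url, mode), (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}
```

Using `execFile` with an argument array, rather than building a shell string, means the URL is never interpreted by a shell.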
Cache stored in: memory/scraper-cache/cache.json
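For readers unfamiliar with file-backed caching, the idea can be sketched as below. Everything here is an assumption for illustration: the actual layout of cache.json is not documented in this README, and the `FileCache` class, its `{ fetchedAt, data }` entry shape, and the one-hour expiry are all hypothetical.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical cache: one JSON file mapping URL -> { fetchedAt, data }.
// The real cache.json layout may differ.
class FileCache {
  constructor(dir, ttlMs = 60 * 60 * 1000) {
    this.file = path.join(dir, "cache.json");
    this.ttlMs = ttlMs; // assumed one-hour lifetime
    fs.mkdirSync(dir, { recursive: true });
  }

  load() {
    try {
      return JSON.parse(fs.readFileSync(this.file, "utf8"));
    } catch {
      return {}; // missing or unreadable file: start empty
    }
  }

  // Returns the cached data, or undefined if absent or expired.
  get(url, now = Date.now()) {
    const entry = this.load()[url];
    if (!entry || now - entry.fetchedAt > this.ttlMs) return undefined;
    return entry.data;
  }

  set(url, data, now = Date.now()) {
    const all = this.load();
    all[url] = { fetchedAt: now, data };
    fs.writeFileSync(this.file, JSON.stringify(all));
  }
}
```

A single JSON file is simple to inspect by hand, at the cost of rewriting the whole file on every update.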
Override data directory:
--dir /path/to/data
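One safeguard this skill lists is `{0,N}` limits to prevent ReDoS. The general technique is to give every regex quantifier an explicit upper bound. The sketch below shows that idea on price extraction; the `PRICE` pattern and `extractPrices` function are hypothetical and are not the skill's actual patterns.

```javascript
// Illustrative only: not the skill's actual patterns.
// Every quantifier has an explicit upper bound, so the work done per
// match attempt is capped. Nested unbounded quantifiers such as
// (\d+,?)+ are the classic source of catastrophic backtracking.
const PRICE = /[$€£]\s{0,2}\d{1,12}(?:,\d{3}){0,4}(?:\.\d{1,2}){0,1}/g;

// Returns every price-like substring, or an empty array if none match.
function extractPrices(text) {
  return text.match(PRICE) ?? [];
}

console.log(extractPrices("Basic $9.99, Pro € 1,299 and a free tier"));
```

The bounds chosen here (12 digits, 4 thousands groups) are arbitrary; the point is that they exist.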
Regex quantifiers use {0,N} limits to prevent ReDoS.
When extracting web content:
- --extract <url> for a full overview
- --extract --table/list/price/article for focused extraction
- --parse when you already have HTML from another tool
- --status to monitor cache usage

| Tool | Structure | Tables | Prices | Articles | Caching |
|---|---|---|---|---|---|
| web_fetch | Raw HTML | ❌ | ❌ | ❌ | ❌ |
| Puppeteer | ✅ | ✅ | ✅ | ✅ | ❌ |
| Smart Scraper | ✅ | ✅ | ✅ | ✅ | ✅ |
Smart Scraper gives you structured extraction + caching with zero dependencies.