Install
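openclaw skills install xh-smart-scraper
Smart web data scraper. Automatically detects page structure, batch-scrapes list, table, and detail pages, and exports to JSON, CSV, or Excel. Includes built-in adaptation for anti-scraping measures.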
cd smart-web-scraper
npm install
# Scrape a single page
node scraper.js --url "https://example.com/products" --selector ".product-item"
# Scrape multiple pages in one run
node scraper.js --url "https://example.com/list" --pages 10 --output data.json
# Export to CSV
node scraper.js --url "https://example.com/products" --format csv --output products.csv
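The CSV export above presumably writes one row per scraped item. How scraper.js serializes rows is not documented here, so the following is only a minimal sketch of the usual approach (RFC 4180 quoting), assuming each record is a flat object. The helper name `toCsv` and the sample records are hypothetical.

```javascript
// Hypothetical helper, not part of scraper.js as documented.
// Serializes flat records to CSV using RFC 4180 quoting rules.
function toCsv(records, columns) {
  const escape = (value) => {
    const text = value === null || value === undefined ? "" : String(value);
    // Quote fields containing commas, quotes, or line breaks; double any embedded quotes.
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const header = columns.map(escape).join(",");
  const rows = records.map((record) =>
    columns.map((column) => escape(record[column])).join(",")
  );
  return [header, ...rows].join("\r\n");
}

// Made-up records shaped like the title/price fields a scrape might return.
const sample = [
  { title: 'Desk lamp, "retro"', price: "19.99" },
  { title: "Notebook", price: "3.50" },
];
console.log(toCsv(sample, ["title", "price"]));
```

Quoting matters for scraped text in particular, since product titles often contain commas or quotation marks that would otherwise shift columns.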
{
  "target": {
    "url": "https://example.com/items",
    "pages": 5,
    "waitFor": ".loading"
  },
  "fields": [
    {"name": "title", "selector": ".title", "type": "text"},
    {"name": "price", "selector": ".price", "type": "text"},
    {"name": "image", "selector": "img", "type": "attr", "attr": "src"}
  ],
  "export": {
    "format": "json",
    "file": "output.json"
  }
}
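One plausible reading of the `fields` array in this config is that a `"text"` field takes the matched element's text, while an `"attr"` field reads the attribute named by `attr`. The sketch below illustrates that interpretation only; it is an assumption, not the scraper's actual code. `extractFields` and `stubItem` are hypothetical names, and the stub stands in for a real DOM element so the example runs without a browser.

```javascript
// Hypothetical interpretation of a "fields" entry; scraper.js may behave differently.
// `item` is any object exposing the DOM-style querySelector method.
function extractFields(item, fields) {
  const record = {};
  for (const field of fields) {
    const node = item.querySelector(field.selector);
    if (!node) {
      record[field.name] = null; // selector matched nothing
    } else if (field.type === "attr") {
      record[field.name] = node.getAttribute(field.attr);
    } else {
      record[field.name] = node.textContent.trim();
    }
  }
  return record;
}

// Stub element with a title and an image but no price.
const stubItem = {
  querySelector(selector) {
    const nodes = {
      ".title": { textContent: "  Desk lamp ", getAttribute: () => null },
      "img": {
        textContent: "",
        getAttribute: (name) => (name === "src" ? "/img/lamp.png" : null),
      },
    };
    return nodes[selector] ?? null;
  },
};

const fields = [
  { name: "title", selector: ".title", type: "text" },
  { name: "price", selector: ".price", type: "text" },
  { name: "image", selector: "img", type: "attr", attr: "src" },
];
console.log(extractFields(stubItem, fields));
```

Returning `null` for a missing element, rather than throwing, keeps one incomplete item from aborting a whole batch; whether the real tool does the same is not stated.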
| Scenario | Command |
|---|---|
| E-commerce product scraping | node scraper.js --url "https://shop.example.com" --selector ".product" |
| Housing price data | node scraper.js --config housing-config.json |
| Job listings | node scraper.js --url "https://jobs.example.com" --pages 20 --delay 2000 |
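The job-listings command combines `--pages` with `--delay`. The unit and exact semantics of `--delay` are not stated here; the sketch below assumes it is a pause in milliseconds between consecutive page requests. `scrapePages` and `fetchPage` are hypothetical names used for illustration, not functions exposed by scraper.js.

```javascript
// Hypothetical pacing loop; `fetchPage` is a caller-supplied function, not a scraper.js API.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function scrapePages(fetchPage, pages, delayMs) {
  const results = [];
  for (let page = 1; page <= pages; page += 1) {
    results.push(await fetchPage(page));
    // Pause between requests, but not after the final page.
    if (page < pages) await sleep(delayMs);
  }
  return results;
}

// Usage with a stub fetcher and a short delay so the example finishes quickly.
scrapePages(async (page) => `page-${page}`, 3, 10).then((results) => {
  console.log(results);
});
```

Fetching pages sequentially with a fixed pause is the simplest way to keep request rates low on sites that throttle rapid crawlers.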