#web-scraping #数据提取 #ai-1-0-0 #json-1-0-0

XCrawl Scraper

XCrawl - AI-Powered Web Scraping API / AI 驱动网页爬虫，支持结构化数据提取

Install

openclaw skills install @zhangss110/xcrawl-scraper

🕷️ XCrawl Scraper / XCrawl 网页爬虫

AI-Powered Web Scraping API - 结构化数据提取利器
支持 Markdown、HTML、JSON、Screenshot 等多种格式输出

✨ 功能特点 / Features

功能	说明
🏷️ 网页爬取	支持 Markdown、HTML、JSON、Screenshot
🔍 搜索	搜索引擎结果爬取
🗺️ 网站地图	自动发现站点所有页面
🕷️ 站点爬取	批量爬取整个站点
📊 结构化数据	JSON Schema 自动提取结构化数据
🌐 代理支持	全球代理可选

📦 安装 / Installation

方式一：运行安装脚本

bash

scripts\install.bat

方式二：手动安装

bash

# 1. 安装依赖
pip install xcrawl

# 2. 配置 API Key
python scripts\xcrawl_scraper.py set-key YOUR_API_KEY

⚙️ 获取 API Key

访问 https://xcrawl.com 注册账号
获取 API Key
配置:

bash

python scripts\xcrawl_scraper.py set-key YOUR_API_KEY

📖 使用方法 / Usage

1. 爬取网页 (基本)

bash

python scripts\xcrawl_scraper.py scrape https://example.com markdown

2. 爬取多个格式

bash

python scripts\xcrawl_scraper.py scrape https://example.com markdown html links

3. 结构化数据提取 (JSON)

bash

python scripts\xcrawl_scraper.py scrape https://example.com json "提取产品名称和价格"

4. 搜索

bash

python scripts\xcrawl_scraper.py search "web scraping"

5. 网站地图

bash

python scripts\xcrawl_scraper.py map https://example.com

6. 站点爬取

bash

python scripts\xcrawl_scraper.py crawl https://example.com

📋 命令列表

命令	说明
`scrape <URL> [formats...]`	爬取网页
`search <query>`	搜索
`map <URL>`	网站地图
`crawl <URL>`	站点爬取
`set-key <API_KEY>`	设置 API Key
`config`	显示配置

🔧 配置 / Configuration

配置文件: scripts/config.json

json

{
  "apiKey": "YOUR_API_KEY",
  "apiUrl": "https://run.xcrawl.com",
  "timeout": 60,
  "defaultFormats": ["markdown"],
  "defaultProxy": ""
}

📝 示例输出

Markdown 输出

markdown

# Example Domain

This domain is for use in illustrative examples in documents.

JSON 输出

json

{
  "product_name": "iPhone 15 Pro",
  "price": 999,
  "currency": "USD"
}

📦 依赖

Python >= 3.8
xcrawl SDK

🤝 贡献 / Contributing

欢迎提交 Issue 和 Pull Request！

📄 许可证

MIT License

XCrawl Scraper

Install

🕷️ XCrawl Scraper / XCrawl 网页爬虫

✨ 功能特点 / Features

📦 安装 / Installation

方式一：运行安装脚本

方式二：手动安装

⚙️ 获取 API Key

📖 使用方法 / Usage

1. 爬取网页 (基本)

2. 爬取多个格式

3. 结构化数据提取 (JSON)

4. 搜索

5. 网站地图

6. 站点爬取

📋 命令列表

🔧 配置 / Configuration

📝 示例输出

Markdown 输出

JSON 输出

📦 依赖

🤝 贡献 / Contributing

📄 许可证

Related skills