Install

openclaw skills install xcrawl-scrape

Use this skill for XCrawl scrape tasks, including single-URL fetch, format selection, sync or async execution, and JSON extraction with a prompt or json_schema.

This skill handles single-page extraction with the XCrawl Scrape APIs. The default behavior is raw passthrough: return upstream API response bodies as-is.
Before using this skill, the user must create a local config file and write XCRAWL_API_KEY into it.
Path: ~/.xcrawl/config.json
{
  "XCRAWL_API_KEY": "<your_api_key>"
}

Read the API key from the local config file only. Do not require global environment variables.
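A minimal preflight sketch using only node (per the runtime permissions below) that verifies the config file exists and contains a non-empty key before any request is made; the error messages are illustrative, not part of the skill:

node -e '
const fs = require("fs");
const path = process.env.HOME + "/.xcrawl/config.json";
// Fail fast with a pointer to the setup step if the config is missing.
if (!fs.existsSync(path)) {
  console.error("Missing " + path + "; create it with XCRAWL_API_KEY first.");
  process.exit(1);
}
const key = JSON.parse(fs.readFileSync(path, "utf8")).XCRAWL_API_KEY;
if (!key) {
  console.error("XCRAWL_API_KEY is empty in " + path + ".");
  process.exit(1);
}
console.log("Config OK");
'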
Using XCrawl APIs consumes credits. If the user does not have an account or available credits, guide them to register at https://dash.xcrawl.com/. After registration, they can activate the free 1,000-credit plan before running requests.
Request runtime permissions for curl and node only.
Do not request Python, shell helper scripts, or other runtime permissions.
Endpoints:

POST /v1/scrape
GET /v1/scrape/{scrape_id}

Base URL: https://run.xcrawl.com
Auth: Authorization: Bearer <XCRAWL_API_KEY>

Sync scrape with curl:

API_KEY="$(node -e "const fs=require('fs');const p=process.env.HOME+'/.xcrawl/config.json';const k=JSON.parse(fs.readFileSync(p,'utf8')).XCRAWL_API_KEY||'';process.stdout.write(k)")"
curl -sS -X POST "https://run.xcrawl.com/v1/scrape" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${API_KEY}" \
-d '{"url":"https://example.com","mode":"sync","output":{"formats":["markdown","links"]}}'
API_KEY="$(node -e "const fs=require('fs');const p=process.env.HOME+'/.xcrawl/config.json';const k=JSON.parse(fs.readFileSync(p,'utf8')).XCRAWL_API_KEY||'';process.stdout.write(k)")"
CREATE_RESP="$(curl -sS -X POST "https://run.xcrawl.com/v1/scrape" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${API_KEY}" \
-d '{"url":"https://example.com/product/1","mode":"async","output":{"formats":["json"]},"json":{"prompt":"Extract title and price."}}')"
echo "$CREATE_RESP"
SCRAPE_ID="$(node -e 'const s=process.argv[1];const j=JSON.parse(s);process.stdout.write(j.scrape_id||"")' "$CREATE_RESP")"
curl -sS -X GET "https://run.xcrawl.com/v1/scrape/${SCRAPE_ID}" \
-H "Authorization: Bearer ${API_KEY}"
Sync scrape with node fetch:

node -e '
const fs=require("fs");
const apiKey=JSON.parse(fs.readFileSync(process.env.HOME+"/.xcrawl/config.json","utf8")).XCRAWL_API_KEY;
const body={url:"https://example.com",mode:"sync",output:{formats:["markdown","json"]},json:{prompt:"Extract title and publish date."}};
fetch("https://run.xcrawl.com/v1/scrape",{
method:"POST",
headers:{"Content-Type":"application/json",Authorization:`Bearer ${apiKey}`},
body:JSON.stringify(body)
}).then(async r=>{console.log(await r.text());});
'
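For schema-constrained extraction, json.json_schema can be sent instead of json.prompt (see the output table below). A sketch with an illustrative schema, not a required shape:

node -e '
const fs = require("fs");
const apiKey = JSON.parse(fs.readFileSync(process.env.HOME + "/.xcrawl/config.json", "utf8")).XCRAWL_API_KEY;
// Illustrative schema: extract a title string and a numeric price.
const body = {
  url: "https://example.com/product/1",
  mode: "sync",
  output: { formats: ["json"] },
  json: { json_schema: {
    type: "object",
    properties: { title: { type: "string" }, price: { type: "number" } },
    required: ["title", "price"]
  } }
};
fetch("https://run.xcrawl.com/v1/scrape", {
  method: "POST",
  headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
  body: JSON.stringify(body)
}).then(async r => { console.log(await r.text()); });
'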
POST https://run.xcrawl.com/v1/scrape
Content-Type: application/json
Authorization: Bearer <api_key>

Request body:

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | Target URL |
| mode | string | No | sync | sync or async |
| proxy | object | No | - | Proxy config |
| request | object | No | - | Request config |
| js_render | object | No | - | JS rendering config |
| output | object | No | - | Output config |
| webhook | object | No | - | Async webhook config (mode=async) |
proxy

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| location | string | No | US | ISO-3166-1 alpha-2 country code, e.g. US / JP / SG |
| sticky_session | string | No | Auto-generated | Sticky session ID; requests with the same ID attempt to reuse the same exit |
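For example, a sync request pinned to a Japanese exit with a caller-chosen session ID (values illustrative; assumes API_KEY is set as in the examples above):

curl -sS -X POST "https://run.xcrawl.com/v1/scrape" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{"url":"https://example.com","mode":"sync","proxy":{"location":"JP","sticky_session":"session-1"},"output":{"formats":["markdown"]}}'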
request

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| locale | string | No | en-US,en;q=0.9 | Affects Accept-Language |
| device | string | No | desktop | desktop / mobile; affects UA and viewport |
| cookies | object map | No | - | Cookie key/value pairs |
| headers | object map | No | - | Header key/value pairs |
| only_main_content | boolean | No | true | Return main content only |
| block_ads | boolean | No | true | Attempt to block ad resources |
| skip_tls_verification | boolean | No | true | Skip TLS verification |
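A sketch combining several request options: mobile emulation, a cookie, a custom header, and full-page (not main-content-only) HTML. All values are illustrative:

curl -sS -X POST "https://run.xcrawl.com/v1/scrape" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{"url":"https://example.com","mode":"sync","request":{"device":"mobile","cookies":{"session":"abc123"},"headers":{"X-Debug":"1"},"only_main_content":false},"output":{"formats":["html"]}}'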
js_render

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | boolean | No | true | Enable browser rendering |
| wait_until | string | No | load | load / domcontentloaded / networkidle |
| viewport.width | integer | No | - | Viewport width (desktop 1920, mobile 402) |
| viewport.height | integer | No | - | Viewport height (desktop 1080, mobile 874) |
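For a JS-heavy page, a sketch that waits for network idle at a custom viewport (values illustrative):

curl -sS -X POST "https://run.xcrawl.com/v1/scrape" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{"url":"https://example.com","mode":"sync","js_render":{"enabled":true,"wait_until":"networkidle","viewport":{"width":1280,"height":800}},"output":{"formats":["markdown"]}}'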
output

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| formats | string[] | No | ["markdown"] | Output formats |
| screenshot | string | No | viewport | full_page / viewport (only if formats includes screenshot) |
| json.prompt | string | No | - | Extraction prompt |
| json.json_schema | object | No | - | JSON Schema for extraction |
output.formats enum: html / raw_html / markdown / links / summary / screenshot / json

webhook

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | No | - | Callback URL |
| headers | object map | No | - | Custom callback headers |
| events | string[] | No | ["started","completed","failed"] | Events: started / completed / failed |
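An async request with webhook callbacks might look like this; the callback URL and token header are placeholders:

curl -sS -X POST "https://run.xcrawl.com/v1/scrape" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{"url":"https://example.com","mode":"async","output":{"formats":["markdown"]},"webhook":{"url":"https://your-server.example/callback","headers":{"X-Token":"<your_token>"},"events":["completed","failed"]}}'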
Response (mode=sync)

| Field | Type | Description |
|---|---|---|
| scrape_id | string | Task ID |
| endpoint | string | Always scrape |
| version | string | Version |
| status | string | completed / failed |
| url | string | Target URL |
| data | object | Result data |
| started_at | string | Start time (ISO 8601) |
| ended_at | string | End time (ISO 8601) |
| total_credits_used | integer | Total credits used |
data fields (based on output.formats): html, raw_html, markdown, links, summary, screenshot, json; plus metadata (page metadata), traffic_bytes, credits_used, and credits_detail.

credits_detail fields:

| Field | Type | Description |
|---|---|---|
| base_cost | integer | Base scrape cost |
| traffic_cost | integer | Traffic cost |
| json_extract_cost | integer | JSON extraction cost |
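A small sketch for surfacing the cost breakdown from a completed sync response held in a shell variable (RESP is a hypothetical variable; field names come from the tables above):

node -e '
// Print the credit breakdown from a completed sync response passed as argv[1].
const resp = JSON.parse(process.argv[1]);
const d = (resp.data && resp.data.credits_detail) || {};
console.log("base_cost:", d.base_cost);
console.log("traffic_cost:", d.traffic_cost);
console.log("json_extract_cost:", d.json_extract_cost);
console.log("total_credits_used:", resp.total_credits_used);
' "$RESP"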
Response (mode=async)

| Field | Type | Description |
|---|---|---|
| scrape_id | string | Task ID |
| endpoint | string | Always scrape |
| version | string | Version |
| status | string | Always pending |
Response (GET /v1/scrape/{scrape_id})

| Field | Type | Description |
|---|---|---|
| scrape_id | string | Task ID |
| endpoint | string | Always scrape |
| version | string | Version |
| status | string | pending / crawling / completed / failed |
| url | string | Target URL |
| data | object | Same shape as sync data |
| started_at | string | Start time (ISO 8601) |
| ended_at | string | End time (ISO 8601) |
Task handling:

- Select output formats via output.formats.
- For async tasks, report scrape_id, status, and timestamps.
- Poll until status is completed or failed.

Return:

- the mode used (sync or async)
- the request_payload used for the request

Do not generate summaries unless the user explicitly requests a summary.