XCrawl Scrape

v1.0.2

Use this skill for XCrawl scrape tasks, including single-URL fetch, format selection, sync or async execution, and JSON extraction with prompt or json_schema.


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for wykings/xcrawl-scrape.

Prompt preview: Install & Setup
Install the skill "XCrawl Scrape" (wykings/xcrawl-scrape) from ClawHub.
Skill page: https://clawhub.ai/wykings/xcrawl-scrape
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install xcrawl-scrape

ClawHub CLI


npx clawhub@latest install xcrawl-scrape
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name/description (XCrawl scraping: single-URL fetch, sync/async, JSON extraction) aligns with the actual instructions which call https://run.xcrawl.com/v1/scrape. The declared tool dependencies (curl or node) and the local config file (~/.xcrawl/config.json) are coherent for an API-key based client.
Instruction Scope
SKILL.md instructs only to read the local config file for XCRAWL_API_KEY, then make HTTP requests to the XCrawl API using curl or node. It does not instruct reading other user files, scanning the system, or transmitting unrelated data. The webhook feature exists as part of the API (user-specified callback URL) — expected for async scrape workflows but a user-controlled sink.
Install Mechanism
This is an instruction-only skill with no install spec and no code files, so nothing is downloaded or written by an installer. That minimizes install-time risk.
Credentials
No global environment variables or external credentials are requested. The skill requires a single locally stored API key (~/.xcrawl/config.json), which is proportionate to calling a paid scraping API. Note: storing an API key as plaintext in a home directory is sensitive but is a user configuration choice rather than a requirement of the skill.
Persistence & Privilege
The skill does not request always:true and is user-invocable only. It does request runtime execution permissions for curl/node (consistent with examples). There is no instruction to modify other skills or system-wide agent settings.
Assessment
This skill is internally consistent with a client for the XCrawl scraping API. Before installing: (1) Confirm you trust xcrawl.com and your API key issuer; (2) Consider storing your API key in a secure secret manager rather than an unencrypted ~/.xcrawl/config.json file; (3) Be cautious with webhook URLs you supply — they will receive scraped content; (4) Review any use of skip_tls_verification or proxies in request options (these lower network security); (5) Remember this skill runs curl/node commands at runtime (so the agent will need those tools available). If any of the above is unacceptable, don't install or ask the publisher for alternative auth/storage methods.


Runtime requirements

Any bin: curl, node
latest: vk972nv6mmzbkxg5tatt2mxcwax82r83k
1.3k downloads
0 stars
3 versions
Updated 1mo ago
v1.0.2
MIT-0

XCrawl Scrape

Overview

This skill handles single-page extraction with XCrawl Scrape APIs. Default behavior is raw passthrough: return upstream API response bodies as-is.

Required Local Config

Before using this skill, the user must create a local config file and write XCRAWL_API_KEY into it.

Path: ~/.xcrawl/config.json

{
  "XCRAWL_API_KEY": "<your_api_key>"
}

Read API key from local config file only. Do not require global environment variables.
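One way to create this file from a shell (a minimal sketch; substitute your real key, and prefer a secret manager if plaintext storage is unacceptable to you):

```shell
# Create the config directory and write the key file.
# "<your_api_key>" is a placeholder; paste the key from your dashboard.
mkdir -p ~/.xcrawl
cat > ~/.xcrawl/config.json <<'EOF'
{
  "XCRAWL_API_KEY": "<your_api_key>"
}
EOF
# Restrict the file to your user, since the key is stored in plaintext.
chmod 600 ~/.xcrawl/config.json
```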

Credits and Account Setup

Using XCrawl APIs consumes credits. If the user does not have an account or available credits, guide them to register at https://dash.xcrawl.com/. After registration, they can activate the free 1,000-credit plan before running requests.

Tool Permission Policy

Request runtime permissions for curl and node only. Do not request Python, shell helper scripts, or other runtime permissions.

API Surface

  • Start scrape: POST /v1/scrape
  • Read async result: GET /v1/scrape/{scrape_id}
  • Base URL: https://run.xcrawl.com
  • Required header: Authorization: Bearer <XCRAWL_API_KEY>

Usage Examples

cURL (sync)

API_KEY="$(node -e "const fs=require('fs');const p=process.env.HOME+'/.xcrawl/config.json';const k=JSON.parse(fs.readFileSync(p,'utf8')).XCRAWL_API_KEY||'';process.stdout.write(k)")"

curl -sS -X POST "https://run.xcrawl.com/v1/scrape" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{"url":"https://example.com","mode":"sync","output":{"formats":["markdown","links"]}}'

cURL (async create + result)

API_KEY="$(node -e "const fs=require('fs');const p=process.env.HOME+'/.xcrawl/config.json';const k=JSON.parse(fs.readFileSync(p,'utf8')).XCRAWL_API_KEY||'';process.stdout.write(k)")"

CREATE_RESP="$(curl -sS -X POST "https://run.xcrawl.com/v1/scrape" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{"url":"https://example.com/product/1","mode":"async","output":{"formats":["json"]},"json":{"prompt":"Extract title and price."}}')"

echo "$CREATE_RESP"

SCRAPE_ID="$(node -e 'const s=process.argv[1];const j=JSON.parse(s);process.stdout.write(j.scrape_id||"")' "$CREATE_RESP")"

curl -sS -X GET "https://run.xcrawl.com/v1/scrape/${SCRAPE_ID}" \
  -H "Authorization: Bearer ${API_KEY}"

Node

node -e '
const fs=require("fs");
const apiKey=JSON.parse(fs.readFileSync(process.env.HOME+"/.xcrawl/config.json","utf8")).XCRAWL_API_KEY;
const body={url:"https://example.com",mode:"sync",output:{formats:["markdown","json"]},json:{prompt:"Extract title and publish date."}};
fetch("https://run.xcrawl.com/v1/scrape",{
  method:"POST",
  headers:{"Content-Type":"application/json",Authorization:`Bearer ${apiKey}`},
  body:JSON.stringify(body)
}).then(async r=>{console.log(await r.text());});
'

Request Parameters

Request endpoint and headers

  • Endpoint: POST https://run.xcrawl.com/v1/scrape
  • Headers:
  • Content-Type: application/json
  • Authorization: Bearer <api_key>

Request body: top-level fields

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | Target URL |
| mode | string | No | sync | sync or async |
| proxy | object | No | - | Proxy config |
| request | object | No | - | Request config |
| js_render | object | No | - | JS rendering config |
| output | object | No | - | Output config |
| webhook | object | No | - | Async webhook config (mode=async) |

proxy

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| location | string | No | US | ISO-3166-1 alpha-2 country code, e.g. US / JP / SG |
| sticky_session | string | No | Auto-generated | Sticky session ID; requests with the same ID attempt to reuse the same exit IP |

request

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| locale | string | No | en-US,en;q=0.9 | Affects Accept-Language |
| device | string | No | desktop | desktop / mobile; affects UA and viewport |
| cookies | object map | No | - | Cookie key/value pairs |
| headers | object map | No | - | Header key/value pairs |
| only_main_content | boolean | No | true | Return main content only |
| block_ads | boolean | No | true | Attempt to block ad resources |
| skip_tls_verification | boolean | No | true | Skip TLS verification |

js_render

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | boolean | No | true | Enable browser rendering |
| wait_until | string | No | load | load / domcontentloaded / networkidle |
| viewport.width | integer | No | - | Viewport width (desktop 1920, mobile 402) |
| viewport.height | integer | No | - | Viewport height (desktop 1080, mobile 874) |

output

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| formats | string[] | No | ["markdown"] | Output formats |
| screenshot | string | No | viewport | full_page / viewport (only if formats includes screenshot) |
| json.prompt | string | No | - | Extraction prompt |
| json.json_schema | object | No | - | JSON Schema |

output.formats enum:

  • html
  • raw_html
  • markdown
  • links
  • summary
  • screenshot
  • json
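As a sketch of structured extraction, the snippet below builds a request body that uses json_schema instead of a free-form prompt. The schema fields (title, price) are illustrative only; note that the usage examples earlier in this document place the json object at the top level of the body, which is what this sketch follows.

```shell
node -e '
// Build a scrape request body that asks for schema-constrained JSON
// extraction. The schema content is a hypothetical example.
const body = {
  url: "https://example.com/product/1",
  mode: "sync",
  output: { formats: ["json"] },
  json: {
    json_schema: {
      type: "object",
      properties: {
        title: { type: "string" },
        price: { type: "number" }
      },
      required: ["title", "price"]
    }
  }
};
// Print the body; pipe it into curl with -d @- to send it.
console.log(JSON.stringify(body));
'
```

The printed body can be sent by piping it into curl with `-d @-` together with the Content-Type and Authorization headers shown above.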

webhook

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | No | - | Callback URL |
| headers | object map | No | - | Custom callback headers |
| events | string[] | No | ["started","completed","failed"] | Events: started / completed / failed |
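For example, an async request body that registers a webhook might look like the following (the callback URL and header name are placeholders you control, not values defined by the API):

```json
{
  "url": "https://example.com",
  "mode": "async",
  "output": { "formats": ["markdown"] },
  "webhook": {
    "url": "https://your-server.example/xcrawl-callback",
    "headers": { "X-Callback-Token": "<your_token>" },
    "events": ["completed", "failed"]
  }
}
```

Remember that the callback endpoint will receive scraped content, so only point it at a server you trust.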

Response Parameters

Sync create response (mode=sync)

| Field | Type | Description |
|---|---|---|
| scrape_id | string | Task ID |
| endpoint | string | Always scrape |
| version | string | Version |
| status | string | completed / failed |
| url | string | Target URL |
| data | object | Result data |
| started_at | string | Start time (ISO 8601) |
| ended_at | string | End time (ISO 8601) |
| total_credits_used | integer | Total credits used |

data fields (based on output.formats):

  • html, raw_html, markdown, links, summary, screenshot, json
  • metadata (page metadata)
  • traffic_bytes
  • credits_used
  • credits_detail

credits_detail fields:

| Field | Type | Description |
|---|---|---|
| base_cost | integer | Base scrape cost |
| traffic_cost | integer | Traffic cost |
| json_extract_cost | integer | JSON extraction cost |

Async create response (mode=async)

| Field | Type | Description |
|---|---|---|
| scrape_id | string | Task ID |
| endpoint | string | Always scrape |
| version | string | Version |
| status | string | Always pending |

Async result response (GET /v1/scrape/{scrape_id})

| Field | Type | Description |
|---|---|---|
| scrape_id | string | Task ID |
| endpoint | string | Always scrape |
| version | string | Version |
| status | string | pending / crawling / completed / failed |
| url | string | Target URL |
| data | object | Same shape as sync data |
| started_at | string | Start time (ISO 8601) |
| ended_at | string | End time (ISO 8601) |

Workflow

  1. Restate the user goal as an extraction contract.
     • URL scope, required fields, accepted nulls, and precision expectations.
  2. Build the scrape request body.
     • Keep only necessary options.
     • Prefer explicit output.formats.
  3. Execute the scrape and capture task metadata.
     • Track scrape_id, status, and timestamps.
     • If async, poll until completed or failed.
  4. Return raw API responses directly.
     • Do not synthesize or compress fields by default.
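The async polling step can be sketched as a small shell helper (the 2-second interval and 30-attempt limit are arbitrary choices, not API limits; it assumes API_KEY has already been read from the local config file as in the usage examples above):

```shell
# Poll an async scrape until it reaches a terminal status.
# Usage: poll_scrape "$SCRAPE_ID"
poll_scrape() {
  scrape_id="$1"
  attempts=0
  while [ "$attempts" -lt 30 ]; do
    RESP="$(curl -sS "https://run.xcrawl.com/v1/scrape/${scrape_id}" \
      -H "Authorization: Bearer ${API_KEY}")"
    # Pull the status field out of the JSON response.
    STATUS="$(node -e 'process.stdout.write(JSON.parse(process.argv[1]).status||"")' "$RESP")"
    if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
      echo "$RESP"
      return 0
    fi
    attempts=$((attempts + 1))
    sleep 2
  done
  echo "timed out waiting for scrape ${scrape_id}" >&2
  return 1
}
```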

Output Contract

Return:

  • Endpoint(s) used and mode (sync or async)
  • request_payload used for the request
  • Raw response body from each API call
  • Error details when request fails

Do not generate summaries unless the user explicitly requests a summary.

Guardrails

  • Do not invent unsupported output fields.
  • Do not hardcode provider-specific tool schemas in core logic.
  • Call out uncertainty when page structure is unstable.
