Install
openclaw skills install webscraper-pulpminer
Convert any webpage into structured JSON data using AI. Scrape websites, extract data into custom JSON schemas, and call saved APIs programmatically. Useful for web scraping, data extraction, content monitoring, lead generation, price tracking, and building data pipelines.
PulpMiner converts any webpage into structured JSON using AI. You provide a URL and, optionally, a JSON template; PulpMiner scrapes the page, runs it through an LLM, and returns clean structured data.
All API calls require the apikey header:
apikey: <PULPMINER_API_KEY>
Get your API key from https://pulpminer.com/api. If you don't have one yet, click "Regenerate Key".
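PulpMiner works in two phases: first create and configure a saved API at https://pulpminer.com/api, then call it by its API ID. For a static API (a fixed URL with no variables), send a GET request: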
curl -X GET "https://api.pulpminer.com/external/<apiId>" \
-H "apikey: <PULPMINER_API_KEY>"
Returns JSON extracted from the configured webpage.
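The same static call can be made from Python. This is a minimal sketch using only the standard library; the function names `build_static_request` and `fetch_static` and the 30-second timeout are illustrative assumptions, not part of PulpMiner.

```python
import json
import urllib.request

BASE_URL = "https://api.pulpminer.com/external"


def build_static_request(api_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a static saved API."""
    return urllib.request.Request(
        f"{BASE_URL}/{api_id}",
        headers={"apikey": api_key},  # every call needs the apikey header
        method="GET",
    )


def fetch_static(api_id: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body.

    urlopen raises urllib.error.HTTPError on non-2xx status codes.
    The timeout value is an assumption; scraping may take longer.
    """
    request = build_static_request(api_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the URL and headers easy to inspect or log before any network traffic happens.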
For APIs saved with template URLs like https://example.com/search?q={{query}}&page={{page}}:
curl -X POST "https://api.pulpminer.com/external/<apiId>" \
-H "apikey: <PULPMINER_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"query": "javascript frameworks", "page": "1"}'
The {{variable}} placeholders in the saved URL get replaced with the values you provide.
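A Python sketch of the dynamic call follows. The substitution itself happens on PulpMiner's side; the helper `placeholder_names` only inspects a template URL locally so that missing variables can be caught before sending. All function names here are hypothetical.

```python
import json
import re
import urllib.request

BASE_URL = "https://api.pulpminer.com/external"
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def placeholder_names(template_url: str) -> list:
    """List the {{variable}} names found in a saved template URL."""
    return PLACEHOLDER.findall(template_url)


def check_variables(template_url: str, variables: dict) -> None:
    """Raise ValueError if any placeholder has no value supplied."""
    missing = [n for n in placeholder_names(template_url) if n not in variables]
    if missing:
        raise ValueError(f"missing values for: {', '.join(missing)}")


def build_dynamic_request(api_id: str, api_key: str, variables: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a dynamic saved API."""
    return urllib.request.Request(
        f"{BASE_URL}/{api_id}",
        data=json.dumps(variables).encode("utf-8"),  # JSON body with the variable values
        headers={"apikey": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

For example, `check_variables("https://example.com/search?q={{query}}&page={{page}}", {"query": "javascript frameworks"})` raises a `ValueError` naming `page`.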
Successful responses return:
{
"data": { ... },
"errors": null
}
Error responses return:
{
"data": null,
"errors": "Error message describing what went wrong"
}
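Because both shapes share the same `data`/`errors` envelope, a caller can unwrap it in one place. A minimal sketch, assuming the envelope is exactly as shown above; `PulpMinerError` and `unwrap` are hypothetical names.

```python
import json


class PulpMinerError(Exception):
    """Raised when the response envelope carries a non-null "errors" value."""


def unwrap(body: str):
    """Return the "data" payload, or raise if "errors" is set."""
    payload = json.loads(body)
    if payload.get("errors"):
        raise PulpMinerError(str(payload["errors"]))
    return payload.get("data")
```

Raising on `errors` keeps downstream code from silently processing a `null` payload.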
When creating a saved API at https://pulpminer.com/api, you can configure:
| Option | Description |
|---|---|
| URL | The webpage to scrape |
| JSON Template | Optional JSON structure for the LLM to follow (e.g., {"name": "", "price": ""}) |
| Render JS | Enable for SPAs and JS-heavy pages (uses headless browser) |
| CSS Selector | Extract only a specific part of the page (e.g., .product-list, #main-content) |
| Extra Instructions | Additional guidance for the AI (e.g., "Only extract items with prices above $50") |
| Dynamic URL | Enable template variables in the URL with {{variable}} syntax |
| Cache | Toggle response caching on/off |
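Since the JSON Template is guidance for an LLM, it may be worth checking that the returned data actually contains the template's keys. A minimal sketch, assuming the extracted data is either one object or a list of objects that mirror the template's top-level keys; `missing_keys` is a hypothetical helper, not part of PulpMiner.

```python
def missing_keys(template: dict, data) -> list:
    """Return template keys absent from data.

    If data is a list, a key counts as missing when any item lacks it
    (or when an item is not a JSON object at all).
    """
    items = data if isinstance(data, list) else [data]
    missing = []
    for key in template:
        if any(not isinstance(item, dict) or key not in item for item in items):
            missing.append(key)
    return missing


template = {"name": "", "price": ""}
print(missing_keys(template, {"name": "Widget"}))  # prints ['price']
```

An empty result means every template key was present; it does not say anything about whether the values are correct.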
For async scraping in Zapier workflows:
# Static API
curl -X POST "https://api.pulpminer.com/external/zapier/get/<apiId>" \
-H "apikey: <PULPMINER_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"callbackURL": "https://hooks.zapier.com/..."}'
# Dynamic API
curl -X POST "https://api.pulpminer.com/external/zapier/post/<apiId>" \
-H "apikey: <PULPMINER_API_KEY>" \
-H "Content-Type: application/json" \
-d '{"callbackURL": "https://hooks.zapier.com/...", "query": "value"}'
Both endpoints return 201 immediately and send the scraped data to the callback URL once scraping completes.
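The two asynchronous variants differ only in the path segment (`get` or `post`) and in whether template variables accompany `callbackURL`. A sketch of a single builder covering both; `build_zapier_request` is a hypothetical name, and sending `Content-Type: application/json` is an assumption carried over from the dynamic API example.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.pulpminer.com/external/zapier"


def build_zapier_request(
    api_id: str,
    api_key: str,
    callback_url: str,
    variables: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) an async scrape request.

    Without variables the static endpoint (/get/) is used;
    with variables the dynamic endpoint (/post/) is used.
    """
    kind = "post" if variables else "get"
    # callbackURL is written last so a template variable cannot overwrite it.
    body = {**(variables or {}), "callbackURL": callback_url}
    return urllib.request.Request(
        f"{BASE_URL}/{kind}/{api_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"apikey": api_key, "Content-Type": "application/json"},
        method="POST",  # both variants are POST requests
    )
```

When sent with `urllib.request.urlopen`, the response's `status` attribute can be compared against 201 to confirm the job was accepted.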
For n8n, verify authentication first:
curl -X GET "https://api.pulpminer.com/external/n8n/auth" \
-H "apikey: <PULPMINER_API_KEY>"
Then use the standard /external/<apiId> endpoints for data fetching.
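The authentication check can be wrapped in a small Python helper. This sketch assumes that a 2xx status means the key was accepted and that a rejected key surfaces as an HTTP error status; the response body of this endpoint is not documented here, so it is ignored. `is_authenticated` and its `opener` parameter are hypothetical.

```python
import urllib.error
import urllib.request

AUTH_URL = "https://api.pulpminer.com/external/n8n/auth"


def is_authenticated(api_key: str, opener=urllib.request.urlopen, timeout: float = 10.0) -> bool:
    """Return True if the auth endpoint answers with a 2xx status.

    Assumption: an invalid key yields a non-2xx status, which urlopen
    reports as HTTPError. Network failures (URLError) still propagate.
    """
    request = urllib.request.Request(AUTH_URL, headers={"apikey": api_key}, method="GET")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError:
        return False
```

The `opener` argument exists only so the function can be exercised without network access, for instance by passing a stub in tests.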