Apify
Run and manage Apify Actors via REST API to scrape websites, crawl pages, extract data, and retrieve results from Apify datasets and key-value stores.
Run any of the 17,000+ Actors on Apify Store and retrieve structured results via the REST API.
Full OpenAPI spec: openapi.json
Authentication
All requests need the APIFY_TOKEN env var. Use it as a Bearer token:
-H "Authorization: Bearer $APIFY_TOKEN"
Base URL: https://api.apify.com
Core workflow
1. Find the right Actor
Search the Apify Store by keyword:
curl -s "https://api.apify.com/v2/store?search=web+scraper&limit=5" \
-H "Authorization: Bearer $APIFY_TOKEN" | jq '.data.items[] | {name: (.username + "/" + .name), title, description}'
Actors are identified by username~name (tilde) in API paths, e.g. apify~web-scraper.
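The search command above prints names as username/name, so they need converting before use in an API path. A minimal sketch of that conversion; `actor_path_id` is a hypothetical helper name, not part of the Apify API:

```shell
# Hypothetical helper: convert "username/name" to the "username~name" form used in API paths.
# Input that already uses a tilde passes through unchanged.
actor_path_id() {
  printf '%s\n' "$1" | tr '/' '~'
}

ACTOR=$(actor_path_id "apify/web-scraper")
echo "https://api.apify.com/v2/acts/$ACTOR"
```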
2. Get Actor README and input schema
Before running an Actor, fetch its default build to get the README (usage docs) and input schema (expected JSON fields):
curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default" \
-H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {readme, inputSchema}'
inputSchema is a JSON-stringified object — parse it to see required/optional fields, types, defaults, and descriptions. Use this to construct valid input for the run.
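Because inputSchema arrives as a string, it has to be decoded before its fields can be inspected. One way to do that is jq's `fromjson`. The sketch below assumes the decoded schema uses `required` and `properties` keys; `summarize_input_schema` is a hypothetical helper name:

```shell
# Hypothetical helper: read a build response on stdin and summarize its input schema.
# Assumes the decoded schema has "required" and "properties" keys; missing keys fall back to empty.
summarize_input_schema() {
  jq -c '.data.inputSchema | fromjson | {required: (.required // []), fields: ((.properties // {}) | keys)}'
}

# Usage (needs network access and APIFY_TOKEN):
# curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default" \
#   -H "Authorization: Bearer $APIFY_TOKEN" | summarize_input_schema
```

The output is one compact JSON object listing the required field names and all declared field names, which is usually enough to draft the run input.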
You can also get the Actor's per-build OpenAPI spec (no auth required):
curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default/openapi.json"
3. Run an Actor (async — recommended for most cases)
Start the Actor and get the run object back immediately:
curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" \
-H "Authorization: Bearer $APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":10}'
Response includes data.id (run ID), data.defaultDatasetId, data.status.
Optional query params: ?timeout=300&memory=4096&maxItems=100&waitForFinish=60
waitForFinish (0-60): maximum number of seconds the API waits for the run to finish before returning. Useful to avoid polling for short runs.
4. Poll run status
curl -s "https://api.apify.com/v2/actor-runs/RUN_ID?waitForFinish=60" \
-H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {status, defaultDatasetId}'
Terminal statuses: SUCCEEDED, FAILED, ABORTED, TIMED-OUT.
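A script that polls needs to tell these terminal statuses apart from everything else. A minimal sketch, with `is_terminal_status` as a hypothetical helper name:

```shell
# Hypothetical helper: succeed (exit 0) only for the terminal statuses listed above.
is_terminal_status() {
  case "$1" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: stop polling once the run has finished, whatever the outcome.
# if is_terminal_status "$STATUS"; then echo "Run finished: $STATUS"; fi
```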
5. Get results
Dataset items (most common — structured scraped data):
curl -s "https://api.apify.com/v2/datasets/DATASET_ID/items?clean=true&limit=100" \
-H "Authorization: Bearer $APIFY_TOKEN"
Or directly from the run (shortcut — same parameters):
curl -s "https://api.apify.com/v2/actor-runs/RUN_ID/dataset/items?clean=true&limit=100" \
-H "Authorization: Bearer $APIFY_TOKEN"
Params: format (json|csv|jsonl|xml|xlsx|rss), fields, omit, limit, offset, clean, desc.
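For datasets larger than one request, limit and offset can be combined to page through the items. The sketch below assumes the default JSON format, where each response is a JSON array; `dataset_items_url` and `fetch_all_items` are hypothetical helper names:

```shell
# Hypothetical helper: build the items URL for one page.
# $1 = dataset ID, $2 = offset, $3 = page size
dataset_items_url() {
  printf 'https://api.apify.com/v2/datasets/%s/items?clean=true&offset=%s&limit=%s\n' "$1" "$2" "$3"
}

# Hypothetical pager: print every item as one JSON line, stopping at the first empty page.
# $1 = dataset ID, $2 = page size (defaults to 100)
fetch_all_items() {
  offset=0
  limit=${2:-100}
  while :; do
    page=$(curl -s "$(dataset_items_url "$1" "$offset" "$limit")" \
      -H "Authorization: Bearer $APIFY_TOKEN")
    # Treat anything that is not a JSON array (e.g. an error object) as an empty page.
    count=$(printf '%s' "$page" | jq 'if type == "array" then length else 0 end')
    [ "${count:-0}" -gt 0 ] || break
    printf '%s' "$page" | jq -c '.[]'
    offset=$((offset + limit))
  done
}
```

Calling `fetch_all_items DATASET_ID 500 > items.jsonl` would write one item per line; the array check keeps the loop from spinning on an error response.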
Key-value store record (screenshots, HTML, OUTPUT):
curl -s "https://api.apify.com/v2/key-value-stores/STORE_ID/records/OUTPUT" \
-H "Authorization: Bearer $APIFY_TOKEN"
Run log:
curl -s "https://api.apify.com/v2/logs/RUN_ID" \
-H "Authorization: Bearer $APIFY_TOKEN"
6. Run Actor synchronously (short-running Actors only)
For Actors that finish within 300 seconds, get dataset items in one call:
curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
-H "Authorization: Bearer $APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":5}'
Returns the dataset items array directly (not wrapped in data). Returns 408 if the run exceeds 300s.
Alternative: /run-sync returns the KVS OUTPUT record instead of dataset items.
Quick recipes
Scrape a website
curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
-H "Authorization: Bearer $APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":20}'
Google search
curl -s -X POST "https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?timeout=120" \
-H "Authorization: Bearer $APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"queries":"site:example.com openai","maxPagesPerQuery":1}'
Long-running Actor (async with polling)
# 1. Start
RUN=$(curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs?waitForFinish=60" \
-H "Authorization: Bearer $APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":500}')
RUN_ID=$(echo "$RUN" | jq -r '.data.id')
# 2. Poll until done
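while true; do
  STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?waitForFinish=60" \
    -H "Authorization: Bearer $APIFY_TOKEN" | jq -r '.data.status')
  echo "Status: $STATUS"
  case "$STATUS" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) break;;
    # An empty or null status means the request failed; stop instead of looping forever.
    ""|null) echo "Could not read run status; check RUN_ID and APIFY_TOKEN" >&2; exit 1;;
  esac
done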
# 3. Fetch results
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?clean=true" \
-H "Authorization: Bearer $APIFY_TOKEN"
Abort a run
curl -s -X POST "https://api.apify.com/v2/actor-runs/RUN_ID/abort" \
-H "Authorization: Bearer $APIFY_TOKEN"
Paid / rental Actors
Some Actors require a monthly subscription before they can be run. If the API returns a permissions or payment error for an Actor, ask the user to manually subscribe via the Apify Console:
https://console.apify.com/actors/ACTOR_ID
Replace ACTOR_ID with the Actor's ID (e.g. AhEsMsQyLfHyMLaxz). The user needs to click Start on that page to activate the subscription. Most rental Actors offer a free trial period set by the developer.
You can get the Actor ID from the store search response (data.items[].id) or from GET /v2/acts/username~name (data.id).
Error handling
- 401: APIFY_TOKEN missing or invalid.
- 404 Actor not found: check username~name format (tilde, not slash). Browse https://apify.com/store.
- 400 run-failed: check GET /v2/logs/RUN_ID for details.
- 402/403 payment required: the Actor likely requires a subscription. See "Paid / rental Actors" above.
- 408 run-timeout-exceeded: sync endpoints have a 300s limit. Use async workflow instead.
- 429 rate-limit-exceeded: retry with exponential backoff (start at 500ms, double each time).
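The 429 backoff rule can be sketched as a small wrapper around curl. The helper names, the five-attempt cap, and the choice to retry only GET requests are assumptions made for illustration:

```shell
# Hypothetical helper: delay in milliseconds before retry number $1 (0-based):
# 500, 1000, 2000, ...
backoff_delay_ms() {
  echo $(( 500 * (1 << $1) ))
}

# Hypothetical wrapper: GET $1, retrying on HTTP 429 for up to 5 attempts.
# Writes the response body to the file named by $2 and prints the final status code.
get_with_backoff() {
  attempt=0
  while :; do
    code=$(curl -s -o "$2" -w '%{http_code}' "$1" \
      -H "Authorization: Bearer $APIFY_TOKEN")
    if [ "$code" != "429" ] || [ "$attempt" -ge 4 ]; then
      echo "$code"
      return 0
    fi
    # Fractional sleep works with GNU and BSD sleep but is not guaranteed by POSIX.
    sleep "$(awk -v ms="$(backoff_delay_ms "$attempt")" 'BEGIN { print ms / 1000 }')"
    attempt=$((attempt + 1))
  done
}

# Usage:
# CODE=$(get_with_backoff "https://api.apify.com/v2/actor-runs/RUN_ID" /tmp/run.json)
```

Returning the status code rather than failing lets the caller apply the other rules in this list (401, 404, 408) to the same response.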
Additional resources
- API docs (LLM-friendly): https://docs.apify.com/api/v2.md
- OpenAPI spec: openapi.json
- Apify Store (browse Actors): https://apify.com/store
