{"skill":{"slug":"apify","displayName":"Apify","summary":"Run and manage Apify Actors via REST API to scrape websites, crawl pages, extract data, and retrieve results from Apify datasets and key-value stores.","description":"---\nname: apify\ndescription: Run Apify Actors (web scrapers, crawlers, automation tools) and retrieve their results using the Apify REST API with curl. Use when the user wants to scrape a website, extract data from the web, run an Apify Actor, crawl pages, or get results from Apify datasets.\nhomepage: https://docs.apify.com/api/v2\nmetadata:\n  {\n    \"openclaw\":\n      {\n        \"emoji\": \"🐝\",\n        \"primaryEnv\": \"APIFY_TOKEN\",\n        \"requires\": { \"anyBins\": [\"curl\", \"wget\"], \"env\": [\"APIFY_TOKEN\"] },\n      },\n  }\n---\n\n# Apify\n\nRun any of the 17,000+ Actors on [Apify Store](https://apify.com/store) and retrieve structured results via the REST API.\n\nFull OpenAPI spec: [openapi.json](openapi.json)\n\n## Authentication\n\nAll requests need the `APIFY_TOKEN` env var. Use it as a Bearer token:\n\n```bash\n-H \"Authorization: Bearer $APIFY_TOKEN\"\n```\n\nBase URL: `https://api.apify.com`\n\n## Core workflow\n\n### 1. Find the right Actor\n\nSearch the Apify Store by keyword:\n\n```bash\ncurl -s \"https://api.apify.com/v2/store?search=web+scraper&limit=5\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\" | jq '.data.items[] | {name: (.username + \"/\" + .name), title, description}'\n```\n\nActors are identified by `username~name` (tilde) in API paths, e.g. `apify~web-scraper`.\n\n### 2. 
Get Actor README and input schema\n\nBefore running an Actor, fetch its default build to get the README (usage docs) and input schema (expected JSON fields):\n\n```bash\ncurl -s \"https://api.apify.com/v2/acts/apify~web-scraper/builds/default\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\" | jq '.data | {readme, inputSchema}'\n```\n\n`inputSchema` is a JSON-stringified object — parse it to see required/optional fields, types, defaults, and descriptions. Use this to construct valid input for the run.\n\nYou can also get the Actor's per-build OpenAPI spec (no auth required):\n\n```bash\ncurl -s \"https://api.apify.com/v2/acts/apify~web-scraper/builds/default/openapi.json\"\n```\n\n### 3. Run an Actor (async — recommended for most cases)\n\nStart the Actor and get the run object back immediately:\n\n```bash\ncurl -s -X POST \"https://api.apify.com/v2/acts/apify~web-scraper/runs\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"startUrls\":[{\"url\":\"https://example.com\"}],\"maxPagesPerCrawl\":10}'\n```\n\nResponse includes `data.id` (run ID), `data.defaultDatasetId`, `data.status`.\n\nOptional query params: `?timeout=300&memory=4096&maxItems=100&waitForFinish=60`\n\n- `waitForFinish` (0-60): seconds the API waits before returning. Useful to avoid polling for short runs.\n\n### 4. Poll run status\n\n```bash\ncurl -s \"https://api.apify.com/v2/actor-runs/RUN_ID?waitForFinish=60\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\" | jq '.data | {status, defaultDatasetId}'\n```\n\nTerminal statuses: `SUCCEEDED`, `FAILED`, `ABORTED`, `TIMED-OUT`.\n\n### 5. 
Get results\n\n**Dataset items** (most common — structured scraped data):\n\n```bash\ncurl -s \"https://api.apify.com/v2/datasets/DATASET_ID/items?clean=true&limit=100\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\"\n```\n\nOr directly from the run (shortcut — same parameters):\n\n```bash\ncurl -s \"https://api.apify.com/v2/actor-runs/RUN_ID/dataset/items?clean=true&limit=100\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\"\n```\n\nParams: `format` (`json`|`csv`|`jsonl`|`xml`|`xlsx`|`rss`), `fields`, `omit`, `limit`, `offset`, `clean`, `desc`.\n\n**Key-value store record** (screenshots, HTML, OUTPUT):\n\n```bash\ncurl -s \"https://api.apify.com/v2/key-value-stores/STORE_ID/records/OUTPUT\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\"\n```\n\n**Run log:**\n\n```bash\ncurl -s \"https://api.apify.com/v2/logs/RUN_ID\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\"\n```\n\n### 6. Run Actor synchronously (short-running Actors only)\n\nFor Actors that finish within 300 seconds, get dataset items in one call:\n\n```bash\ncurl -s -X POST \"https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"startUrls\":[{\"url\":\"https://example.com\"}],\"maxPagesPerCrawl\":5}'\n```\n\nReturns the dataset items array directly (not wrapped in `data`). 
Returns `408` if the run exceeds 300s.\n\nAlternative: `/run-sync` returns the KVS `OUTPUT` record instead of dataset items.\n\n## Quick recipes\n\n### Scrape a website\n\n```bash\ncurl -s -X POST \"https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"startUrls\":[{\"url\":\"https://example.com\"}],\"maxPagesPerCrawl\":20}'\n```\n\n### Google search\n\n```bash\ncurl -s -X POST \"https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?timeout=120\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"queries\":\"site:example.com openai\",\"maxPagesPerQuery\":1}'\n```\n\n### Long-running Actor (async with polling)\n\n```bash\n# 1. Start\nRUN=$(curl -s -X POST \"https://api.apify.com/v2/acts/apify~web-scraper/runs?waitForFinish=60\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"startUrls\":[{\"url\":\"https://example.com\"}],\"maxPagesPerCrawl\":500}')\nRUN_ID=$(echo \"$RUN\" | jq -r '.data.id')\n\n# 2. Poll until done (each request blocks for up to 60s via waitForFinish)\nwhile true; do\n  STATUS=$(curl -s \"https://api.apify.com/v2/actor-runs/$RUN_ID?waitForFinish=60\" \\\n    -H \"Authorization: Bearer $APIFY_TOKEN\" | jq -r '.data.status')\n  echo \"Status: $STATUS\"\n  case \"$STATUS\" in\n    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) break;;\n    # Empty or null status means the request itself failed; stop instead of looping forever\n    \"\"|null) echo \"Could not read run status\" >&2; break;;\n  esac\ndone\n\n# 3. Fetch results\ncurl -s \"https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?clean=true\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\"\n```\n\n### Abort a run\n\n```bash\ncurl -s -X POST \"https://api.apify.com/v2/actor-runs/RUN_ID/abort\" \\\n  -H \"Authorization: Bearer $APIFY_TOKEN\"\n```\n\n## Paid / rental Actors\n\nSome Actors require a monthly subscription before they can be run. 
If the API returns a permissions or payment error for an Actor, ask the user to manually subscribe via the Apify Console:\n\n```\nhttps://console.apify.com/actors/ACTOR_ID\n```\n\nReplace `ACTOR_ID` with the Actor's ID (e.g. `AhEsMsQyLfHyMLaxz`). The user needs to click **Start** on that page to activate the subscription. Most rental Actors offer a free trial period set by the developer.\n\nYou can get the Actor ID from the store search response (`data.items[].id`) or from `GET /v2/acts/username~name` (`data.id`).\n\n## Error handling\n\n- **401**: `APIFY_TOKEN` missing or invalid.\n- **404 Actor not found**: check `username~name` format (tilde, not slash). Browse https://apify.com/store.\n- **400 run-failed**: check `GET /v2/logs/RUN_ID` for details.\n- **402/403 payment required**: the Actor likely requires a subscription. See \"Paid / rental Actors\" above.\n- **408 run-timeout-exceeded**: sync endpoints have a 300s limit. Use async workflow instead.\n- **429 rate-limit-exceeded**: retry with exponential backoff (start at 500ms, double each time).\n\n## Additional resources\n\n- API docs (LLM-friendly): https://docs.apify.com/api/v2.md\n- OpenAPI spec: [openapi.json](openapi.json)\n- Apify Store (browse Actors): https://apify.com/store\n","tags":{"latest":"1.0.3"},"stats":{"comments":0,"downloads":4087,"installsAllTime":30,"installsCurrent":30,"stars":7,"versions":4},"createdAt":1771071767459,"updatedAt":1778489533630},"latestVersion":{"version":"1.0.3","createdAt":1771074519782,"changelog":"- Actor count updated: now supports over 17,000+ Actors (up from 4,000+).\n- All documentation and examples remain unchanged except for the increased Actor availability.","license":null},"metadata":{"setup":[{"key":"APIFY_TOKEN","required":true}],"os":null,"systems":null},"owner":{"handle":"bmestanov","userId":"s177n0vxy2v7amrmkd1rs8xa8n85vexa","displayName":"Bilyal 
Mestanov","image":"https://avatars.githubusercontent.com/u/17569192?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1779966841183}}