Apify

Run and manage Apify Actors via REST API to scrape websites, crawl pages, extract data, and retrieve results from Apify datasets and key-value stores.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
5 · 2k · 18 current installs · 18 all-time installs
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The skill's name/description (run/manage Apify Actors) align with the declared primary credential (APIFY_TOKEN) and the use of curl/wget to call api.apify.com. Requesting a single Apify API token is expected for this functionality.
Instruction Scope
SKILL.md stays within the Apify API workflow (search store, fetch builds, start runs, poll runs, fetch datasets/KVS). Example commands make network calls only to api.apify.com and target websites. They also use jq for JSON processing, although jq is not declared in the skill's required binaries; users should ensure jq is available or adapt the commands. Instructions do not ask for unrelated files or other credentials.
Install Mechanism
No install spec and no code files — instruction-only skill. This is low-risk: nothing will be downloaded or written to disk by the skill itself.
Credentials
Only APIFY_TOKEN is required and it is the correct, proportional credential for calling the Apify API. The SKILL.md uses only that env var; it does not request unrelated secrets or system config paths.
Persistence & Privilege
always: false (not force-included) and normal autonomous invocation settings. The skill does not request elevated or persistent system privileges and does not modify other skills' configs.
Assessment
This skill appears to be what it says: an instruction-only wrapper that runs Apify Actors via the official API and needs your Apify API token. Before installing: (1) Only provide APIFY_TOKEN if you trust the skill — it grants full API access to your Apify account; consider a token with least privilege if possible. (2) Note that example commands use jq for JSON parsing but jq is not declared as a required binary — ensure jq is installed or remove those commands. (3) Be aware the skill will cause network calls to api.apify.com and to whatever websites the Actors crawl; scraped results may contain sensitive data and could trigger legal/robot policy issues for some targets. (4) Some Actors are paid/rented — the skill may surface permission/payment errors and will not bypass billing. If you need the skill to run autonomously, consider the implications of giving the agent access to a live API token and the ability to start runs that may incur cost.


Current version: v1.0.3
latest: vk973dktte9vg35x923kw1fh89n815fff


Runtime requirements

🐝 Clawdis
Any bin: curl, wget
Env: APIFY_TOKEN
Primary env: APIFY_TOKEN

SKILL.md

Apify

Run any of the 17,000+ Actors on Apify Store and retrieve structured results via the REST API.

Full OpenAPI spec: openapi.json

Authentication

All requests need the APIFY_TOKEN env var. Use it as a Bearer token:

-H "Authorization: Bearer $APIFY_TOKEN"

Base URL: https://api.apify.com
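
To avoid repeating the header and base URL on every call, the two can be wrapped in a small shell function. This is a minimal sketch: apify_curl is a hypothetical helper name, not part of the skill or of any Apify tooling, and it assumes curl is the available binary.

```shell
# Hypothetical helper: prefixes the base URL and adds the Bearer header.
# Fails early when APIFY_TOKEN is missing instead of sending an unauthenticated request.
apify_curl() {
  if [ -z "${APIFY_TOKEN:-}" ]; then
    echo "APIFY_TOKEN is not set" >&2
    return 1
  fi
  local path="$1"
  shift
  # Any remaining arguments (-X, -d, extra headers) are passed through to curl.
  curl -s "https://api.apify.com${path}" \
    -H "Authorization: Bearer $APIFY_TOKEN" "$@"
}

# Example (performs a network call, so it is left commented out):
# apify_curl "/v2/store?search=web+scraper&limit=5"
```

Checking the variable up front turns a later 401 response into an immediate, local error message.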

Core workflow

1. Find the right Actor

Search the Apify Store by keyword:

curl -s "https://api.apify.com/v2/store?search=web+scraper&limit=5" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data.items[] | {name: (.username + "/" + .name), title, description}'

Actors are identified by username~name (tilde) in API paths, e.g. apify~web-scraper.
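
Store listings and the search output above show names as username/name, so a one-line conversion to the tilde form can help. The sketch below uses a hypothetical helper name, actor_path_id, and assumes the input contains a single slash.

```shell
# Hypothetical helper: convert "username/name" into the "username~name" form used in API paths.
actor_path_id() {
  printf '%s\n' "$1" | tr '/' '~'
}

actor_path_id "apify/web-scraper"   # prints apify~web-scraper
```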

2. Get Actor README and input schema

Before running an Actor, fetch its default build to get the README (usage docs) and input schema (expected JSON fields):

curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {readme, inputSchema}'

inputSchema is a JSON-stringified object — parse it to see required/optional fields, types, defaults, and descriptions. Use this to construct valid input for the run.
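
Because inputSchema arrives as a string, it has to be decoded a second time before its fields can be read; jq's fromjson filter does that. The sketch below runs against a hypothetical, heavily trimmed sample response rather than real API output, and it assumes the decoded schema carries a top-level required array. required_fields is an illustrative name only.

```shell
# Hypothetical, trimmed build response; a real response carries many more fields.
SAMPLE='{"data":{"inputSchema":"{\"required\":[\"startUrls\"],\"properties\":{\"startUrls\":{\"type\":\"array\"}}}"}}'

# Read a build response on stdin and print the required input fields, one per line.
# fromjson decodes the stringified schema into a regular JSON object.
required_fields() {
  jq -r '.data.inputSchema | fromjson | .required // [] | .[]'
}

# Example against the sample above:
# printf '%s' "$SAMPLE" | required_fields
```

The same function can be placed at the end of the builds/default curl pipeline in place of the jq call shown earlier.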

You can also get the Actor's per-build OpenAPI spec (no auth required):

curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default/openapi.json"

3. Run an Actor (async — recommended for most cases)

Start the Actor and get the run object back immediately:

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":10}'

Response includes data.id (run ID), data.defaultDatasetId, data.status.

Optional query params: ?timeout=300&memory=4096&maxItems=100&waitForFinish=60

  • waitForFinish (0-60): seconds the API waits before returning. Useful to avoid polling for short runs.
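
As a rough illustration of how these query parameters combine, the following sketch builds the run-start URL and keeps waitForFinish inside the 0-60 range given above. run_url is a hypothetical helper name, and clamping out-of-range values locally is a choice made here, not documented API behaviour.

```shell
# Hypothetical helper: build the URL for starting a run.
#   $1 = Actor in username~name form
#   $2 = waitForFinish in seconds (clamped to 0-60)
#   $3 = optional extra query string, e.g. "timeout=300&memory=4096"
run_url() {
  local actor="$1" wait_secs="${2:-0}" extra="${3:-}"
  if [ "$wait_secs" -gt 60 ]; then wait_secs=60; fi
  if [ "$wait_secs" -lt 0 ]; then wait_secs=0; fi
  printf 'https://api.apify.com/v2/acts/%s/runs?waitForFinish=%s%s\n' \
    "$actor" "$wait_secs" "${extra:+&$extra}"
}

run_url "apify~web-scraper" 90 "timeout=300&memory=4096"
# prints https://api.apify.com/v2/acts/apify~web-scraper/runs?waitForFinish=60&timeout=300&memory=4096
```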

4. Poll run status

curl -s "https://api.apify.com/v2/actor-runs/RUN_ID?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {status, defaultDatasetId}'

Terminal statuses: SUCCEEDED, FAILED, ABORTED, TIMED-OUT.
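
A polling script only needs to know whether a status is in that list. One possible way to express the check, with is_terminal_status as a hypothetical function name:

```shell
# Hypothetical helper: succeeds (exit 0) only for the four terminal statuses listed above.
is_terminal_status() {
  case "$1" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) return 0 ;;
    *) return 1 ;;
  esac
}

# Any other value (for example RUNNING) is treated as "keep polling".
if is_terminal_status "SUCCEEDED"; then
  echo "run finished"
fi
```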

5. Get results

Dataset items (most common — structured scraped data):

curl -s "https://api.apify.com/v2/datasets/DATASET_ID/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Or directly from the run (shortcut — same parameters):

curl -s "https://api.apify.com/v2/actor-runs/RUN_ID/dataset/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Params: format (json|csv|jsonl|xml|xlsx|rss), fields, omit, limit, offset, clean, desc.
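
For datasets larger than one page, limit and offset can be combined into a simple paging loop. The sketch below requests format=jsonl so that each item is one line and a page can be counted with wc, which avoids a jq dependency. fetch_all_items is a hypothetical helper, the default page size of 1000 is an arbitrary choice, and error responses are not detected.

```shell
# Hypothetical helper: print every item of a dataset as JSON Lines, page by page.
#   $1 = dataset ID, $2 = page size (defaults to 1000 here; an arbitrary choice)
fetch_all_items() {
  local dataset_id="$1" limit="${2:-1000}" offset=0 page count
  while :; do
    page=$(curl -s "https://api.apify.com/v2/datasets/${dataset_id}/items?format=jsonl&clean=true&limit=${limit}&offset=${offset}" \
      -H "Authorization: Bearer $APIFY_TOKEN") || return 1
    # An empty page means the previous page was the last one.
    if [ -z "$page" ]; then break; fi
    printf '%s\n' "$page"
    count=$(printf '%s\n' "$page" | wc -l | tr -d ' ')
    # A short page means there is nothing left to fetch.
    if [ "$count" -lt "$limit" ]; then break; fi
    offset=$((offset + limit))
  done
}

# Example (performs network calls, so it is left commented out):
# fetch_all_items DATASET_ID 500 > items.jsonl
```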

Key-value store record (screenshots, HTML, OUTPUT):

curl -s "https://api.apify.com/v2/key-value-stores/STORE_ID/records/OUTPUT" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Run log:

curl -s "https://api.apify.com/v2/logs/RUN_ID" \
  -H "Authorization: Bearer $APIFY_TOKEN"

6. Run Actor synchronously (short-running Actors only)

For Actors that finish within 300 seconds, get dataset items in one call:

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":5}'

Returns the dataset items array directly (not wrapped in data). Returns 408 if the run exceeds 300s.
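
Since a 408 body is not a dataset, it helps to inspect the HTTP status before treating the response as items. The following is a sketch only: run_sync_items and its exit-code convention are hypothetical, and it relies on curl's -o and -w '%{http_code}' options to separate the body from the status code.

```shell
# Hypothetical helper: run an Actor synchronously and save the items to a file.
#   $1 = Actor (username~name), $2 = JSON input, $3 = output file
# Exit codes (this sketch's own convention): 0 = success, 2 = HTTP 408, 1 = any other error.
run_sync_items() {
  local actor="$1" input="$2" out="$3" code
  code=$(curl -s -o "$out" -w '%{http_code}' -X POST \
    "https://api.apify.com/v2/acts/${actor}/run-sync-get-dataset-items?timeout=120" \
    -H "Authorization: Bearer $APIFY_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$input")
  case "$code" in
    2??) return 0 ;;
    408) echo "Sync run timed out; use the async workflow instead." >&2; return 2 ;;
    *)   echo "Request failed with HTTP $code; see $out" >&2; return 1 ;;
  esac
}

# Example (performs a network call, so it is left commented out):
# run_sync_items "apify~web-scraper" '{"startUrls":[{"url":"https://example.com"}]}' items.json
```

A caller can branch on exit code 2 to restart the same input through the async endpoints.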

Alternative: /run-sync returns the KVS OUTPUT record instead of dataset items.

Quick recipes

Scrape a website

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":20}'

Google search

curl -s -X POST "https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"queries":"site:example.com openai","maxPagesPerQuery":1}'

Long-running Actor (async with polling)

# 1. Start
RUN=$(curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":500}')
RUN_ID=$(echo "$RUN" | jq -r '.data.id')

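# 2. Poll until done (waitForFinish=60 makes each request block for up to 60s)
while true; do
  STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?waitForFinish=60" \
    -H "Authorization: Bearer $APIFY_TOKEN" | jq -r '.data.status // empty')
  echo "Status: ${STATUS:-unknown}"
  case "$STATUS" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) break;;
    "") sleep 5;;  # request failed or returned an error body; pause before retrying
  esac
done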

# 3. Fetch results
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?clean=true" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Abort a run

curl -s -X POST "https://api.apify.com/v2/actor-runs/RUN_ID/abort" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Paid / rental Actors

Some Actors require a monthly subscription before they can be run. If the API returns a permissions or payment error for an Actor, ask the user to manually subscribe via the Apify Console:

https://console.apify.com/actors/ACTOR_ID

Replace ACTOR_ID with the Actor's ID (e.g. AhEsMsQyLfHyMLaxz). The user needs to click Start on that page to activate the subscription. Most rental Actors offer a free trial period set by the developer.

You can get the Actor ID from the store search response (data.items[].id) or from GET /v2/acts/username~name (data.id).

Error handling

  • 401: APIFY_TOKEN missing or invalid.
  • 404 Actor not found: check username~name format (tilde, not slash). Browse https://apify.com/store.
  • 400 run-failed: check GET /v2/logs/RUN_ID for details.
  • 402/403 payment required: the Actor likely requires a subscription. See "Paid / rental Actors" above.
  • 408 run-timeout-exceeded: sync endpoints have a 300s limit. Use async workflow instead.
  • 429 rate-limit-exceeded: retry with exponential backoff (start at 500ms, double each time).
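
The 429 advice above can be sketched as a retry loop that starts at 500 ms and doubles the delay after each rate-limited attempt. apify_retry and the APIFY_MAX_ATTEMPTS variable are hypothetical names introduced for illustration, the default of five attempts is arbitrary, and fractional sleep values assume a sleep implementation that accepts them (GNU coreutils does).

```shell
# Hypothetical helper: send a request, retrying on HTTP 429 with exponential backoff.
#   $1 = file for the response body; remaining arguments are passed to curl.
# Prints the final HTTP status code; returns 1 if still rate-limited after the last attempt.
apify_retry() {
  local out="$1" delay_ms=500 attempt=1 max_attempts="${APIFY_MAX_ATTEMPTS:-5}" code
  shift
  while :; do
    code=$(curl -s -o "$out" -w '%{http_code}' \
      -H "Authorization: Bearer $APIFY_TOKEN" "$@")
    if [ "$code" != "429" ]; then
      echo "$code"
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "$code"
      return 1
    fi
    # Convert milliseconds to a seconds.millis string, e.g. 500 -> 0.500
    sleep "$((delay_ms / 1000)).$(printf '%03d' $((delay_ms % 1000)))"
    delay_ms=$((delay_ms * 2))
    attempt=$((attempt + 1))
  done
}

# Example (performs a network call, so it is left commented out):
# apify_retry result.json "https://api.apify.com/v2/actor-runs/RUN_ID"
```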
