Install
openclaw skills install @futurizerush/apify-google-newsThis skill should be used when the user asks to "scrape Google News", "get news articles", "search for news", "extract news data", "monitor news topics", "get latest headlines", or needs to collect news articles from Google News.
openclaw skills install @futurizerush/apify-google-newsSearch and extract news articles from Google News with full article content, enriched descriptions, and metadata. Supports region and language filtering.
Actor: futurizerush/google-news-scraper
Set APIFY_API_TOKEN in environment. Get a token at console.apify.com/account/integrations.
Apify runs are asynchronous. Every request follows 3 steps:
SUCCEEDEDTypical run time: 30-90 seconds depending on query count and article enrichment.
| Parameter | Type | Required | Description |
|---|---|---|---|
searchQueries | array of strings | Yes | Search queries (e.g. ["AI"], ["climate change"]) |
region | string | No | Region code. Default: "us". Examples: "us", "tw", "jp" |
language | string | No | Language code. Default: "en". Examples: "en", "zh-TW", "ja" |
dateFilter | string | No | Time range: "1h", "1d", "1w", "1m", or "" (any time). Default: "" |
maxResults | integer | No | Max articles per query. Default: 20. Min: 10 |
import requests, os, time
TOKEN = os.environ["APIFY_API_TOKEN"]
BASE = "https://api.apify.com/v2"
# Step 1: Start the run
response = requests.post(
f"{BASE}/acts/futurizerush~google-news-scraper/runs?token={TOKEN}",
json={
"searchQueries": ["AI"],
"region": "us",
"language": "en",
"dateFilter": "1d",
"maxResults": 10,
},
)
response.raise_for_status()
run = response.json()["data"]
run_id = run["id"]
dataset_id = run["defaultDatasetId"]
# Step 2: Poll until done
while True:
status = requests.get(
f"{BASE}/actor-runs/{run_id}?token={TOKEN}"
).json()["data"]["status"]
if status == "SUCCEEDED":
break
if status in ("FAILED", "ABORTED", "TIMED-OUT"):
raise RuntimeError(f"Run failed: {status}")
time.sleep(5)
# Step 3: Fetch results (JSON array)
items = requests.get(
f"{BASE}/datasets/{dataset_id}/items?token={TOKEN}"
).json()
for article in items:
print(f"[{article['source']}] {article['title']}")
print(f" URL: {article['articleUrl']}")
print(f" Published: {article['pubDate']}")
if article.get("enrichedDescription"):
print(f" Summary: {article['enrichedDescription'][:100]}")
requests.post(
f"{BASE}/acts/futurizerush~google-news-scraper/runs?token={TOKEN}",
json={
"searchQueries": ["台灣"],
"region": "tw",
"language": "zh-TW",
"dateFilter": "1d",
"maxResults": 10,
},
)
requests.post(
f"{BASE}/acts/futurizerush~google-news-scraper/runs?token={TOKEN}",
json={
"searchQueries": ["AI", "climate", "crypto"],
"region": "us",
"dateFilter": "1w",
"maxResults": 10,
},
)
# Step 1: Start the run
RUN_RESPONSE=$(curl -s -X POST \
"https://api.apify.com/v2/acts/futurizerush~google-news-scraper/runs?token=$APIFY_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"searchQueries": ["AI"], "region": "us", "language": "en", "dateFilter": "1d", "maxResults": 10}')
RUN_ID=$(echo "$RUN_RESPONSE" | jq -r '.data.id')
DATASET_ID=$(echo "$RUN_RESPONSE" | jq -r '.data.defaultDatasetId')
# Step 2: Poll until done
while true; do
STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?token=$APIFY_API_TOKEN" \
| jq -r '.data.status')
[ "$STATUS" = "SUCCEEDED" ] && break
[ "$STATUS" = "FAILED" ] || [ "$STATUS" = "ABORTED" ] && echo "Failed: $STATUS" && exit 1
sleep 5
done
# Step 3: Fetch results
curl -s "https://api.apify.com/v2/datasets/$DATASET_ID/items?token=$APIFY_API_TOKEN" | jq '.'
Each item in the results array (field names verified from real API output on 2026-04-11):
{
"title": "Vance, Bessent questioned tech giants on AI security...",
"articleUrl": "https://www.cnbc.com/2026/04/10/...",
"googleNewsUrl": "https://news.google.com/rss/articles/...",
"pubDate": "Fri, 10 Apr 2026 20:06:08 GMT",
"timestamp": "2026-04-10T20:06:08.000Z",
"source": "CNBC",
"websiteName": "CNBC",
"websiteUrl": "https://www.cnbc.com",
"imageUrl": "https://image.cnbcfm.com/...",
"description": "Raw RSS description with related headlines...",
"enrichedDescription": "Bessent and Fed Chair Jerome Powell separately met with...",
"excerpt": "Bessent and Fed Chair Jerome Powell separately met with...",
"articleContent": {
"content": "Full article text (truncated to ~2000 chars)...",
"characterCount": 2000,
"tokenCount": 325
},
"enrichmentTime": 9848,
"guid": "unique-article-id",
"searchQuery": "AI",
"region": "us",
"language": "en",
"scrapedAt": "2026-04-11T06:06:59.618Z"
}
Note: Field names use camelCase. The articleContent object contains the full article text (up to ~2000 characters), character count, and token count. Use enrichedDescription or excerpt for summaries.
| Error | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing API token | Check APIFY_API_TOKEN |
invalid-input: "must be >= 10" | maxResults below minimum | Set maxResults to at least 10 |
| No results | Query too specific or region has no news | Broaden the query or try a different region |
dateFilter: "1h" for real-time news monitoring and alerting.dateFilter: "1d" for daily news digests.articleContent.content provides the full article text (up to ~2000 chars) -- useful for summarization.enrichedDescription is a cleaner summary than description (which contains raw RSS data with related headlines).timestamp is ISO 8601 format, easier to parse than pubDate.