RSS Aggregator

v1.0.0

Monitor, filter, and summarize RSS/Atom feeds on a schedule. Use when: (1) tracking industry news or competitor blogs, (2) setting up keyword alerts across m...

Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the contents: SKILL.md and scripts/fetch_feeds.py implement feed fetching, filtering, summarization, scheduling recipes, and delivery to webhooks/third‑party services. Required artifacts (a small Python script and feedparser) are proportionate to the claimed purpose.
Instruction Scope
Instructions are focused on fetching feeds and routing summaries. They do direct output to external endpoints (Discord webhooks, Notion, generic webhooks), which is expected for this use case, but it means feed content will be sent to whatever endpoint you configure. The agent will also fetch arbitrary URLs you schedule — if untrusted feed URLs are supplied, this can be abused to probe internal services (an SSRF risk).
Install Mechanism
No install spec; instruction-only plus a small Python file. The only dependency is 'feedparser' suggested via pip, which is a normal, traceable Python package. No downloads from untrusted URLs or archive extraction are present.
Credentials
The skill requires no environment variables or credentials. Recipes reference webhooks and third‑party services (Discord, Notion) but those are delivered via scheduling payloads or separate integrations rather than baked into the skill—this is proportionate. There are no requests for unrelated secrets.
Persistence & Privilege
always: false, and no install-time modifications to other skills or system-wide settings. disable-model-invocation is left at its default (the agent may invoke the skill autonomously), which is expected; there are no elevated persistence requests.
Assessment
This skill appears to do exactly what it says, but take these precautions before installing:

  • Only configure trusted webhook URLs and third‑party integrations (Discord, Notion), because feed contents will be posted to them. Do not paste production tokens into example code; store them securely in your scheduler/integration settings.
  • Be cautious about which feed URLs you schedule — the agent will fetch them and could be used to probe internal-only endpoints (SSRF) if run in an environment with internal network access. Limit scheduled feeds to known, public sources.
  • Review any cron/schedule payloads you add so they don't include sensitive data in plain text.
  • For stricter control, disallow autonomous runs or run the skill in an isolated environment.

If you want, I can show how to adapt the script to validate feed hostnames and to avoid posting to unknown endpoints.
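One way to mitigate the SSRF concern is to gate scheduled fetches behind an explicit allowlist of feed hosts. A minimal sketch — `ALLOWED_HOSTS` and `is_allowed_feed` are illustrative names, not part of the shipped skill:

```python
from urllib.parse import urlparse

# Hypothetical allowlist; populate with the feeds you actually schedule.
ALLOWED_HOSTS = {"news.ycombinator.com", "techcrunch.com", "www.theverge.com"}

def is_allowed_feed(url):
    """Reject non-HTTP(S) schemes and hosts outside the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    return parsed.hostname in ALLOWED_HOSTS
```

Calling this check at the top of fetch_feed() would make the script refuse `file://` URLs, cloud metadata endpoints, and any host you have not explicitly approved.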

Like a lobster shell, security has layers — review code before you run it.

Tags: automation · feed · latest · monitoring · news
60 downloads · 0 stars · 1 version · Updated 1w ago · v1.0.0 · MIT-0

RSS Aggregator

Monitor RSS/Atom feeds on a schedule, filter by keywords or date, and route summaries to your preferred channel.

Setup

Requires the feedparser Python package:

pip install feedparser

Core Script

Save as scripts/fetch_feeds.py:

#!/usr/bin/env python3
"""RSS/Atom feed fetcher with filtering and summarization."""
import feedparser
import sys
import json
from datetime import datetime, timedelta
from pathlib import Path

def parse_date(entry):
    """Extract publication date from entry."""
    for field in ('published_parsed', 'updated_parsed', 'created_parsed'):
        if hasattr(entry, field) and entry.get(field):
            return datetime(*entry[field][:6])
    return None

def fetch_feed(url, max_age_days=None, keyword_filter=None):
    """Fetch and filter feed entries."""
    feed = feedparser.parse(url)
    entries = feed.entries

    # Filter by age
    if max_age_days:
        cutoff = datetime.now() - timedelta(days=max_age_days)
        entries = [e for e in entries if parse_date(e) and parse_date(e) >= cutoff]

    # Filter by keyword ("foo OR bar" matches entries containing any alternative,
    # which is the syntax the keyword-alert recipe below relies on)
    if keyword_filter:
        keywords = [k.strip().lower() for k in keyword_filter.split(' OR ')]
        entries = [
            e for e in entries
            if any(kw in (e.get('title', '') + ' ' + e.get('summary', '')).lower()
                   for kw in keywords)
        ]

    return {
        'title': feed.feed.get('title', url),
        'url': url,
        'entries': [
            {
                'title': e.get('title', 'No title'),
                'link': e.get('link', ''),
                'published': parse_date(e).isoformat() if parse_date(e) else None,
                'summary': e.get('summary', e.get('description', ''))[:500]
            }
            for e in entries
        ]
    }

if __name__ == '__main__':
    url = sys.argv[1] if len(sys.argv) > 1 else ''
    max_age = int(sys.argv[2]) if len(sys.argv) > 2 else None
    keyword = sys.argv[3] if len(sys.argv) > 3 else None

    if not url:
        print(json.dumps({'error': 'URL required'}))
        sys.exit(1)

    result = fetch_feed(url, max_age, keyword)
    print(json.dumps(result, indent=2))

Recipes

Recipe 1: Daily News Digest

cron_add(
  name="Tech news digest",
  schedule={"kind": "cron", "expr": "0 8 * * 1-5", "tz": "Africa/Johannesburg"},
  payload={
    "kind": "agentTurn",
    "message": "Run: python scripts/fetch_feeds.py https://news.ycombinator.com/rss 7. Then summarize the top 5 stories as a clean bullet list with titles and links."
  },
  delivery={"mode": "announce"},
  sessionTarget="isolated"
)

Recipe 2: Multi-Feed Monitoring

First, create scripts/multi_fetch.py:

#!/usr/bin/env python3
"""Fetch several feeds at once using fetch_feed."""
import json
from fetch_feeds import fetch_feed  # same directory; run as `python scripts/multi_fetch.py`

feeds = [
    "https://techcrunch.com/feed/",
    "https://www.theverge.com/rss/index.xml",
    "https://feeds.feedburner.com/TechCrunch/"
]

results = [fetch_feed(url, max_age_days=1) for url in feeds]
print(json.dumps(results, indent=2))

Then schedule:

cron_add(
  name="Industry pulse",
  schedule={"kind": "cron", "expr": "0 */6 * * *", "tz": "UTC"},
  payload={
    "kind": "agentTurn",
    "message": "Run: python scripts/multi_fetch.py. Filter entries from last 6 hours. Post new articles to #news channel on Discord with title + link."
  },
  delivery={"mode": "announce"},
  sessionTarget="isolated"
)

Recipe 3: Keyword Alert

cron_add(
  name="AI keyword alert",
  schedule={"kind": "cron", "expr": "0 */4 * * *", "tz": "UTC"},
  payload={
    "kind": "agentTurn",
    "message": "Run: python scripts/fetch_feeds.py https://feeds.feedburner.com/venturebeat/Settings 1 \"AI OR machine learning OR LLM\". If results have entries, format as: **Alert** [Article Title](URL). Send to Discord #alerts channel."
  },
  delivery={"mode": "webhook", "to": "https://discord.com/api/webhooks/..."},
  sessionTarget="isolated"
)

Recipe 4: Feed Status Health Check

cron_add(
  name="Feed health check",
  schedule={"kind": "cron", "expr": "0 9 * * *", "tz": "UTC"},
  payload={
    "kind": "agentTurn",
    "message": "Check if these feeds are still live: Hacker News (https://news.ycombinator.com/rss), TechCrunch (https://techcrunch.com/feed/). Run fetch without filters. If any feed returns 0 entries or error, alert via webhook."
  },
  delivery={"mode": "announce"},
  sessionTarget="isolated",
  failureAlert={"after": 3, "mode": "announce", "cooldownMs": 86400000}
)

Recipe 5: Feed to Read Later (Notion)

cron_add(
  name="RSS to Notion",
  schedule={"kind": "cron", "expr": "0 7 * * *", "tz": "Africa/Johannesburg"},
  payload={
    "kind": "agentTurn",
    "message": "Run: python scripts/fetch_feeds.py https://example.com/rss 1. Create Notion page for each entry in your Reading List database with title, link, and summary as page content."
  },
  delivery={"mode": "none"},
  sessionTarget="isolated"
)

Managing Feeds

# Test a feed directly
python scripts/fetch_feeds.py <feed-url> [max-age-days] [keyword-filter]

# Example
python scripts/fetch_feeds.py https://news.ycombinator.com/rss 7
python scripts/fetch_feeds.py https://techcrunch.com/feed/ 1 "AI"

Feed Discovery

Find RSS feeds on any website by:

  • Adding /feed or /rss to the URL
  • Checking the page source for <link rel="alternate" type="application/rss+xml">
  • Searching Google for the site name plus "RSS feed"
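The page-source check above can be automated with the standard library. A sketch, assuming the page HTML is already fetched — `FeedLinkParser` and `discover_feeds` are illustrative names:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class FeedLinkParser(HTMLParser):
    """Collect <link rel="alternate"> RSS/Atom URLs from a page's HTML."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        # Matches both application/rss+xml and application/atom+xml
        if a.get("rel") == "alternate" and "xml" in (a.get("type") or ""):
            self.feeds.append(urljoin(self.base_url, a.get("href", "")))

def discover_feeds(html, base_url):
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

Relative hrefs are resolved against the page URL, so `/feed.xml` on example.com comes back as a full URL you can schedule directly.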

Common feed URLs:

  • YouTube: https://www.youtube.com/feeds/videos.xml?channel_id=CHANNEL_ID
  • Twitter/X: No native RSS — use a Nitter instance (e.g. for Twitter lists)
  • Reddit: https://www.reddit.com/r/SUBREDDIT.rss (requires auth for full content)

Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| Empty entries list | Feed may require auth or block scrapers | Inspect the raw feed with curl |
| Unicode/decode errors | Feed served with a wrong or missing encoding declaration | Fetch the raw bytes yourself, decode them explicitly, then pass the string to feedparser.parse() |
| Old entries only | max_age_days too restrictive | Increase or remove the filter |
| Missing summaries | Feed puts full text in content: instead of summary: | Use e.get('content', [{}])[0].get('value', '') |

See Also

  • fuzzy-cron-scheduler skill — scheduling recurring feed checks
  • notion-integration skill — storing articles in Notion
  • discord skill — routing articles to Discord channels
  • webhook-automation skill — HTTP delivery to any endpoint
