Spotify News Digest Skill
Overview
This skill automatically aggregates Spotify-related news from 9+ sources (official Spotify blogs + media + community), deduplicates articles, ranks them by relevance, and delivers a formatted Chinese digest with a one-sentence summary and the original link for each article.
Key Features:
- 🎵 Covers Spotify's full content ecosystem (Engineering, Newsroom, Research, Design)
- 📰 Pulls media coverage from TechCrunch, The Verge, Music Business Worldwide, Forbes
- 💬 Includes Hacker News community discussions via Algolia API
- 🔍 DDG News search fallback ensures coverage even when RSS is restricted
- 🧹 Title-similarity deduplication (threshold: 0.65)
- 📊 Multi-factor ranking: source authority + recency + community score
- 🤖 One-sentence Chinese summaries generated by LLM at render time
- ⏰ Configurable time range (hours or days)
⚠️ Security Notes
Read this before scheduling or running in any environment with internal network access.
- TLS verification is enabled. The fetcher uses verify=True on all HTTP requests. Do not add any ssl._create_default_https_context overrides.
- Domain allowlist enforced for search results. DDG News results are filtered through ALLOWED_DDG_DOMAINS (defined at the top of scripts/fetch_spotify_news.py). Only well-known public news domains pass; internal hostnames are rejected. Review and edit that list before running in sensitive environments.
- RSS sources are explicit and fixed. The base sources in config/sources.json are official Spotify feeds and a few trusted media outlets. Do not add intranet or metadata-service URLs to sources.json.
- Run in isolation when scheduling. If you plan to schedule this skill, run it in a container or VM that has no access to internal services or secrets. Otherwise, a DDG result that slips past the allowlist could make the fetcher send a request into your internal network.
- Audit pip dependencies before installing in production: feedparser, beautifulsoup4, requests, python-dateutil, ddgs.
- Cron delivery scope. When you ask OpenClaw to schedule this skill, it creates an isolated agentTurn cron job. Before confirming the job, check that the target channel is the group or recipient you intend.
Quick Start
One-Time Digest (Last 24 Hours)
cd /projects/.openclaw/skills/spotify-news-digest
python3 scripts/generate_digest.py
Last 7 Days (Broader Coverage)
python3 scripts/generate_digest.py --hours 168
Save to Markdown File
python3 scripts/generate_digest.py --hours 24 --output /tmp/spotify_digest.md
News Sources
| Source | Type | URL / Method | Category |
|---|---|---|---|
| Spotify Engineering Blog | RSS | engineering.atspotify.com/feed/ | official |
| Spotify Newsroom | RSS | newsroom.spotify.com/feed/ | official |
| Spotify Research | RSS | research.atspotify.com/feed/ | research |
| Spotify Design | RSS | spotify.design/feed | design |
| TechCrunch Spotify | RSS | techcrunch.com/tag/spotify/feed/ | media |
| The Verge (filtered) | RSS | Verge full feed + keyword filter | media |
| Hacker News Spotify | Algolia API | hn.algolia.com query=spotify | community |
| DDG News Search | DDGS API | 5 queries × 8 results | media/official |
Note: Music Business Worldwide and Billboard RSS feeds have high latency and are disabled by default (_disabled: true in sources.json). Their articles are captured via DDG News Search instead.
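The Hacker News row relies on the public Algolia HN Search API. As a rough illustration of what such a request could look like, the sketch below only builds the query URL and makes no network call. The endpoint and the `tags`, `numericFilters`, and `hitsPerPage` parameters belong to the Algolia API, but the function name, defaults, and the exact parameters the skill sends are assumptions, not taken from the skill's source.

```python
import time
from urllib.parse import urlencode

# Public Algolia HN Search endpoint that returns results sorted by date.
HN_SEARCH_URL = "https://hn.algolia.com/api/v1/search_by_date"


def build_hn_query_url(query="spotify", hours=24, hits_per_page=20, now=None):
    """Build a search URL for stories created within the last `hours` hours.

    Hypothetical helper: the skill's real query parameters may differ.
    """
    now = int(time.time()) if now is None else int(now)
    cutoff = now - hours * 3600  # Unix timestamp of the oldest story to accept
    params = {
        "query": query,
        "tags": "story",
        "numericFilters": f"created_at_i>{cutoff}",
        "hitsPerPage": hits_per_page,
    }
    return f"{HN_SEARCH_URL}?{urlencode(params)}"


url = build_hn_query_url(hours=24, now=1_700_000_000)
print(url)
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to check the time window without touching the network.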
Output Format
Each digest is grouped by category with one-sentence Chinese summaries:
🎵 Spotify 新闻日报 · YYYY-MM-DD
共 N 条(去重后)
─────────────────────────────
🎵 官方动态(N 条)
· [一句话中文总结](Source Name)
🔗 https://...
📰 媒体报道(N 条)
· [一句话中文总结](Source Name)
🔗 https://...
🔬 技术研究(N 条)
· ...
─────────────────────────────
🤖 由 OpenClaw · spotify-news-digest 自动生成
Category Labels
| Category Key | Display Label |
|---|---|
| official | 🎵 官方动态 |
| research | 🔬 技术研究 |
| design | 🎨 产品设计 |
| media | 📰 媒体报道 |
| community | 💬 社区讨论 |
| industry | 🏭 行业资讯 |
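To make the mapping from category key to display label concrete, here is a minimal rendering sketch for a single category block. It is not the skill's `format_digest()`; the function name and the `source` and `link` item keys are assumptions for illustration, while `title` and `zh_summary` follow the keys used elsewhere in this document.

```python
# Labels copied from the Category Labels table.
CATEGORY_LABELS = {
    "official": "🎵 官方动态",
    "research": "🔬 技术研究",
    "design": "🎨 产品设计",
    "media": "📰 媒体报道",
    "community": "💬 社区讨论",
    "industry": "🏭 行业资讯",
}


def render_category(category, items):
    """Render one category block (hypothetical helper, not format_digest())."""
    label = CATEGORY_LABELS.get(category, category)
    lines = [f"{label}({len(items)} 条)"]
    for item in items:
        # Fall back to the bracketed original title when no summary is set.
        text = item.get("zh_summary") or f"[{item['title']}]"
        lines.append(f"· {text} ({item['source']})")  # 'source' key is assumed
        lines.append(f"🔗 {item['link']}")            # 'link' key is assumed
    return "\n".join(lines)


sample = [{"title": "Spotify ships X", "source": "TechCrunch",
           "link": "https://example.com/a"}]
print(render_category("media", sample))
```

An unknown category key falls back to the raw key, so a new category in sources.json would still render, just without an emoji label.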
Usage Examples
1. As Python Module
import sys
sys.path.insert(0, '/projects/.openclaw/skills/spotify-news-digest/scripts')
from fetch_spotify_news import SpotifyNewsFetcher
from process_spotify_news import SpotifyNewsProcessor, format_digest
# Fetch articles from the last 48 hours
articles = SpotifyNewsFetcher().fetch_all(hours=48)
print(f"Fetched {len(articles)} articles")
# Deduplicate, score, and group by category
result = SpotifyNewsProcessor().process(articles, max_output=20)
# Render Markdown digest
md = format_digest(result, date_str='2026-03-17')
print(md)
2. LLM-Enhanced Summaries (Recommended)
The format_digest() function outputs [English Title] placeholders when no zh_summary is set on an article. The calling LLM should:
- Run generate_digest.py to get the raw structured result
- For each article, read title + summary and generate a one-sentence Chinese summary
- Set item['zh_summary'] before calling format_digest()
This two-step flow keeps the scraping fast and the summarization accurate.
# Example: LLM fills zh_summary before formatting
# (llm_summarize_zh is a placeholder for the caller's own summarizer)
for cat_items in result.values():
    for item in cat_items:
        item['zh_summary'] = llm_summarize_zh(item['title'], item['summary'])
digest = format_digest(result)
3. Scheduled Daily Digest (Cron via OpenClaw)
Ask OpenClaw to set up a daily cron job:
"每天上午 10 点发一份 Spotify 新闻日报到 [群/频道]"
(In English: "Send a Spotify news digest to [group/channel] every day at 10 a.m.")
OpenClaw will create an isolated agentTurn cron job that:
- Runs this skill
- Generates Chinese summaries with LLM
- Posts the digest to the specified channel
Configuration
Add / Remove Sources
Edit config/sources.json:
{
  "sources": [
    {
      "name": "My Custom Source",
      "type": "rss",
      "url": "https://example.com/feed/",
      "language": "en",
      "category": "media",
      "keyword_filter": "spotify"
    }
  ],
  "settings": {
    "max_news_per_source": 15,
    "final_output_count": 20,
    "similarity_threshold": 0.65,
    "timeout": 12,
    "keyword_filter_default": "spotify"
  }
}
keyword_filter: When set, only articles containing this keyword (case-insensitive) in title or summary are included. Leave empty for official Spotify feeds that are already Spotify-only.
_disabled: true: Mark a source to skip it at runtime without deleting it.
Security: Only add public news domains to sources.json. Do not add intranet, VPN, or cloud metadata URLs: RSS sources are fetched directly and are not filtered by the domain allowlist.
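The two source-level switches described above, keyword_filter and _disabled, can be pictured with a small sketch. The helper names below are hypothetical and the fetcher's real implementation may differ; only the field names and the case-insensitive title-or-summary match come from this document.

```python
def active_sources(sources):
    """Return the sources that are not flagged with "_disabled": true."""
    return [s for s in sources if not s.get("_disabled", False)]


def matches_keyword(article, keyword):
    """Case-insensitive match on title or summary.

    An empty or missing keyword accepts every article, which suits
    official feeds that are already Spotify-only.
    """
    if not keyword:
        return True
    haystack = f"{article.get('title', '')} {article.get('summary', '')}".lower()
    return keyword.lower() in haystack


sources = [
    {"name": "My Custom Source", "keyword_filter": "spotify"},
    {"name": "Slow Feed", "_disabled": True},
]
articles = [
    {"title": "SPOTIFY raises prices", "summary": ""},
    {"title": "Unrelated gadget review", "summary": "A phone."},
]
kept = [a for a in articles if matches_keyword(a, "spotify")]
print([s["name"] for s in active_sources(sources)], len(kept))
```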
Extend the DDG Domain Allowlist
If you add sources that are reachable via DDG News search (not RSS), also add their domain to ALLOWED_DDG_DOMAINS near the top of scripts/fetch_spotify_news.py:
ALLOWED_DDG_DOMAINS: tuple = (
    'atspotify.com',
    'techcrunch.com',
    ...
    'your-new-domain.com',  # ← add here
)
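For readers wondering how an allowlist like this can reject look-alike or internal hosts, here is one possible check. It is a sketch under assumptions: the function name is hypothetical, the tuple is an illustrative subset, and the skill's own matching logic is not shown in this document. The approach is to compare the parsed hostname against each allowed domain, accepting only an exact match or a true subdomain.

```python
from urllib.parse import urlparse

# Illustrative subset only; the real list lives in scripts/fetch_spotify_news.py.
ALLOWED = ("atspotify.com", "techcrunch.com")


def is_allowed_url(url, allowed_domains=ALLOWED):
    """Accept http(s) URLs whose host is an allowed domain or one of its subdomains."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # The "." prefix prevents 'evilatspotify.com' from matching 'atspotify.com'.
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


print(is_allowed_url("https://engineering.atspotify.com/feed/"))
print(is_allowed_url("http://169.254.169.254/latest/meta-data/"))
```

A plain substring test (`'atspotify.com' in url`) would be weaker, since it also matches hosts such as `atspotify.com.attacker.example` or a query string containing the domain.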
Tune Deduplication
In scripts/process_spotify_news.py:
processor = SpotifyNewsProcessor(similarity_threshold=0.65)
# 0.5 = aggressive dedup | 0.8 = loose dedup
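To see why a lower threshold removes more articles, consider the following sketch of title-similarity deduplication. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; whether the skill uses the same measure, and whether it treats a score equal to the threshold as a duplicate, are assumptions here.

```python
from difflib import SequenceMatcher


def title_similarity(a, b):
    """Similarity ratio in [0, 1] between two titles, ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def dedupe_by_title(articles, threshold=0.65):
    """Keep an article only if it is below the threshold against every kept one.

    Assumption: a ratio at or above the threshold counts as a duplicate.
    """
    kept = []
    for article in articles:
        if all(title_similarity(article["title"], k["title"]) < threshold
               for k in kept):
            kept.append(article)
    return kept


articles = [
    {"title": "Spotify launches new AI DJ feature"},
    {"title": "Spotify launches new AI DJ feature today"},
    {"title": "12345 67890"},
]
unique = dedupe_by_title(articles)
print([a["title"] for a in unique])
```

The first article seen wins, so feeding articles in ranked order would keep the highest-ranked copy of each story.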
Tune Source Authority Weights
In process_spotify_news.py, edit source_weight:
source_weight = {
    'Spotify Engineering Blog': 90,
    'Spotify Newsroom': 80,
    'TechCrunch': 60,
    # Add your custom sources here
}
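The ranking combines source authority, recency, and community score. The sketch below shows one way those three factors could be summed; the three weights are copied from the excerpt above, but the default weight, the recency scale, the points cap, and the `source`, `published`, and `points` keys are all hypothetical and do not describe the skill's actual formula.

```python
from datetime import datetime, timedelta, timezone

SOURCE_WEIGHT = {
    "Spotify Engineering Blog": 90,
    "Spotify Newsroom": 80,
    "TechCrunch": 60,
}
DEFAULT_WEIGHT = 40  # hypothetical weight for sources not listed above


def score_article(article, now, window_hours=24):
    """Sum of authority, recency, and community components (illustrative scales)."""
    authority = SOURCE_WEIGHT.get(article["source"], DEFAULT_WEIGHT)
    age_hours = max((now - article["published"]).total_seconds() / 3600, 0)
    # Recency decays linearly from 50 to 0 across the time window (assumed scale).
    recency = max(0.0, 1.0 - age_hours / window_hours) * 50
    # Community points are capped at 200 and scaled to at most 30 (assumed scale).
    community = min(article.get("points", 0), 200) / 200 * 30
    return authority + recency + community


now = datetime(2026, 3, 17, 12, 0, tzinfo=timezone.utc)
fresh = {"source": "Spotify Engineering Blog", "published": now}
older = {"source": "TechCrunch", "published": now - timedelta(hours=24), "points": 200}
ranked = sorted([older, fresh], key=lambda a: score_article(a, now), reverse=True)
print([a["source"] for a in ranked])
```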
Command-Line Reference
python3 scripts/generate_digest.py [OPTIONS]
Options:
--hours N Time range in hours (default: 24)
--max N Max articles to output after dedup (default: 20)
--output PATH Save Markdown digest to file
python3 scripts/fetch_spotify_news.py [OPTIONS]
Options:
--hours N Fetch articles published within last N hours
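For orientation, the options listed for generate_digest.py map naturally onto `argparse`. The following is a sketch of how such a parser and the resulting time cutoff might be written, not the script's actual code; `build_parser`, `cutoff_time`, and the `max_output` destination name are assumptions.

```python
import argparse
from datetime import datetime, timedelta, timezone


def build_parser():
    """Parser mirroring the documented generate_digest.py options."""
    parser = argparse.ArgumentParser(prog="generate_digest.py")
    parser.add_argument("--hours", type=int, default=24,
                        help="Time range in hours")
    parser.add_argument("--max", type=int, default=20, dest="max_output",
                        help="Max articles to output after dedup")
    parser.add_argument("--output", default=None,
                        help="Save Markdown digest to file")
    return parser


def cutoff_time(hours, now=None):
    """Oldest publication time that still falls inside the window."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(hours=hours)


args = build_parser().parse_args(["--hours", "168", "--output", "/tmp/spotify_digest.md"])
print(args.hours, args.max_output, args.output)
```

With `--hours 168` the cutoff lands exactly seven days before the current time, which matches the "Last 7 Days" invocation in Quick Start.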
Troubleshooting
No articles fetched (0 条)
Most likely a network restriction on RSS endpoints.
# Test direct RSS access
curl -I https://engineering.atspotify.com/feed/
# Test DDG search fallback
python3 -c "
from ddgs import DDGS
with DDGS() as d:
    r = list(d.news('spotify new feature', max_results=3, timelimit='w'))
    print(r)
"
If RSS is blocked but DDG works, the skill will still return results via the search fallback. This is normal in restricted network environments.
ddgs not installed
pip3 install ddgs
duckduckgo-search (older package name) is also supported as a fallback.
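A package-name fallback of this kind is usually implemented by trying the imports in order. The sketch below shows the pattern with a hypothetical helper, `load_first_available`; it is an illustration of the idea, not the fetcher's actual import code. Both `ddgs` and `duckduckgo_search` expose a class named `DDGS`.

```python
import importlib


def load_first_available(candidates, attr):
    """Return `attr` from the first importable module in `candidates`, else None."""
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # not installed, try the next package name
        if hasattr(module, attr):
            return getattr(module, attr)
    return None


# Prefer the current package name, then the older one.
DDGS = load_first_available(("ddgs", "duckduckgo_search"), "DDGS")
if DDGS is None:
    print("Neither ddgs nor duckduckgo-search is installed; run: pip3 install ddgs")
```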
Slow fetch / timeout
Sources with slow RSS feeds are marked _disabled: true in sources.json. To disable an additional slow source:
{ "name": "...", "_disabled": true, ... }
Too many duplicate articles
Lower the similarity threshold:
SpotifyNewsProcessor(similarity_threshold=0.5)
Articles missing Chinese summary
The format_digest() function wraps untranslated titles in [brackets]. This is intentional: the LLM caller should fill zh_summary before rendering. See Usage Examples → LLM-Enhanced Summaries above.
File Structure
spotify-news-digest/
├── SKILL.md                      ← You are here
├── config/
│   └── sources.json              ← News source definitions & settings
├── scripts/
│   ├── fetch_spotify_news.py     ← Multi-source fetcher (RSS + DDG)
│   ├── process_spotify_news.py   ← Dedup, scoring, formatting
│   └── generate_digest.py        ← CLI entry point (fetch → process → print)
└── references/
    └── (reserved for future API reference docs)
Dependencies
pip3 install feedparser beautifulsoup4 requests python-dateutil ddgs
| Package | Purpose |
|---|---|
| feedparser | RSS feed parsing |
| beautifulsoup4 | HTML summary cleanup |
| requests | HTTP requests for RSS |
| python-dateutil | Robust date parsing |
| ddgs | DuckDuckGo News search fallback |
Python: 3.8+
Design Notes
- Why DDG News over site-specific scrapers? RSS feeds from Spotify blogs occasionally return 0 results due to network restrictions. DDG News search provides a reliable fallback that requires no authentication and respects timelimit for recency filtering.
- Why one-sentence Chinese summaries? The target audience (WeChat Work groups) prefers dense, scannable content. English titles are preserved in the source data so users can verify accuracy.
- Why not store summaries? Summaries are generated at render time by the calling LLM, keeping the skill stateless and reproducible.
Changelog
| Version | Date | Changes |
|---|---|---|
| 1.1.0 | 2026-07-14 | Security: remove SSL bypass, add domain allowlist for DDG results, explicit verify=True on all requests, updated security guidance |
| 1.0.0 | 2026-03-17 | Initial release: RSS + DDG, 9 sources, category grouping, Chinese summary support |