HN Podcast Archive

Automation

Automate podcast archiving by detecting new HN episodes from RSS, downloading audio, transcribing locally with Whisper, and generating markdown archives with...

Install

openclaw skills install hn-podcast-archive

HN Podcast Archive

Set up or maintain a repeatable pipeline that:

reads an RSS feed,
detects new episodes,
downloads audio,
transcribes with local Whisper,
writes a markdown archive per episode,
updates index/state files.
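
As a rough illustration of the detection step, the sketch below normalizes one feed entry into the metadata the archive keeps. It assumes entries shaped like the dict-like objects feedparser returns (with id, title, published, and an enclosures list whose items carry href and type). The function name entry_metadata and the output keys are hypothetical and are not taken from scripts/hn_podcast_archive.py.

```python
def entry_metadata(entry):
    """Return title, published date, audio URL and guid for one feed entry.

    `entry` is assumed to be dict-like, as feedparser entries are.
    Returns None when the entry carries no audio enclosure.
    """
    audio_url = None
    for enclosure in entry.get("enclosures", []):
        # Podcast audio is normally attached as an audio/* enclosure.
        if enclosure.get("type", "").startswith("audio/"):
            audio_url = enclosure.get("href")
            break
    if audio_url is None:
        return None
    return {
        "title": entry.get("title", ""),
        "published": entry.get("published", ""),
        "audio_url": audio_url,
        # Assumption: fall back to the audio URL when the feed omits a guid.
        "guid": entry.get("id") or audio_url,
    }
```

With feedparser installed, feedparser.parse(feed_url).entries yields entries that can be passed to this function one at a time.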

Workflow

Read references/layout.md to understand the expected archive layout and outputs.
Use scripts/hn_podcast_archive.py as the primary implementation.
Run python3 scripts/hn_podcast_archive.py --help to inspect options.
For first-time setup, ensure required binaries and Python modules exist.
For automation, schedule the script to run on a recurring cadence and always point it at the same output directory, so that state.json persists between runs.

Required runtime dependencies

The script expects:

ffmpeg in PATH
whisper in PATH
Python 3.10+
Python package feedparser

If any dependency is missing, report exactly what is missing in a clear setup note rather than proceeding as if the pipeline were ready to run.
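
A preflight check along these lines could produce that setup note. This is a minimal sketch using only the standard library; the function name missing_dependencies and the message wording are illustrative assumptions, not the script's actual behavior.

```python
import importlib.util
import shutil
import sys

REQUIRED_BINARIES = ("ffmpeg", "whisper")
REQUIRED_MODULES = ("feedparser",)
MIN_PYTHON = (3, 10)


def missing_dependencies(binaries=REQUIRED_BINARIES,
                         modules=REQUIRED_MODULES,
                         min_python=MIN_PYTHON):
    """Return a list of human-readable problems; empty means ready to run."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required")
    for name in binaries:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(name) is None:
            problems.append(f"binary not found in PATH: {name}")
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python module not importable: {name}")
    return problems
```

A caller could print each returned line and stop before doing any feed or download work when the list is non-empty.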

Recommended command

python3 skills/hn-podcast-archive/scripts/hn_podcast_archive.py \
  --feed-url "https://example.com/podcast.rss" \
  --output-dir ./data/hn-podcast-archive \
  --whisper-model turbo

Output expectations

For each ingested episode, create:

downloaded audio under audio/
transcript under transcripts/
markdown archive under episodes/

Keep these shared files current:

index.md
state.json
run-log.jsonl
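
To make the per-episode outputs concrete, the sketch below derives the three target paths from an output directory and an episode title. The slug-based file names and the .txt/.md/.mp3 extensions are assumptions for illustration; references/layout.md remains the authority on the actual layout.

```python
import re
from pathlib import Path


def slugify(title):
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "episode"


def episode_paths(output_dir, title, audio_ext=".mp3"):
    """Return hypothetical audio, transcript and markdown paths for one episode."""
    root = Path(output_dir)
    slug = slugify(title)
    return {
        "audio": root / "audio" / f"{slug}{audio_ext}",
        "transcript": root / "transcripts" / f"{slug}.txt",
        "episode": root / "episodes" / f"{slug}.md",
    }
```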

Automation guidance

For automation, prefer a cron/standing-order style trigger that runs every few hours. The script is idempotent at the episode level: it tracks processed GUIDs/URLs in state.json, so repeated runs skip episodes that were already ingested.
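
The episode-level skip can be pictured as a simple set lookup. The sketch below assumes a state.json shaped like {"processed": [...]} and entries that carry guid and audio_url keys; both the schema and the function names are hypothetical, since the real state format is defined by the script.

```python
import json
from pathlib import Path


def load_state(path):
    """Read state.json, or start with an empty state on the first run."""
    state_file = Path(path)
    if not state_file.exists():
        return {"processed": []}
    return json.loads(state_file.read_text(encoding="utf-8"))


def episode_key(entry):
    """Identify an episode by guid, falling back to its audio URL."""
    return entry.get("guid") or entry.get("audio_url")


def select_new(entries, state, force=False):
    """Keep only entries whose key is not yet recorded, unless forced."""
    seen = set(state.get("processed", []))
    return [e for e in entries if force or episode_key(e) not in seen]
```

Because the lookup depends only on the recorded keys, running the same feed twice in a row selects nothing the second time.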

Safe operating rules

Never overwrite unrelated archive content.
Skip already-processed episodes unless explicitly forced.
Preserve source metadata (title, published date, audio URL, guid).
If transcription fails after download, keep the audio and record the failure in the log/state.
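
The last rule can be sketched as a wrapper that catches a transcription error, appends a record to the JSONL run log, and never touches the downloaded audio. The record fields (at, event, audio, error) and the function names are assumptions; the transcribe argument stands in for whatever callable invokes Whisper.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_log(log_path, record):
    """Append one JSON object as a single line to the JSONL run log."""
    record = {"at": datetime.now(timezone.utc).isoformat(), **record}
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def transcribe_or_record(audio_path, transcribe, log_path):
    """Run transcribe(audio_path); on failure, log it and leave the audio alone."""
    try:
        text = transcribe(audio_path)
    except Exception as exc:  # any transcription error is recorded, not raised
        append_log(log_path, {"event": "transcription_failed",
                              "audio": str(audio_path),
                              "error": str(exc)})
        return None
    append_log(log_path, {"event": "transcribed", "audio": str(audio_path)})
    return text
```

A None return tells the caller to leave the episode unmarked in the state file so a later run can retry it.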

Customization points

Useful flags:

--limit N to ingest only recent items during testing
--force to reprocess already-seen items
--dry-run to inspect actions without writing outputs
--whisper-model to trade speed vs accuracy
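
For orientation, the flags named in this document could be declared with argparse roughly as follows. This is a hypothetical mirror of the interface, not the script's real parser: the types, defaults, and help strings are assumptions, and python3 scripts/hn_podcast_archive.py --help remains the source of truth.

```python
import argparse


def build_parser():
    """Build a parser covering only the flags mentioned in this document."""
    parser = argparse.ArgumentParser(
        description="Archive podcast episodes from an RSS feed.")
    parser.add_argument("--feed-url", required=True,
                        help="RSS feed to read")
    parser.add_argument("--output-dir", required=True,
                        help="archive root directory")
    # Assumed default, copied from the recommended command above.
    parser.add_argument("--whisper-model", default="turbo",
                        help="Whisper model name (speed vs accuracy)")
    parser.add_argument("--limit", type=int, default=None,
                        help="ingest only the N most recent items")
    parser.add_argument("--force", action="store_true",
                        help="reprocess already-seen items")
    parser.add_argument("--dry-run", action="store_true",
                        help="report actions without writing outputs")
    return parser
```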

Packaging/publishing

Package the skill from its folder. Publish with ClawHub only after local validation passes and authentication is available.
