Yum NoteBook

Yum NoteBook is a local-first NotebookLM alternative for AI agents, built for real source capture. Ingest a web URL, YouTube video, or screenshot, then generate (1) an AI summary, (2) a dual-host talk-show MP3 (edge-tts), and (3) a slide deck (image/table/bullet/flowchart). Optionally, upload the artifacts to OneDrive/Google Drive/S3/etc. and post a notification to Slack/Discord/Teams. Language and cloud destination are user-configurable; the default is English with male and female English hosts. Use when: yumnb, study notes, summarize this link, turn this video into notes, help me understand this article.

Install

openclaw skills install yumnb

Yum NoteBook (yumnb) — Source → Summary + Talk-show + Slides

A local-first NotebookLM alternative for AI agents, built for real source capture.

What It Does

Given a source (URL / YouTube / screenshot / raw text), yumnb creates one folder per request under <output_dir>/<YYYYMMDD-HHMM-slug>/ containing:

  1. source/ — raw material (downloaded HTML, transcript, screenshot copy …)
  2. summary.md — AI summary (one-liner / key points / facts / takeaways)
  3. talkshow.txt + talkshow.mp3 — dual-host script + MP3 (edge-tts)
  4. deck.pptx — slide deck (bullets, tables, flow, images, summary)
  5. links.json — record of what was generated and any share links

This “one request = one folder” model is intentional: many such folders can accumulate into a local knowledge base that can be read, searched, narrated, and presented later.
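As an illustration of the <YYYYMMDD-HHMM-slug> naming pattern, here is a minimal sketch of how such a folder name could be derived from a title and a timestamp. The helper names slugify and note_folder, the 40-character limit, and the "note" fallback are hypothetical; the actual slug rules used by yumnb are not documented here.

```python
import re
from datetime import datetime
from pathlib import Path


def slugify(title: str, max_len: int = 40) -> str:
    """Lowercase, replace runs of non-alphanumerics with '-', trim to max_len."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Fall back to a fixed word when nothing usable is left (assumption).
    return slug[:max_len].rstrip("-") or "note"


def note_folder(output_dir: str, title: str, now: datetime) -> Path:
    """Build <output_dir>/<YYYYMMDD-HHMM-slug> without touching the disk."""
    stamp = now.strftime("%Y%m%d-%H%M")
    return Path(output_dir) / f"{stamp}-{slugify(title)}"


folder = note_folder("notes", "How Transformers Work!", datetime(2025, 1, 31, 9, 5))
print(folder.name)  # 20250131-0905-how-transformers-work
```

Because the timestamp comes first, sorting folder names alphabetically also sorts them chronologically, which suits the accumulating knowledge-base use described above.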

If a webhook is configured, a notification is posted to Slack / Discord / Teams Workflow. If deliver.provider is configured, yumnb can also push the finished outputs directly to an IM/chat surface through OpenClaw / Hermes (Telegram / Discord / Teams / Slack / etc.).

yumnb can be positioned as a local-first, polite alternative to NotebookLM: by default it keeps notebooks and generated artifacts as ordinary local files, and it uploads or delivers them only if you explicitly configure that.

Two Ways to Run

A. Fully-automatic (built-in AI provider)

Configure ai.provider in config.yaml (openai / anthropic / gemini / ollama / cli) and run:

python -m yumnb auto "<URL or path>" [--title "short name"]

This runs ingest → AI summary → AI slide-plan → TTS → PPT → publish in one shot.

B. Step-by-step (agent-driven — you bring your own LLM)

If you're driving this from an agent CLI (GitHub Copilot CLI, Claude Code, Cursor, Aider, …), set ai.provider: none and call the subcommands individually. The agent reads the source, writes summary.md and deck.json itself, then asks yumnb to render TTS / PPT / publish.

# 1) Pull raw material
python -m yumnb ingest "<URL>" [--title "..."]
#    → prints the note folder path

# 2) (Agent writes <folder>/summary.md following the schema in README)

# 3) Render dual-host MP3 from a talkshow script the agent wrote
python -m yumnb tts "<folder>/talkshow.txt" --output "<folder>/talkshow.mp3"

# 4) Render PPT from a deck.json the agent wrote
python -m yumnb ppt "<folder>/deck.json" --output "<folder>/deck.pptx"

# 5) Finalize + optional webhook / IM delivery
python -m yumnb publish "<folder>"
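An agent written in Python might wrap steps 3 to 5 in a small driver. The sketch below only uses the subcommands and the --output flag shown above; the helper names render_commands and run_all are hypothetical, and it assumes the agent has already written talkshow.txt and deck.json into the folder printed by ingest.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def render_commands(folder: str) -> List[List[str]]:
    """Return the argv lists for steps 3-5, in order, for one note folder."""
    f = Path(folder)
    py = sys.executable  # reuse the interpreter that has yumnb installed
    return [
        [py, "-m", "yumnb", "tts", str(f / "talkshow.txt"),
         "--output", str(f / "talkshow.mp3")],
        [py, "-m", "yumnb", "ppt", str(f / "deck.json"),
         "--output", str(f / "deck.pptx")],
        [py, "-m", "yumnb", "publish", str(f)],
    ]


def run_all(folder: str) -> None:
    """Run each step in order; check=True raises on the first failure."""
    for argv in render_commands(folder):
        subprocess.run(argv, check=True)
```

Building the argument lists separately from running them makes the sequence easy to log or dry-run before anything is rendered.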

Schemas the Agent Writes

summary.md

# <title>

> **Source**: <url or file>
> **Type**: youtube|url|image|text
> **Length**: <duration or word count>

## 🎯 One-line summary

## 📌 Key points (3-5)

## 🔑 Facts / data

## 💡 Takeaways

## 🤔 Open questions

deck.json (rendered to deck.pptx)

{
  "title": "Deck title",
  "subtitle": "Source / date",
  "slides": [
    {"type": "title",      "title": "...", "subtitle": "..."},
    {"type": "bullets",    "title": "...", "bullets": ["...", "..."]},
    {"type": "table",      "title": "...", "headers": ["A","B"], "rows": [["1","2"]]},
    {"type": "flow",       "title": "...", "steps": ["Step 1","Step 2","Step 3"]},
    {"type": "image",      "title": "...", "image_path": "/abs/path.png", "caption": "..."},
    {"type": "two_column", "title": "...", "left": "bullet text", "image_path": "..."},
    {"type": "summary",    "title": "...", "text": "..."}
  ]
}
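Before handing a deck.json to the ppt subcommand, an agent may want to sanity-check it. The following is a rough sketch, not yumnb's own validator: it treats every key shown in the schema above as required, which is an assumption (some keys, such as caption, may well be optional), and the name check_deck is hypothetical.

```python
# Keys per slide type, copied from the schema above (treated as required here).
REQUIRED = {
    "title": {"title", "subtitle"},
    "bullets": {"title", "bullets"},
    "table": {"title", "headers", "rows"},
    "flow": {"title", "steps"},
    "image": {"title", "image_path", "caption"},
    "two_column": {"title", "left", "image_path"},
    "summary": {"title", "text"},
}


def check_deck(deck: dict) -> list:
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []
    for i, slide in enumerate(deck.get("slides", []), start=1):
        kind = slide.get("type")
        if kind not in REQUIRED:
            problems.append(f"slide {i}: unknown type {kind!r}")
            continue
        missing = sorted(REQUIRED[kind] - set(slide))
        if missing:
            problems.append(f"slide {i}: missing {', '.join(missing)}")
        if kind == "table":
            width = len(slide.get("headers", []))
            if any(len(row) != width for row in slide.get("rows", [])):
                problems.append(f"slide {i}: row width differs from headers")
    return problems


deck = {"title": "Demo", "subtitle": "example", "slides": [
    {"type": "title", "title": "Demo", "subtitle": "example"},
    {"type": "table", "title": "T", "headers": ["A", "B"], "rows": [["1"]]},
]}
print(check_deck(deck))  # ['slide 2: row width differs from headers']
```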

Recommended length is 5–12 slides: a title slide, 1-2 overview slides, 3-6 main slides (bullets/table/flow/image), and 1 summary slide. Reuse images from source/ (YouTube thumbnail, HTML hero image, original screenshot).

talkshow.txt

Lines are tagged with [<SpeakerName>], where <SpeakerName> matches a voice configured under tts.voices in config.yaml. Example:

[HostA] Welcome to the show — today we're chewing on…
[HostB] And by chewing I mean ruthlessly mocking, right?
[HostA] Pretty much.
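To make the line format concrete, here is a small sketch of how a script in this format could be split into (voice, text) turns. It is not yumnb's parser: the function name parse_script is hypothetical, the voices mapping is assumed to map speaker names to edge-tts voice IDs, and skipping blank lines is an assumption.

```python
import re
from typing import List, Tuple

# "[Speaker] text": speaker is anything up to the first closing bracket.
LINE = re.compile(r"^\[(?P<speaker>[^\]]+)\]\s*(?P<text>.+)$")


def parse_script(script: str, voices: dict) -> List[Tuple[str, str]]:
    """Split a talkshow script into (voice, text) turns.

    Raises ValueError for a malformed line or a speaker with no voice.
    """
    turns = []
    for n, raw in enumerate(script.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are treated as spacing only (assumption)
        m = LINE.match(line)
        if m is None:
            raise ValueError(f"line {n}: expected '[Speaker] text'")
        speaker = m["speaker"]
        if speaker not in voices:
            raise ValueError(f"line {n}: no voice configured for {speaker!r}")
        turns.append((voices[speaker], m["text"]))
    return turns


voices = {"HostA": "en-US-AndrewNeural", "HostB": "en-US-AvaNeural"}
print(parse_script("[HostA] Pretty much.", voices))
# [('en-US-AndrewNeural', 'Pretty much.')]
```

Checking speaker names up front means a typo such as [HostC] fails before any audio is synthesized.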

Prerequisites

  • Python 3.9+
  • Preferred first-run: ./scripts/bootstrap.sh
  • Or manual: pip install -r requirements.txt
  • Plus the AI SDK matching your provider (only one): openai / anthropic / google-generativeai / ollama — or none if you use provider: cli / none.

Notes

  • edge-tts uses Microsoft's free online voices. No API key required.
  • The intro/outro jingle is generated procedurally in pure Python — no external assets bundled.
  • YouTube ingest order is: yt-dlp manual subtitles → yt-dlp auto subtitles → youtube-transcript-api fallback → description-only fallback.
  • This skill carries no platform/tenant/organization-specific defaults. All endpoints and credentials come from config.yaml or environment variables (YUMNB_*, OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
  • Direct IM delivery is channel-agnostic: configure deliver.provider: openclaw (or hermes) plus deliver.openclaw.channel + target to send the finished note to Telegram, Discord, Teams, Slack, and other supported surfaces via the local bridge.
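The YouTube ingest order in the notes above is an ordered fallback chain. The sketch below shows that pattern in general form with stub fetchers; it does not call yt-dlp or youtube-transcript-api, and the stage labels and the name first_available are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Fetcher = Callable[[str], Optional[str]]


def first_available(url: str, fetchers: List[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each (name, fetcher) in order; return the first non-empty result."""
    for name, fetch in fetchers:
        try:
            text = fetch(url)
        except Exception:
            text = None  # a failing stage hands over to the next one
        if text:
            return name, text
    raise LookupError(f"no transcript source worked for {url}")


def no_subs(url: str) -> Optional[str]:
    return None  # stub: pretend no manual subtitles exist


def broken(url: str) -> Optional[str]:
    raise RuntimeError("network down")  # stub: pretend this stage fails


chain = [
    ("manual_subs", no_subs),
    ("auto_subs", broken),
    ("transcript_api", lambda u: "hello world"),
    ("description", lambda u: "video description"),
]
print(first_available("https://example.com/watch", chain))
# ('transcript_api', 'hello world')
```

Returning the stage name alongside the text lets the caller record which source actually produced the transcript.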

Language

The language key in config.yaml (default en) sets both the AI prompt language and the default TTS voice pair. Override it per run with --language (en|zh|ja|es|fr|de|...).

Built-in default voice pairs (male + female, edge-tts):

| language | HostA (male) | HostB (female) |
|----------|--------------|----------------|
| en | en-US-AndrewNeural | en-US-AvaNeural |
| zh | zh-CN-YunyangNeural 云飞 | zh-CN-XiaoxiaoNeural 小晓 |
| ja | ja-JP-KeitaNeural | ja-JP-NanamiNeural |
| es | es-ES-AlvaroNeural | es-ES-ElviraNeural |
| fr | fr-FR-HenriNeural | fr-FR-DeniseNeural |
| de | de-DE-ConradNeural | de-DE-KatjaNeural |

Override any pair (or add new languages) under tts.language_voices. Setting tts.voices directly always wins.
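The precedence described here (tts.voices, then tts.language_voices, then the built-in pairs) can be sketched as follows. The config shapes are assumptions: tts.voices is taken to map speaker names to voice IDs and tts.language_voices to map a language code to a (HostA, HostB) pair. Falling back to the English pair for an unknown language is also an assumption, and resolve_voices is a hypothetical name.

```python
# Built-in pairs from the table above: language -> (HostA, HostB).
DEFAULT_PAIRS = {
    "en": ("en-US-AndrewNeural", "en-US-AvaNeural"),
    "zh": ("zh-CN-YunyangNeural", "zh-CN-XiaoxiaoNeural"),
    "ja": ("ja-JP-KeitaNeural", "ja-JP-NanamiNeural"),
    "es": ("es-ES-AlvaroNeural", "es-ES-ElviraNeural"),
    "fr": ("fr-FR-HenriNeural", "fr-FR-DeniseNeural"),
    "de": ("de-DE-ConradNeural", "de-DE-KatjaNeural"),
}


def resolve_voices(language: str, tts_cfg: dict) -> dict:
    """Pick HostA/HostB voices: tts.voices > tts.language_voices > built-ins."""
    if tts_cfg.get("voices"):
        return dict(tts_cfg["voices"])  # an explicit voice map always wins
    # User-supplied pairs override or extend the built-in table.
    pairs = {**DEFAULT_PAIRS, **tts_cfg.get("language_voices", {})}
    host_a, host_b = pairs.get(language, DEFAULT_PAIRS["en"])
    return {"HostA": host_a, "HostB": host_b}


print(resolve_voices("ja", {}))
# {'HostA': 'ja-JP-KeitaNeural', 'HostB': 'ja-JP-NanamiNeural'}
```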

Cloud upload (OneDrive / Google Drive / S3 / Dropbox / …)

Setting upload.provider: rclone in config.yaml makes publish upload the generated mp3 / pptx / summary to your configured cloud and inline the shareable URLs in links.json and the notification payload, so users get one-click mp3 + ppt links instead of local file:// URIs.

upload:
  provider: rclone
  rclone:
    remote: "onedrive:yumnb"   # or gdrive:yumnb, s3:bucket/yumnb, etc.
    share: true

Set it up once with rclone config (see https://rclone.org). yumnb delegates the upload to rclone, so the same skill works with every backend rclone supports.
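For orientation, the sketch below builds the kind of rclone invocations that an upload with share: true could correspond to, using the real rclone copy and rclone link subcommands. Which commands yumnb actually runs is not stated here, so treat this as an assumption; the function name rclone_commands and the per-note subfolder layout on the remote are hypothetical. Note that rclone link is only supported by some backends.

```python
from pathlib import Path
from typing import List


def rclone_commands(folder: str, remote: str, share: bool) -> List[List[str]]:
    """Build rclone argv lists to upload three artifacts and optionally link them."""
    src = Path(folder)
    # Mirror the local note folder name under the remote (assumed layout).
    dest = f"{remote.rstrip('/')}/{src.name}"
    cmds = []
    for name in ("summary.md", "talkshow.mp3", "deck.pptx"):
        # "rclone copy <file> <remote:dir>" copies the file into that directory.
        cmds.append(["rclone", "copy", str(src / name), dest])
        if share:
            # "rclone link <remote:path>" prints a public link where supported.
            cmds.append(["rclone", "link", f"{dest}/{name}"])
    return cmds


for argv in rclone_commands("notes/20250131-0905-demo", "onedrive:yumnb", True):
    print(" ".join(argv))
```

The function only constructs argument lists; nothing is uploaded until they are passed to something like subprocess.run.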