Yum NoteBook

Yum NoteBook is a local-first NotebookLM alternative for AI agents, built for real source capture. Ingest a web URL, YouTube video, or screenshot, then generate (1) an AI summary, (2) a dual-host talk-show MP3 (edge-tts), and (3) a slide deck (image/table/bullet/flowchart). Optionally upload the artifacts to OneDrive/Google Drive/S3/etc. and post a notification to Slack/Discord/Teams. Language and cloud destination are user-configurable; the default is English with male and female English hosts. Use when: yumnb, study notes, summarize this link, turn this video into notes, help me understand this article.

Install

openclaw skills install yumnb

Yum NoteBook (yumnb) — Source → Summary + Talk-show + Slides

A local-first NotebookLM alternative for AI agents, built for real source capture.

What It Does

Given a source (URL / YouTube / screenshot / raw text), yumnb creates one folder per request under <output_dir>/<YYYYMMDD-HHMM-slug>/ containing:

source/ — raw material (downloaded HTML, transcript, screenshot copy …)
summary.md — AI summary (one-liner / key points / facts / takeaways)
talkshow.txt + talkshow.mp3 — dual-host script + MP3 (edge-tts)
deck.pptx — slide deck (bullets, tables, flow, images, summary)
links.json — record of what was generated and any share links

This “one request = one folder” model is intentional: many such folders can accumulate into a local knowledge base that can be read, searched, narrated, and presented later.
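As an illustration of the <YYYYMMDD-HHMM-slug> naming pattern, here is a minimal sketch of how such a folder name could be derived from a title and a timestamp. The helper name note_folder_name and the slug rules (lowercase ASCII, hyphen-separated, "note" as a fallback) are assumptions for illustration; yumnb's actual slugging may differ.

```python
import re
from datetime import datetime


def note_folder_name(title: str, now: datetime) -> str:
    """Build a '<YYYYMMDD-HHMM-slug>' folder name (hypothetical helper)."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Assumed fallback when the title has no ASCII letters or digits.
    return f"{now:%Y%m%d-%H%M}-{slug or 'note'}"


print(note_folder_name("Attention Is All You Need", datetime(2025, 1, 31, 9, 5)))
# 20250131-0905-attention-is-all-you-need
```

Because the timestamp leads the name, a plain alphabetical listing of the output directory is also a chronological one.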

If a webhook is configured, a notification is posted to Slack / Discord / Teams Workflow. If deliver.provider is configured, yumnb can also push the finished outputs directly to an IM/chat surface through OpenClaw / Hermes (Telegram / Discord / Teams / Slack / etc.).

yumnb is positioned as a local-first alternative to NotebookLM: it keeps notebooks and generated artifacts as ordinary local files by default, and uploads or delivers them only if you explicitly configure that.

Two Ways to Run

A. Fully-automatic (built-in AI provider)

Configure ai.provider in config.yaml (openai / anthropic / gemini / ollama / cli) and run:

python -m yumnb auto "<URL or path>" [--title "short name"]

This runs ingest → AI summary → AI slide-plan → TTS → PPT → publish in one shot.

B. Step-by-step (agent-driven — you bring your own LLM)

If you're driving this from an agent CLI (GitHub Copilot CLI, Claude Code, Cursor, Aider, …), set ai.provider: none and call the subcommands individually. The agent reads the source, writes summary.md and deck.json itself, then asks yumnb to render TTS / PPT / publish.

# 1) Pull raw material
python -m yumnb ingest "<URL>" [--title "..."]
#    → prints the note folder path

# 2) (Agent writes <folder>/summary.md following the schema in README)

# 3) Render dual-host MP3 from a talkshow script the agent wrote
python -m yumnb tts "<folder>/talkshow.txt" --output "<folder>/talkshow.mp3"

# 4) Render PPT from a deck.json the agent wrote
python -m yumnb ppt "<folder>/deck.json" --output "<folder>/deck.pptx"

# 5) Finalize + optional webhook / IM delivery
python -m yumnb publish "<folder>"
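An agent that prefers to drive steps 3 to 5 from Python rather than a shell could wrap them roughly as follows. This is a sketch that uses only the subcommands and flags shown above; the helper names render_commands and run_all are hypothetical, and the sketch assumes the agent has already written talkshow.txt and deck.json into the folder.

```python
import subprocess
import sys
from pathlib import Path


def render_commands(folder: Path) -> list[list[str]]:
    """Return argv lists for tts, ppt and publish, in the order shown above."""
    yumnb = [sys.executable, "-m", "yumnb"]
    return [
        yumnb + ["tts", str(folder / "talkshow.txt"),
                 "--output", str(folder / "talkshow.mp3")],
        yumnb + ["ppt", str(folder / "deck.json"),
                 "--output", str(folder / "deck.pptx")],
        yumnb + ["publish", str(folder)],
    ]


def run_all(folder: Path) -> None:
    """Run each step in order; check=True stops at the first failing step."""
    for argv in render_commands(folder):
        subprocess.run(argv, check=True)
```

Passing argv lists instead of a shell string avoids quoting problems when the folder path contains spaces.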

Schemas the Agent Writes

`summary.md`

# <title>

> **Source**: <url or file>
> **Type**: youtube|url|image|text
> **Length**: <duration or word count>

## 🎯 One-line summary

## 📌 Key points (3-5)

## 🔑 Facts / data

## 💡 Takeaways

## 🤔 Open questions
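An agent can start from an empty skeleton and fill in each section. The sketch below generates that skeleton with the headings listed above; the function name summary_skeleton and the decision to reject unknown Type values are assumptions, not part of yumnb.

```python
ALLOWED_TYPES = {"youtube", "url", "image", "text"}
SECTIONS = (
    "🎯 One-line summary",
    "📌 Key points (3-5)",
    "🔑 Facts / data",
    "💡 Takeaways",
    "🤔 Open questions",
)


def summary_skeleton(title: str, source: str, source_type: str, length: str) -> str:
    """Return an empty summary.md with the header block and section headings."""
    if source_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown source type: {source_type!r}")
    lines = [
        f"# {title}",
        "",
        f"> **Source**: {source}",
        f"> **Type**: {source_type}",
        f"> **Length**: {length}",
        "",
    ]
    for heading in SECTIONS:
        # Each heading is followed by a blank line for the agent to fill in.
        lines += [f"## {heading}", ""]
    return "\n".join(lines)


print(summary_skeleton("Demo", "https://example.com", "url", "1200 words"))
```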

`deck.json` (rendered to `deck.pptx`)

{
  "title": "Deck title",
  "subtitle": "Source / date",
  "slides": [
    {"type": "title",      "title": "...", "subtitle": "..."},
    {"type": "bullets",    "title": "...", "bullets": ["...", "..."]},
    {"type": "table",      "title": "...", "headers": ["A","B"], "rows": [["1","2"]]},
    {"type": "flow",       "title": "...", "steps": ["Step 1","Step 2","Step 3"]},
    {"type": "image",      "title": "...", "image_path": "/abs/path.png", "caption": "..."},
    {"type": "two_column", "title": "...", "left": "bullet text", "image_path": "..."},
    {"type": "summary",    "title": "...", "text": "..."}
  ]
}

Recommended: 5–12 slides, made up of a title slide, 1–2 overview slides, 3–6 main slides (bullets/table/flow/image), and 1 summary slide. Reuse images from source/ (YouTube thumbnail, HTML hero image, original screenshot).
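Before handing a deck.json to the ppt subcommand, an agent may want to sanity-check it. The following is a minimal sketch of such a check. It assumes every key shown in the example above is required for its slide type, which the source does not state; check_deck and REQUIRED_KEYS are hypothetical names.

```python
REQUIRED_KEYS = {
    "title": ("title", "subtitle"),
    "bullets": ("title", "bullets"),
    "table": ("title", "headers", "rows"),
    "flow": ("title", "steps"),
    "image": ("title", "image_path", "caption"),
    "two_column": ("title", "left", "image_path"),
    "summary": ("title", "text"),
}


def check_deck(deck: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for i, slide in enumerate(deck.get("slides", []), start=1):
        kind = slide.get("type")
        if kind not in REQUIRED_KEYS:
            problems.append(f"slide {i}: unknown type {kind!r}")
            continue
        missing = [key for key in REQUIRED_KEYS[kind] if key not in slide]
        if missing:
            problems.append(f"slide {i} ({kind}): missing {', '.join(missing)}")
        elif kind == "table":
            # Every row should have exactly one cell per header.
            width = len(slide["headers"])
            if any(len(row) != width for row in slide["rows"]):
                problems.append(f"slide {i} (table): row width differs from headers")
    return problems


deck = {"title": "Demo", "slides": [
    {"type": "bullets", "title": "Overview", "bullets": ["a", "b"]},
    {"type": "table", "title": "Data", "headers": ["A", "B"], "rows": [["1"]]},
    {"type": "flow", "title": "Pipeline"},
]}
for problem in check_deck(deck):
    print(problem)
# slide 2 (table): row width differs from headers
# slide 3 (flow): missing steps
```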

`talkshow.txt`

Lines tagged with [<SpeakerName>] where <SpeakerName> matches a voice configured in config.yaml → tts.voices. Example:

[HostA] Welcome to the show — today we're chewing on…
[HostB] And by chewing I mean ruthlessly mocking, right?
[HostA] Pretty much.
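The tagged-line format is simple enough to parse with one regular expression. The sketch below splits a script into (speaker, text) turns and reports speakers that have no configured voice. The function names and the choice to reject untagged lines are assumptions; yumnb's own parser may be more lenient.

```python
import re

# One turn per line: "[SpeakerName] spoken text"
TURN_RE = re.compile(r"^\[(?P<speaker>[^\]]+)\]\s*(?P<text>.+)$")


def parse_talkshow(script: str) -> list[tuple[str, str]]:
    """Split a talkshow script into (speaker, text) turns, skipping blank lines."""
    turns = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TURN_RE.match(line)
        if match is None:
            raise ValueError(f"line has no [Speaker] tag: {line!r}")
        turns.append((match["speaker"], match["text"].strip()))
    return turns


def unknown_speakers(turns: list[tuple[str, str]], voices: dict) -> set[str]:
    """Return speakers used in the script that have no entry in `voices`."""
    return {speaker for speaker, _ in turns} - set(voices)


script = "[HostA] Welcome to the show.\n\n[HostB] Glad to be here.\n[HostA] Pretty much."
turns = parse_talkshow(script)
print(turns[1])                                      # ('HostB', 'Glad to be here.')
print(unknown_speakers(turns, {"HostA": "en-US-AndrewNeural"}))  # {'HostB'}
```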

Prerequisites

Python 3.9+
Preferred first-run: ./scripts/bootstrap.sh
Or manual: pip install -r requirements.txt
Plus the AI SDK matching your provider (only one): openai / anthropic / google-generativeai / ollama — or none if you use provider: cli / none.

Notes

edge-tts uses Microsoft's free online voices. No API key required.
The intro/outro jingle is generated procedurally in pure Python — no external assets bundled.
YouTube ingest order is: yt-dlp manual subtitles → yt-dlp auto subtitles → youtube-transcript-api fallback → description-only fallback.
This skill carries no platform/tenant/organization-specific defaults. All endpoints and credentials come from config.yaml or environment variables (YUMNB_*, OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
Direct IM delivery is channel-agnostic: configure deliver.provider: openclaw (or hermes) plus deliver.openclaw.channel + target to send the finished note to Telegram, Discord, Teams, Slack, and other supported surfaces via the local bridge.
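The YouTube ingest order described above is a fallback chain: try each source in turn and keep the first one that yields text. The sketch below shows that pattern with stub fetchers standing in for the real yt-dlp and youtube-transcript-api calls, which are deliberately not reproduced here. All function names are hypothetical.

```python
from typing import Callable, Optional

Fetcher = Callable[[str], Optional[str]]


def first_available(video_url: str, fetchers: list[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each (name, fetcher) in order; return the first non-empty result."""
    for name, fetch in fetchers:
        try:
            text = fetch(video_url)
        except Exception:
            # A failing source is treated the same as an empty one.
            text = None
        if text:
            return name, text
    raise LookupError(f"no text source available for {video_url}")


# Stub fetchers for illustration only.
def manual_subtitles(url: str) -> Optional[str]:
    return None  # pretend the video has no manual subtitles


def auto_subtitles(url: str) -> Optional[str]:
    raise RuntimeError("simulated download failure")


def transcript_api(url: str) -> Optional[str]:
    return "hello from the transcript"


def description_only(url: str) -> Optional[str]:
    return "video description"


chain = [
    ("yt-dlp manual subtitles", manual_subtitles),
    ("yt-dlp auto subtitles", auto_subtitles),
    ("youtube-transcript-api", transcript_api),
    ("description-only", description_only),
]
print(first_available("https://example.com/watch?v=demo", chain))
# ('youtube-transcript-api', 'hello from the transcript')
```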

Language

config.yaml → language (default en). Influences both AI prompt language and the default TTS voice pair. Override per-run with --language en|zh|ja|es|fr|de|....

Built-in default voice pairs (male + female, edge-tts):

`language`	HostA (male)	HostB (female)
`en`	`en-US-AndrewNeural`	`en-US-AvaNeural`
`zh`	`zh-CN-YunyangNeural` 云扬	`zh-CN-XiaoxiaoNeural` 晓晓
`ja`	`ja-JP-KeitaNeural`	`ja-JP-NanamiNeural`
`es`	`es-ES-AlvaroNeural`	`es-ES-ElviraNeural`
`fr`	`fr-FR-HenriNeural`	`fr-FR-DeniseNeural`
`de`	`de-DE-ConradNeural`	`de-DE-KatjaNeural`

Override any pair (or add new languages) under tts.language_voices. Setting tts.voices directly always wins.
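The precedence described above (tts.voices first, then tts.language_voices, then the built-in table) can be sketched as a small resolver. The config shapes are assumptions: tts.voices is taken to map speaker names to voice IDs, and tts.language_voices to map a language code to a [HostA, HostB] pair. resolve_voices is a hypothetical name.

```python
BUILTIN_VOICES = {
    "en": ("en-US-AndrewNeural", "en-US-AvaNeural"),
    "zh": ("zh-CN-YunyangNeural", "zh-CN-XiaoxiaoNeural"),
    "ja": ("ja-JP-KeitaNeural", "ja-JP-NanamiNeural"),
    "es": ("es-ES-AlvaroNeural", "es-ES-ElviraNeural"),
    "fr": ("fr-FR-HenriNeural", "fr-FR-DeniseNeural"),
    "de": ("de-DE-ConradNeural", "de-DE-KatjaNeural"),
}


def resolve_voices(config: dict) -> dict[str, str]:
    """Pick the speaker-to-voice mapping using the precedence described above."""
    tts = config.get("tts", {})
    # 1) An explicit tts.voices mapping always wins.
    if tts.get("voices"):
        return dict(tts["voices"])
    # 2) Per-language overrides are layered on top of the built-in pairs.
    pairs = dict(BUILTIN_VOICES)
    for lang, pair in tts.get("language_voices", {}).items():
        pairs[lang] = tuple(pair)
    language = config.get("language", "en")
    if language not in pairs:
        raise ValueError(f"no voice pair configured for language {language!r}")
    host_a, host_b = pairs[language]
    return {"HostA": host_a, "HostB": host_b}


print(resolve_voices({"language": "ja"}))
# {'HostA': 'ja-JP-KeitaNeural', 'HostB': 'ja-JP-NanamiNeural'}
```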

Cloud upload (OneDrive / Google Drive / S3 / Dropbox / …)

Setting upload.provider: rclone in config.yaml makes publish upload the generated mp3 / pptx / summary to your configured cloud and record the shareable URLs in links.json and in the notification payload. Users then get one-click MP3 and PPT links instead of local file:// URIs.

upload:
  provider: rclone
  rclone:
    remote: "onedrive:yumnb"   # or gdrive:yumnb, s3:bucket/yumnb, etc.
    share: true

Set it up once with rclone config (see https://rclone.org). yumnb delegates everything to rclone so the same skill works with every backend rclone supports.
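To make the delegation concrete, here is a sketch of the rclone invocations that an upload step along these lines might build. rclone copy and rclone link are real rclone subcommands, but whether yumnb calls exactly these, and whether it uploads into a per-note subfolder of the remote, are assumptions. The helper name rclone_commands is hypothetical, and the commands are only built, not run.

```python
from pathlib import Path


def rclone_commands(local_files: list[str], remote: str,
                    folder_name: str, share: bool) -> list[list[str]]:
    """Build argv lists that copy each file to the remote and optionally link it."""
    # Assumed layout: one subfolder per note under the configured remote path.
    dest = f"{remote.rstrip('/')}/{folder_name}"
    commands = []
    for local in local_files:
        # `rclone copy <file> <remote:dir>` places the file inside that directory.
        commands.append(["rclone", "copy", local, dest])
        if share:
            # `rclone link` prints a public URL on backends that support sharing.
            commands.append(["rclone", "link", f"{dest}/{Path(local).name}"])
    return commands


for argv in rclone_commands(["out/talkshow.mp3", "out/deck.pptx"],
                            "onedrive:yumnb", "20250131-0905-demo", share=True):
    print(" ".join(argv))
```

Not every rclone backend supports public links, so with share: true the link step can fail on some remotes even when the copy succeeds.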