Install

    openclaw skills install yumnb

Yum NoteBook: a local-first NotebookLM alternative for AI agents, built for real source capture. Ingest a web URL, YouTube video, or screenshot, then generate (1) an AI summary, (2) a dual-host talk-show MP3 (edge-tts), and (3) a slide deck (image/table/bullet/flowchart). Optionally upload artifacts to OneDrive/Google Drive/S3/etc. and post a notification to Slack/Discord/Teams. Language and cloud destination are user-configurable; the default is English with male and female English hosts. Use when: yumnb, study notes, summarize this link, turn this video into notes, help me understand this article.

Given a source (URL / YouTube / screenshot / raw text), yumnb creates one folder per
request under <output_dir>/<YYYYMMDD-HHMM-slug>/ containing:

- source/: raw material (downloaded HTML, transcript, screenshot copy …)
- summary.md: AI summary (one-liner / key points / facts / takeaways)
- talkshow.txt + talkshow.mp3: dual-host script + MP3 (edge-tts)
- deck.pptx: slide deck (bullets, tables, flow, images, summary)
- links.json: record of what was generated and any share links

This “one request = one folder” model is intentional: many such folders can accumulate into a local knowledge base that can be read, searched, narrated, and presented later.

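
As a rough illustration of the <output_dir>/<YYYYMMDD-HHMM-slug>/ naming, the sketch below builds such a path from a title and a timestamp. The helper names `slugify` and `note_folder`, the slug rules, and the 40-character cap are assumptions made for this example; yumnb's own slug logic may differ.

```python
import re
from datetime import datetime
from pathlib import Path


def slugify(title: str, max_len: int = 40) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Fall back to a fixed name when nothing usable is left (assumption).
    return slug[:max_len].rstrip("-") or "note"


def note_folder(output_dir: str, title: str, when: datetime) -> Path:
    """Return <output_dir>/<YYYYMMDD-HHMM-slug> for one request."""
    return Path(output_dir) / f"{when:%Y%m%d-%H%M}-{slugify(title)}"
```

Calling `note_folder("notes", "Hello, World!", datetime(2024, 5, 1, 9, 30))` gives a folder named `20240501-0930-hello-world` under `notes`.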
If a webhook is configured, a notification is posted to Slack / Discord /
Teams Workflow. If deliver.provider is configured, yumnb can also push the
finished outputs directly to an IM/chat surface through OpenClaw / Hermes
(Telegram / Discord / Teams / Slack / etc.).
You can position yumnb as a local-first, polite alternative to NotebookLM: it keeps notebooks and generated artifacts as ordinary local files by default, then only uploads or delivers them if you explicitly configure that.

Configure ai.provider in config.yaml (openai / anthropic / gemini /
ollama / cli) and run:

    python -m yumnb auto "<URL or path>" [--title "short name"]

This runs ingest → AI summary → AI slide-plan → TTS → PPT → publish in one shot.

If you're driving this from an agent CLI (GitHub Copilot CLI, Claude Code,
Cursor, Aider, …), set ai.provider: none and call the subcommands
individually. The agent reads the source, writes summary.md and
deck.json itself, then asks yumnb to render TTS / PPT / publish.

    # 1) Pull raw material
    python -m yumnb ingest "<URL>" [--title "..."]
    # → prints the note folder path

    # 2) (Agent writes <folder>/summary.md following the schema in README)

    # 3) Render dual-host MP3 from a talkshow script the agent wrote
    python -m yumnb tts "<folder>/talkshow.txt" --output "<folder>/talkshow.mp3"

    # 4) Render PPT from a deck.json the agent wrote
    python -m yumnb ppt "<folder>/deck.json" --output "<folder>/deck.pptx"

    # 5) Finalize + optional webhook / IM delivery
    python -m yumnb publish "<folder>"

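
An agent that already has a note folder with talkshow.txt and deck.json in it could script steps 3 to 5 along these lines. This is a minimal sketch: `render_commands` and `run_all` are hypothetical helper names, and the only yumnb interface it relies on is the subcommands and `--output` flag shown above.

```python
import subprocess
import sys
from pathlib import Path


def render_commands(folder: str) -> list[list[str]]:
    """Build the tts, ppt, and publish invocations for one note folder."""
    base = [sys.executable, "-m", "yumnb"]
    f = Path(folder)
    return [
        base + ["tts", str(f / "talkshow.txt"), "--output", str(f / "talkshow.mp3")],
        base + ["ppt", str(f / "deck.json"), "--output", str(f / "deck.pptx")],
        base + ["publish", str(f)],
    ]


def run_all(folder: str) -> None:
    """Run the steps in order, stopping at the first one that fails."""
    for cmd in render_commands(folder):
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution lets the agent log or inspect the exact invocations before anything runs.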
summary.md

    # <title>

    > **Source**: <url or file>
    > **Type**: youtube|url|image|text
    > **Length**: <duration or word count>

    ## 🎯 One-line summary
    ## 📌 Key points (3-5)
    ## 🔑 Facts / data
    ## 💡 Takeaways
    ## 🤔 Open questions

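
For an agent writing summary.md by hand, a small helper can lay down the skeleton so that no section is forgotten. The sketch below is only an illustration of the schema above; `summary_skeleton` is a hypothetical name and is not part of yumnb.

```python
SECTIONS = [
    "🎯 One-line summary",
    "📌 Key points (3-5)",
    "🔑 Facts / data",
    "💡 Takeaways",
    "🤔 Open questions",
]
SOURCE_TYPES = {"youtube", "url", "image", "text"}


def summary_skeleton(title: str, source: str, kind: str, length: str) -> str:
    """Return an empty summary.md body with the header block and all sections."""
    if kind not in SOURCE_TYPES:
        raise ValueError(f"unknown source type: {kind!r}")
    lines = [
        f"# {title}",
        f"> **Source**: {source}",
        f"> **Type**: {kind}",
        f"> **Length**: {length}",
        "",
    ]
    for heading in SECTIONS:
        lines += [f"## {heading}", ""]
    return "\n".join(lines)
```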
deck.json (rendered to deck.pptx)

    {
      "title": "Deck title",
      "subtitle": "Source / date",
      "slides": [
        {"type": "title", "title": "...", "subtitle": "..."},
        {"type": "bullets", "title": "...", "bullets": ["...", "..."]},
        {"type": "table", "title": "...", "headers": ["A","B"], "rows": [["1","2"]]},
        {"type": "flow", "title": "...", "steps": ["Step 1","Step 2","Step 3"]},
        {"type": "image", "title": "...", "image_path": "/abs/path.png", "caption": "..."},
        {"type": "two_column", "title": "...", "left": "bullet text", "image_path": "..."},
        {"type": "summary", "title": "...", "text": "..."}
      ]
    }

Recommended length is 5–12 slides: a title, 1–2 overview slides, 3–6 main slides
(bullets/table/flow/image), and 1 summary. Reuse images from source/ (YouTube
thumbnail, HTML hero image, original screenshot).

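
Before handing a deck.json to the ppt subcommand, an agent may want to sanity-check it. The following sketch checks slide types, a guessed set of required keys per type, table row widths, and the recommended slide count. Which keys are truly mandatory is an assumption inferred from the example above, and `check_deck` is a hypothetical helper, not a yumnb function.

```python
# Keys each slide type is assumed to need (inferred from the sample deck).
REQUIRED_KEYS = {
    "title": {"title"},
    "bullets": {"title", "bullets"},
    "table": {"title", "headers", "rows"},
    "flow": {"title", "steps"},
    "image": {"title", "image_path"},
    "two_column": {"title", "left"},
    "summary": {"title", "text"},
}


def check_deck(deck: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    slides = deck.get("slides", [])
    if not 5 <= len(slides) <= 12:
        problems.append(f"{len(slides)} slides (recommended: 5-12)")
    for i, slide in enumerate(slides, start=1):
        kind = slide.get("type")
        if kind not in REQUIRED_KEYS:
            problems.append(f"slide {i}: unknown type {kind!r}")
            continue
        missing = REQUIRED_KEYS[kind] - set(slide)
        if missing:
            problems.append(f"slide {i} ({kind}): missing {sorted(missing)}")
        if kind == "table":
            width = len(slide.get("headers", []))
            if any(len(row) != width for row in slide.get("rows", [])):
                problems.append(f"slide {i} (table): row width != {width}")
    return problems
```

A typical call would be `check_deck(json.loads(Path(folder, "deck.json").read_text()))`, with the agent fixing anything reported before rendering.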
talkshow.txt

Each line is tagged with [<SpeakerName>], where <SpeakerName> matches a voice
configured in config.yaml → tts.voices. Example:

    [HostA] Welcome to the show. Today we're chewing on…
    [HostB] And by chewing I mean ruthlessly mocking, right?
    [HostA] Pretty much.

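
To make the line format concrete, here is a minimal parser for a talkshow script. It assumes tts.voices is a mapping from speaker name to an edge-tts voice name, which the text above implies but does not spell out; `parse_talkshow` is a hypothetical helper written for this example.

```python
import re

# "[Speaker] spoken text" with at least one character of text.
LINE = re.compile(r"^\[(?P<speaker>[^\]]+)\]\s*(?P<text>.+)$")


def parse_talkshow(script: str, voices: dict[str, str]) -> list[tuple[str, str]]:
    """Turn a script into (voice, text) turns, rejecting unknown speakers."""
    turns = []
    for n, raw in enumerate(script.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored (assumption)
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"line {n}: expected '[Speaker] text'")
        speaker = match["speaker"]
        if speaker not in voices:
            raise ValueError(f"line {n}: no voice configured for {speaker!r}")
        turns.append((voices[speaker], match["text"]))
    return turns
```

Validating speakers up front means a typo such as [HostC] fails before any audio is synthesized.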

- ./scripts/bootstrap.sh
- pip install -r requirements.txt
- openai / anthropic / google-generativeai / ollama, or none if you use provider: cli / none.
- edge-tts uses Microsoft's free online voices. No API key required.
- youtube-transcript-api fallback → description-only fallback.
- config.yaml or environment variables (YUMNB_*, OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
- deliver.provider: openclaw (or hermes), plus deliver.openclaw.channel + target, sends the finished note to Telegram, Discord, Teams, Slack, and other supported surfaces via the local bridge.

config.yaml → language (default en) influences both the AI prompt language
and the default TTS voice pair. Override it per run with
--language (en|zh|ja|es|fr|de|...).

Built-in default voice pairs (male + female, edge-tts):

| language | HostA (male) | HostB (female) |
|---|---|---|
| en | en-US-AndrewNeural | en-US-AvaNeural |
| zh | zh-CN-YunyangNeural 云飞 | zh-CN-XiaoxiaoNeural 小晓 |
| ja | ja-JP-KeitaNeural | ja-JP-NanamiNeural |
| es | es-ES-AlvaroNeural | es-ES-ElviraNeural |
| fr | fr-FR-HenriNeural | fr-FR-DeniseNeural |
| de | de-DE-ConradNeural | de-DE-KatjaNeural |

Override any pair (or add new languages) under tts.language_voices.
Setting tts.voices directly always wins.
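
The precedence described here (tts.voices first, then tts.language_voices, then the built-in pairs) can be sketched as a small lookup. The config shapes are assumptions: tts.voices is taken to be a speaker-to-voice mapping and tts.language_voices a mapping from language code to a [male, female] pair. `resolve_voices` is a hypothetical name, and only two built-in pairs are reproduced to keep the example short.

```python
from __future__ import annotations

# Subset of the built-in pairs from the table above.
DEFAULT_VOICES = {
    "en": ("en-US-AndrewNeural", "en-US-AvaNeural"),
    "ja": ("ja-JP-KeitaNeural", "ja-JP-NanamiNeural"),
}


def resolve_voices(config: dict, language: str | None = None) -> dict[str, str]:
    """Pick the HostA/HostB voices following the documented precedence."""
    tts = config.get("tts", {})
    if tts.get("voices"):
        return dict(tts["voices"])  # an explicit mapping always wins
    lang = language or config.get("language", "en")
    pairs = dict(DEFAULT_VOICES)
    for code, pair in tts.get("language_voices", {}).items():
        pairs[code] = tuple(pair)  # user overrides and additions
    if lang not in pairs:
        raise KeyError(f"no voice pair for language {lang!r}")
    host_a, host_b = pairs[lang]
    return {"HostA": host_a, "HostB": host_b}
```

With an empty config this resolves to the English pair; passing `language="ja"` mirrors a per-run --language override.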

Setting upload.provider: rclone in config.yaml makes publish upload the
generated mp3 / pptx / summary to your configured cloud and inline the
shareable URLs in links.json and the notification payload, so users
get one-click mp3 + ppt links instead of local file:// URIs.

    upload:
      provider: rclone
      rclone:
        remote: "onedrive:yumnb"   # or gdrive:yumnb, s3:bucket/yumnb, etc.
        share: true

Set it up once with rclone config (see https://rclone.org). yumnb
delegates everything to rclone so the same skill works with every
backend rclone supports.
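
To give a feel for what delegating to rclone can look like, the sketch below builds the two rclone invocations that would copy one artifact to the remote and request a share link for it. It uses the standard `rclone copy` and `rclone link` commands, but the exact commands yumnb runs, and the remote path layout it uses, are not documented here, so treat both as assumptions. `rclone_commands` is a hypothetical helper.

```python
from pathlib import Path


def rclone_commands(local_file: str, remote: str, share: bool = True) -> list[list[str]]:
    """Build 'rclone copy' (and optionally 'rclone link') for one artifact."""
    name = Path(local_file).name
    # 'rclone copy' places the file inside the destination directory.
    cmds = [["rclone", "copy", local_file, remote]]
    if share:
        # 'rclone link' prints a public link on backends that support one.
        cmds.append(["rclone", "link", f"{remote.rstrip('/')}/{name}"])
    return cmds
```

Not every rclone backend supports public links, so a caller would need to handle a failing `rclone link` by falling back to the local path.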