Minutes Taker

Convert meeting transcripts into structured minutes with decision tracking, task assignment, three-tier summaries, and cross-meeting correlation.

Install

openclaw skills install minutes-taker

AI Meeting Minutes (minutes-taker)

Convert meeting transcripts or text into structured, actionable minutes. The skill extracts decisions, action items, risks, and key dates, generates three-tier summaries (one-liner → three-paragraph → full), and tracks decisions and todos across meetings.

Core strengths: Decision chain tracking across meetings, todo lifecycle management (extract → assign → track → remind), and three-level summaries for different consumption contexts.

Audio support: ASR (speech-to-text) is available via plugin backends. See ASR Backends below.

Quick Start

clawhub run minutes-taker --input-type text \
  --content "Zhang: Today we discuss the payment module. Li: I suggest using Stripe. Wang: Agreed. Zhang: OK, Li to deliver the proposal by Friday."

This produces structured minutes with decisions, todos, and a three-level summary.

Features

Feature	Description
Input Modes	Plain text, chat logs, structured forms. Audio via ASR backend (see below)
ASR (Speech-to-Text)	Pluggable: whisper (offline) or SpeechRecognition (Google API). No diarization by default
6 Extractors	Decisions, Todos, Dates, Risks, Ideas, Data Points
3-Level Summaries	L1 one-liner, L2 three-paragraph, L3 full minutes
Decision Chain	Track decisions across meetings with evolution history
Todo Lifecycle	Auto-extract, assign, deadline parse, priority infer, track across meetings
Cross-meeting Links	Detect recurring topics, link related decisions and todos
Multiple Outputs	Markdown, text, HTML, Feishu, Notion (via extensions)

ASR Backends

The asr.py module auto-detects available backends at runtime:

Backend	Quality	Offline	Requirements
whisper (openai-whisper)	⭐⭐⭐ High	✅ Yes	`pip install openai-whisper`
speech_recognition (Google API)	⭐⭐ Medium	❌ No	`pip install SpeechRecognition` (pre-installed)

Backends are tried in priority order: whisper first, then speech_recognition. Speaker diarization is not performed, so all transcribed text is attributed to a single speaker. For multi-speaker audio, provide a participants list or attribute speakers in post-processing.
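
The priority-ordered detection described above could be sketched as follows. This is a minimal illustration, not the actual contents of asr.py: the function names `detect_backends` and `pick_backend` are hypothetical, and the only facts assumed are that openai-whisper is imported as `whisper` and SpeechRecognition as `speech_recognition`.

```python
from importlib.util import find_spec

# Priority order from the table above: whisper first, then speech_recognition.
# Entries are the import names of the two packages.
BACKEND_PRIORITY = ["whisper", "speech_recognition"]


def _installed(name: str) -> bool:
    """True if a top-level module with this name can be imported."""
    return find_spec(name) is not None


def detect_backends(is_installed=_installed):
    """Return the installed backends, highest priority first."""
    return [name for name in BACKEND_PRIORITY if is_installed(name)]


def pick_backend(is_installed=_installed):
    """Return the first available backend, or raise if none is installed."""
    available = detect_backends(is_installed)
    if not available:
        raise RuntimeError(
            "No ASR backend found; try `pip install openai-whisper`"
        )
    return available[0]
```

The `is_installed` parameter exists only so the probe can be swapped out in tests; calling `pick_backend()` with no arguments checks the real environment.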

Input Format

{
  "input": {
    "type": "text",
    "content": "Zhang: Today we discuss...",
    "format": "chat_log"
  },
  "meeting_context": {
    "title": "Q2 Product Roadmap Review",
    "date": "2026-06-14",
    "participants": [
      {"name": "Zhang San", "role": "PM"},
      {"name": "Li Si", "role": "Frontend Dev"}
    ],
    "agenda": ["Payment module", "Growth tools"]
  },
  "options": {
    "summary_level": "full",
    "extract_todos": true,
    "extract_decisions": true,
    "output_format": "markdown"
  }
}
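
A request in this shape can be checked before it is handed to the skill. The sketch below is an assumption-laden illustration: `load_request` is a hypothetical helper, and the default option values are simply copied from the sample above, so the skill's real defaults may differ.

```python
import json

# Assumed defaults, copied from the sample request above; the skill's real
# defaults may differ.
DEFAULT_OPTIONS = {
    "summary_level": "full",
    "extract_todos": True,
    "extract_decisions": True,
    "output_format": "markdown",
}


def load_request(raw: str) -> dict:
    """Parse a JSON request and check the fields the sample shows."""
    request = json.loads(raw)
    source = request.get("input")
    if not isinstance(source, dict):
        raise ValueError("'input' object is required")
    for key in ("type", "content"):
        value = source.get(key)
        if not isinstance(value, str) or not value:
            raise ValueError(f"'input.{key}' must be a non-empty string")
    # Caller-supplied options override the assumed defaults.
    request["options"] = {**DEFAULT_OPTIONS, **request.get("options", {})}
    # meeting_context is treated as optional in this sketch.
    request.setdefault("meeting_context", {})
    return request
```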

Sample Prompts

Prompt 1: Text-based Meeting Minutes (Quick Start)

clawhub run minutes-taker --input-type text \
  --content "Zhang: Today we discuss the payment module. Li: I suggest using Stripe. Wang: Agreed. Zhang: OK, Li to deliver the proposal by Friday." \
  --title "Payment Module Discussion" \
  --participants "Zhang San (PM), Li Si (FE), Wang Wu (BE)"

Expected output: Structured minutes with:

📌 Summary: Payment module discussion → Stripe chosen, Li to deliver
✅ Todos: Li Si - deliver Stripe proposal by Friday (🔴 High)
📊 Decisions: Use Stripe SDK (Li proposed, unanimous)
L1: "Decided to use Stripe for payment module; Li to submit proposal by Friday"
L2/L3: Full expanded minutes

Prompt 2: Audio File Processing

clawhub run minutes-taker --input ./meeting-2026-06-14.m4a \
  --title "Q2 Product Roadmap Review" \
  --participants @team.json

Expected output: Transcribed text with structured minutes (decisions, todos, risks, summary). Requires an ASR backend (see ASR Backends). If no backend is available, the skill returns an error with installation instructions.

Prompt 3: Todo Tracking

clawhub run minutes-taker todos --since last_meeting

Expected output:

📋 Previous Meeting Todo Tracking (2026-06-07)
✅ Complete (3/5):
  ✅ Li Si · Payment frontend tech proposal (PR #2341 submitted)
⏳ In Progress (1/5):
  ⏳ Zhao Liu · Growth tool prototype (60%, due 06/16)
❌ Overdue (1/5):
  ❌ Zhang San · Competitive analysis report (3 days overdue)
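
The three status groups in this report can be derived from a due date and a completion flag. The following is a minimal sketch under that assumption; the `Todo` fields and helper names are hypothetical and are not taken from todo_tracker.py.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Todo:
    owner: str
    title: str
    due: date
    done: bool = False


def todo_status(todo: Todo, today: date) -> str:
    """Classify a todo as complete, overdue, or in_progress."""
    if todo.done:
        return "complete"
    return "overdue" if today > todo.due else "in_progress"


def days_overdue(todo: Todo, today: date) -> int:
    """Days past the due date; 0 for finished or not-yet-due todos."""
    if todo.done:
        return 0
    return max((today - todo.due).days, 0)


def group_by_status(todos, today: date) -> dict:
    """Bucket todos under the three headings used in the report."""
    groups = {"complete": [], "in_progress": [], "overdue": []}
    for todo in todos:
        groups[todo_status(todo, today)].append(todo)
    return groups
```

With `today = date(2026, 6, 14)`, an unfinished todo due on `date(2026, 6, 11)` lands in the overdue group with `days_overdue` equal to 3, matching the last line of the sample report.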

Prompt 4: Decision Chain

clawhub run minutes-taker decisions --topic "Payment Module"

Expected output:

📜 Decision Chain: Payment Module Refactor
06/01 Weekly: "We need to refactor payment module" (Zhang)
06/07 Tech Review: Chose Stripe SDK (Wang proposed, 3:2 passed)
06/14 Roadmap Review: Launch date set to July 20, added stress test → [This meeting]
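
One way to picture a decision chain is as stored decisions filtered by topic and sorted by meeting date. The sketch below assumes a simple case-insensitive substring match on the topic; the `Decision` record and `decision_chain` function are hypothetical, and decision_chain.py may link decisions quite differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Decision:
    topic: str
    meeting: str
    on: date
    text: str


def decision_chain(decisions, topic: str):
    """Return the decisions whose topic contains `topic`, oldest first."""
    wanted = topic.casefold()
    matches = [d for d in decisions if wanted in d.topic.casefold()]
    return sorted(matches, key=lambda d: d.on)


history = [
    Decision("Payment Module Refactor", "Tech Review", date(2026, 6, 7),
             "Chose Stripe SDK"),
    Decision("Growth Tools", "Weekly", date(2026, 6, 1),
             "Prototype first"),
    Decision("Payment Module Refactor", "Weekly", date(2026, 6, 1),
             "We need to refactor payment module"),
]

for d in decision_chain(history, "Payment Module"):
    print(f"{d.on:%m/%d} {d.meeting}: {d.text}")
```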

Prompt 5: Three-Level Summary

clawhub run minutes-taker --input-type text --content "$(cat transcript.txt)" \
  --summary-level three_para --output-format text

Expected output: Concise three-paragraph summary ready for WeChat/email.

First-Success Path

Goal: Structured minutes from a few lines of dialogue within 30 seconds.

Step 1: clawhub install minutes-taker
Step 2: clawhub run minutes-taker --input-type text \
  --content "Zhang: Today we discuss payment. Li: I suggest Stripe. Wang: OK. Zhang: Li, make proposal by Friday."
Step 3: Internal pipeline:
  a. input.py parses text, identifies speakers
  b. segmenter.py detects a single topic, so no split is needed
  c. extractor.py extracts: 1 decision, 1 todo, 0 risks
  d. summarizer.py generates L1/L2/L3 summaries
  e. formatter.py renders Markdown
Step 4: User sees structured minutes with decision + todo
Step 5: Next step: try audio → requires ASR backend (see ASR Backends section)
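
Step 3a (speaker identification) can be illustrated with a small parser for `Name: utterance` text. This is a sketch, not the logic in input.py: it assumes single-word, capitalized speaker names that appear at the start of the text or right after sentence-ending punctuation, and `parse_chat_log` is a hypothetical name.

```python
import re

# A speaker label is a capitalized word followed by a colon, located either
# at the start of the text or right after ". ", "! " or "? ".
SPEAKER = re.compile(r"(?:^|(?<=[.!?]\s))([A-Z][A-Za-z]*):\s*")


def parse_chat_log(content: str):
    """Split a transcript into (speaker, utterance) pairs, in order."""
    marks = list(SPEAKER.finditer(content))
    turns = []
    for i, mark in enumerate(marks):
        # An utterance runs until the next speaker label (or end of text).
        end = marks[i + 1].start() if i + 1 < len(marks) else len(content)
        turns.append((mark.group(1), content[mark.end():end].strip()))
    return turns


def speakers(turns):
    """Distinct speakers in order of first appearance."""
    return list(dict.fromkeys(name for name, _ in turns))
```

Applied to the Step 2 transcript, this yields four turns and three distinct speakers. The vocative "Li," in the last sentence is not treated as a speaker label because it is followed by a comma rather than a colon.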

Architecture

minutes-taker/
├── SKILL.md
├── scripts/
│   ├── asr.py             # ASR audio transcription (pluggable backends)
│   ├── input.py           # Input parsing (audio/text/chat)
│   ├── segmenter.py       # Topic segmentation
│   ├── extractor.py       # Decision/todo/date/risk/idea/data extraction
│   ├── summarizer.py      # Three-level summary generation
│   ├── decision_chain.py  # Cross-meeting decision tracking
│   ├── todo_tracker.py    # Todo lifecycle management
│   ├── formatter.py       # Minutes formatting
│   └── storage.py         # Local minutes storage & retrieval
└── references/
    └── examples.json       # Sample inputs/outputs

Pipeline

Input (text/audio/chat)
    │
    ▼
input.py ──► Parsed Input (content + speakers)
    │
    ▼
segmenter.py ──► Topic Segments
    │
    ▼
extractor.py ──► Decisions, Todos, Risks, Ideas, Dates, Data
    │
    ▼
summarizer.py ──► L1, L2, L3 Summaries
    │
    ▼
formatter.py ──► Markdown / Text / HTML
    │
    ▼
storage.py ──► Saved to ~/.openclaw/data/minutes-taker/

Error Handling

Code	Scenario	Action
E001	Audio file not found	Error + path check
E002	Unsupported audio format	List supported formats + convert via ffmpeg
E003	ASR processing failure	Error + offer manual text input; list available backends
E004	LLM timeout	Fall back to basic template
E005	LLM format error	Retry 1x, then return raw text
E006	Export API unavailable	Save locally + retry hint
E007	History data unavailable	Skip cross-refs, generate current
E008	Empty participants list	Auto-detect from content
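
The E004 and E005 rows describe a fallback policy that might look roughly like the sketch below. All names here (`render_minutes`, `basic_template`, the `call_llm` and `parse` callables) are hypothetical, and "raw text" is interpreted as the last unparsed LLM reply, which is an assumption the table does not confirm.

```python
def basic_template(transcript: str) -> str:
    """Hypothetical stand-in for the basic template used on timeout."""
    return "Minutes (basic template)\n\n" + transcript


def render_minutes(transcript, call_llm, parse, max_retries=1):
    """Apply the E004/E005 rows: template on timeout, raw text after a failed retry.

    `call_llm(transcript)` returns a string and may raise TimeoutError;
    `parse(reply)` returns structured minutes or raises ValueError.
    """
    reply = ""
    for _ in range(1 + max_retries):
        try:
            reply = call_llm(transcript)
        except TimeoutError:
            # E004: give up on the LLM and fall back to the basic template.
            return {"code": "E004", "minutes": basic_template(transcript)}
        try:
            return {"code": None, "minutes": parse(reply)}
        except ValueError:
            # E005: malformed reply; loop again if a retry is left.
            continue
    # Retries exhausted: hand back the unparsed reply as-is.
    return {"code": "E005", "minutes": reply}
```

Passing `json.loads` as `parse` works with this sketch because `json.JSONDecodeError` is a subclass of `ValueError`.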

Security

ASR backends vary: whisper runs fully offline, with no network upload
Privacy notice: The speech_recognition backend transmits audio to Google servers for transcription
Local storage: Minutes are stored at ~/.openclaw/data/minutes-taker/ by default; the directory is configurable
File permissions: Minutes files default to mode 600 (read/write for the owner only)
Sensitive detection: Marks potentially sensitive content (salary, HR topics) with ⚠️
LLM context splitting: Sends meeting text by topic segments, not the full transcript at once
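
The file-permission point above can be illustrated on POSIX systems with the standard library. This is a sketch of one way to create a file with mode 600, not the code in storage.py; `save_minutes` is a hypothetical helper.

```python
import os
from pathlib import Path


def save_minutes(directory, name: str, text: str) -> Path:
    """Write minutes so that only the owner can read or write the file."""
    directory = Path(directory).expanduser()
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / name
    # Creating the file with mode 0o600 avoids a window in which it is
    # readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    # The mode argument is ignored for files that already exist, so
    # tighten the permissions explicitly as well.
    os.chmod(path, 0o600)
    return path
```

Calling `save_minutes("~/.openclaw/data/minutes-taker", "2026-06-14.md", minutes)` would write into the default directory named above; the file name here is only an example.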

Dependencies

Python 3.10+
ffmpeg (for audio conversion)
Optional: openai-whisper for local offline ASR
Pre-installed: SpeechRecognition for Google Speech API ASR