Audio Script Writer

Convert written medical content into podcast or video scripts optimized for audio delivery. Transforms academic papers, reports, and educational materials in...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 51 · 0 current installs · 0 all-time installs
byAIpoch@aipoch-ai
MIT-0
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Benign
high confidence
Purpose & Capability
Name/description, SKILL.md, and the included Python script all implement the same functionality (text preprocessing, style application, pronunciation notes, timing). There are no unrelated environment variables, binaries, or install steps requested.
Instruction Scope
Runtime instructions and examples only reference reading input text (file or stdin), converting it, and writing output. The SKILL.md allowed-tools list (Read, Write, Bash, Edit) aligns with the examples (CLI invocation, file I/O). There are no instructions to access external endpoints, system-wide credentials, or unrelated files/paths.
Install Mechanism
No install spec is provided (instruction-only), and requirements.txt lists only standard Python libraries. The presence of a single benign Python script is expected for this skill; nothing is downloaded from external URLs or installed automatically.
Credentials
The skill declares no required environment variables, credentials, or config paths. The code does not read environment secrets. Requested access is minimal and proportionate to converting files to scripts.
Persistence & Privilege
Skill does not request always:true and is user-invocable; it does not modify other skills or system-wide settings. Autonomous invocation is allowed by default but is not combined with any other concerning privileges here.
Assessment
This skill appears coherent and low-risk, but before installing: (1) review the Python file locally (it is readable and simple) and run it in a sandbox if possible; (2) avoid feeding protected health information (PHI) or patient-identifiable data into any third-party or shared execution environment; (3) limit the agent's allowed tools/permissions if the platform lets you (e.g., avoid granting broad shell access unless needed); and (4) verify outputs for clinical/terminology accuracy before use in patient- or education-facing materials.

Like a lobster shell, security has layers — review code before you run it.

Current versionv0.1.0
Download zip
latestvk9795rm16f6j0ecwj11nc6nkg5835jxk

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

Audio Script Writer

Overview

Content transformation tool that converts written medical and scientific materials into professionally structured audio scripts suitable for podcasts, educational videos, audiobooks, and voiceover narration.

Key Capabilities:

  • Format Conversion: Research papers → podcast scripts
  • Spoken Word Optimization: Sentence restructuring for listening
  • Pronunciation Guides: Medical terminology phonetic spelling
  • Timing Estimation: Duration calculations for production planning
  • Multi-Format Output: Podcast, video, lecture, audiobook templates
  • Voice Direction: Tone, pace, and emphasis cues for narrators

When to Use

✅ Use this skill when:

  • Creating medical education podcasts from journal articles
  • Converting conference presentations to video scripts
  • Developing audiobook versions of medical textbooks
  • Scripting patient education audio materials
  • Producing research summary videos for social media
  • Adapting written case reports for audio case studies
  • Creating voiceover scripts for e-learning modules

❌ Do NOT use when:

  • Live presentation without script → Use improvisation
  • Highly visual content (surgery videos) → Use visual-focused tools
  • Interactive audio (Q&A format) → Use dialogue scripting tools
  • Music or sound design planning → Use audio production software
  • Voice recording itself → This creates scripts, not audio

Integration:

  • Upstream: abstract-summarizer (content condensation), lay-summary-gen (patient-friendly language)
  • Downstream: medical-translation (multi-language scripts), voice-cloning-tool (AI narration)

Core Capabilities

1. Spoken Word Transformation

Convert written text to conversational audio style:

from scripts.audio_writer import AudioScriptWriter

writer = AudioScriptWriter()

# Transform written content
script = writer.convert_to_audio(
    source_text=research_paper,
    format="podcast",  # podcast, video, lecture, audiobook
    target_audience="medical_students",
    duration_minutes=15
)

print(script.spoken_text)
# Converts: "The pathophysiology of diabetes mellitus involves..."
# To: "So what exactly happens in diabetes? Well, it all starts when..."

Transformation Rules:

Written StyleAudio StyleExample
"Furthermore""Plus"Less formal transitions
" et al.""and their colleagues"Expand abbreviations
Numbers in textSpoken numbers"15%" → "15 percent"
Long sentences15-20 word maxBreak into digestible chunks
Passive voiceActive voice"was observed" → "we saw"
CitationsOmit or footnote"(Smith et al., 2024)" → [reference tone]

2. Pronunciation Guide Generation

Create phonetic spelling for medical terms:

# Generate pronunciation guide
pronunciation = writer.create_pronunciation_guide(
    text=script,
    include_phonetic=True,
    include_syllables=True
)

# Output:
# "Hyperlipidemia: hi-per-lip-i-DEE-mee-uh"
# "Metformin: met-FOR-min"
# "Atherosclerosis: ath-er-oh-skleh-ROH-sis"

Guide Elements:

  • Phonetic Spelling: IPA or simplified phonetics
  • Syllable Breaks: hy-per-ten-sion
  • Emphasis Marking: Primary stress (CAPS), secondary stress
  • Alternative Pronunciations: Regional variations (UK vs US)
  • Sound-Alikes: "rhymes with..." for difficult terms

3. Timing and Pacing

Calculate speaking duration and mark pacing cues:

# Analyze timing
timing = writer.calculate_timing(
    script=script,
    speaking_rate="conversational",  # slow, conversational, fast
    include_pauses=True
)

print(f"Estimated duration: {timing.duration_minutes} minutes")
print(f"Word count: {timing.word_count}")
print(f"Pace: {timing.words_per_minute} WPM")

Speaking Rates:

StyleWPMUse Case
Slow/Educational120-130Patient education, complex topics
Conversational140-160Podcasts, general audience
Fast/News170-190Time-constrained content
VariableVariesDynamic pacing with pauses

Pacing Cues:

[BREATHE] - Brief pause for narrator
[PAUSE 2s] - Two-second pause for emphasis
[SLOW DOWN] - Reduce pace for key point
[SPEED UP] - Increase energy/excitement
[BEAT] - Dramatic pause

4. Multi-Format Templates

Generate scripts for different audio formats:

# Podcast episode
podcast = writer.create_podcast_script(
    content=article,
    episode_format="interview",  # solo, interview, panel
    include_intro_music=True,
    ad_breaks=[5, 12]  # minutes
)

# Educational video
video = writer.create_video_script(
    content=lecture_slides,
    visual_cues=True,  # Mark where visuals change
    b_roll_notes=True  # Suggest supplemental footage
)

Format Types:

FormatCharacteristicsBest For
PodcastConversational, segments, adsLong-form content, interviews
VideoVisual cues, B-roll notesYouTube, educational platforms
LectureStructured, Q&A breaksOnline courses, training
AudiobookChapter markers, consistent toneTextbooks, memoirs
NewsTight, factual, quickResearch briefs, updates

Common Patterns

Pattern 1: Research Paper to Podcast

Scenario: Convert published study to 15-minute podcast episode.

# Convert paper to podcast script
python scripts/main.py \
  --input paper.pdf \
  --format podcast \
  --duration 15 \
  --style conversational \
  --include-intro-outro \
  --output podcast_script.txt

# Generate pronunciation guide
python scripts/main.py \
  --input podcast_script.txt \
  --generate-pronunciation \
  --output pronunciation_guide.txt

Structure:

[INTRO MUSIC 5s]

HOST: Welcome to Medical Research Today. I'm your host...

[BREATHE]

HOST: Today we're diving into a fascinating study about...

[PAUSE]

HOST: So what did the researchers find? Well...

[BREATHE]

HOST: Dr. Smith, one of the study authors, explains...

[SOUND BITE: Interview clip]

...

[OUTRO MUSIC]

Pattern 2: Medical Lecture Recording

Scenario: Convert lecture notes to video script for online course.

# Create lecture script
lecture = writer.create_lecture_script(
    notes=lecture_content,
    duration=45,  # minutes
    break_intervals=[15, 30],  # minutes for student breaks
    interaction_points=True  # "Pause and think..." prompts
)

# Add visual cues
script = writer.add_visual_cues(
    script=lecture,
    slide_transitions=True,
    animation_notes=True
)

Lecture Elements:

  • Learning objectives at start
  • Periodic comprehension checks
  • Break reminders
  • Transition phrases between topics
  • Summary and key takeaways

Pattern 3: Patient Education Audio

Scenario: Create audio guide for diabetes management.

# Patient-friendly script
patient_script = writer.create_patient_script(
    medical_content=diabetes_guide,
    reading_level=6,  # 6th grade
    empathetic_tone=True,
    key_points_highlighted=True
)

# Slow, clear pacing
patient_script.adjust_pacing(
    wpm=130,
    pause_after_sentences=1.5  # seconds
)

Patient Script Features:

  • Simple language (avoid medical jargon)
  • Empathetic tone
  • Clear action steps
  • Reassuring statements
  • Repetition of key points

Pattern 4: Conference Presentation to Video

Scenario: Adapt live presentation to YouTube video format.

# Convert presentation script
python scripts/main.py \
  --input presentation_transcript.txt \
  --format video \
  --platform youtube \
  --include-hooks true \
  --engagement-cues true \
  --output youtube_script.txt

YouTube Optimization:

  • Hook in first 30 seconds
  • Engagement questions for comments
  • Call to action (subscribe, like)
  • Timestamp markers for chapters
  • B-roll suggestions for visual interest

Complete Workflow Example

From research paper to published podcast:

# Step 1: Extract and summarize content
python scripts/main.py \
  --input paper.pdf \
  --extract-key-points \
  --output key_points.txt

# Step 2: Convert to audio script
python scripts/main.py \
  --input key_points.txt \
  --format podcast \
  --duration 20 \
  --output raw_script.txt

# Step 3: Add production elements
python scripts/main.py \
  --input raw_script.txt \
  --add-music-cues \
  --add-sound-effects \
  --add-pacing-marks \
  --output production_script.txt

# Step 4: Generate pronunciation guide
python scripts/main.py \
  --input production_script.txt \
  --generate-pronunciation \
  --output pronunciations.txt

# Step 5: Create timing breakdown
python scripts/main.py \
  --input production_script.txt \
  --calculate-timing \
  --output timing_breakdown.txt

Python API:

from scripts.audio_writer import AudioScriptWriter
from scripts.pronunciation import PronunciationGuide
from scripts.timing import TimingCalculator

# Initialize
writer = AudioScriptWriter()
pronouncer = PronunciationGuide()
timing = TimingCalculator()

# Read source material
with open("research_article.txt", "r") as f:
    content = f.read()

# Step 1: Convert to spoken format
script = writer.convert_to_audio(
    text=content,
    format="podcast",
    target_duration=15,  # minutes
    audience="general_medical"
)

# Step 2: Add production elements
script_with_cues = writer.add_production_cues(
    script=script,
    music_stings=True,
    transition_effects=True
)

# Step 3: Generate pronunciation guide
medical_terms = pronouncer.extract_terms(script_with_cues)
pronunciation_guide = pronouncer.create_guide(medical_terms)

# Step 4: Calculate timing
timing_analysis = timing.calculate(
    script=script_with_cues,
    speaking_rate=150  # WPM
)

# Export complete production package
writer.export_production_package(
    script=script_with_cues,
    pronunciation=pronunciation_guide,
    timing=timing_analysis,
    output_dir="podcast_production/"
)

Quality Checklist

Content Quality:

  • Written content accurate and current
  • Sources cited (even if not spoken)
  • Medical facts verified by expert
  • Appropriate for target audience level
  • No confidential patient information

Audio Optimization:

  • Sentences 15-20 words maximum
  • Abbreviations expanded on first use
  • Complex terms have pronunciation guides
  • Active voice preferred over passive
  • Transitions smooth and conversational

Production Quality:

  • Timing realistic for content density
  • Pacing cues appropriate for subject
  • Music/sound cues marked clearly
  • Pronunciation guide comprehensive
  • Script formatted for easy reading

Before Recording:

  • CRITICAL: Script read aloud for flow
  • Difficult pronunciations practiced
  • Timing tested with stopwatch
  • Technical terms confirmed with subject expert
  • Copyright cleared for any quoted material

Common Pitfalls

Content Issues:

  • Too dense → Information overload for listeners

    • ✅ Break complex topics into multiple episodes
  • Visual dependencies → "As shown in Figure 3..."

    • ✅ Describe visuals or omit visual-dependent content
  • Citation overload → Every sentence has reference

    • ✅ Save citations for show notes, not narration

Audio Issues:

  • Written-style language → "Furthermore, the aforementioned..."

    • ✅ Conversational: "Plus, this thing we talked about..."
  • No pauses → Relentless information delivery

    • ✅ Build in breathing room; let points sink in
  • Ignoring pronunciation → Mispronounced medical terms

    • ✅ Research and practice all technical terms

Production Issues:

  • Underestimating time → 10 minutes of script takes 12+ to record

    • ✅ Add 20% buffer for retakes and natural pacing
  • Complex sentence structures → Tongue twisters for narrator

    • ✅ Short sentences; avoid nested clauses

References

Available in references/ directory:

  • audio_writing_best_practices.md - Broadcast writing guidelines
  • medical_pronunciation_guide.md - Common terms phonetics
  • podcast_production_standards.md - Industry format standards
  • accessibility_guidelines.md - Inclusive audio content
  • platform_requirements.md - YouTube, Spotify, Apple specs
  • voice_care_tips.md - Narrator health and performance

Scripts

Located in scripts/ directory:

  • main.py - CLI interface for script conversion
  • audio_writer.py - Core text-to-audio transformation
  • pronunciation.py - Medical terminology phonetics
  • timing.py - Duration calculation and pacing
  • format_templates.py - Podcast, video, lecture templates
  • voice_direction.py - Narrator cues and direction
  • accessibility.py - Alternative format generation

Limitations

  • Voice Performance: Script is text only; actual delivery varies by narrator
  • Accent Variations: Pronunciation guides may not match all dialects
  • Cultural Context: Humor and references may not translate across cultures
  • Copyright: Cannot use copyrighted material without permission
  • Technical Accuracy: Does not verify medical content (input-dependent)
  • Live Elements: Cannot script unscripted interviews or Q&A

Parameters

ParameterTypeDefaultRequiredDescription
--input, -istring-NoInput text file path
--output, -ostring-NoOutput JSON file path (default: stdout)
--textstring-NoDirect text input (alternative to --input)
--duration, -dint5NoTarget duration in minutes
--pace, -pstringnormalNoSpeaking pace (slow, normal, fast)
--style, -sstringconversationalNoScript style (conversational, formal, educational)

Usage

Basic Usage

# Convert from file
python scripts/main.py --input article.txt --duration 5 --output script.json

# Direct text input
python scripts/main.py --text "Medical research findings..." --duration 3

# From stdin
cat article.txt | python scripts/main.py --duration 5 --style conversational

# With specific style and pace
python scripts/main.py --input paper.txt --style educational --pace slow

Risk Assessment

Risk IndicatorAssessmentLevel
Code ExecutionPython script executed locallyLow
Network AccessNo external API callsLow
File System AccessRead input files, write output filesLow
Instruction TamperingStandard prompt guidelinesLow
Data ExposureOutput saved only to specified locationLow

Security Checklist

  • No hardcoded credentials or API keys
  • No unauthorized file system access
  • Output does not expose sensitive information
  • Prompt injection protections in place
  • Input validation for file paths
  • Output directory restricted to workspace
  • Script execution in sandboxed environment

Prerequisites

# Python 3.7+
# No additional packages required (uses standard library)

Evaluation Criteria

Success Metrics

  • Successfully converts text to audio-optimized script
  • Expands abbreviations and converts numbers to words
  • Calculates estimated duration based on word count
  • Applies style-specific formatting
  • Provides pronunciation notes for medical terms

Test Cases

  1. Basic Conversion: Convert text file → Returns audio script with metadata
  2. Abbreviation Handling: Text with "e.g., i.e., etc." → All expanded in output
  3. Number Conversion: Input with "1 in 4" → Output with "one in four"

Lifecycle Status

  • Current Stage: Draft
  • Next Review Date: 2026-03-06
  • Known Issues: None
  • Planned Improvements:
    • Add support for custom abbreviation dictionaries
    • Integrate with text-to-speech engines
    • Add multilingual support

🎙️ Pro Tip: The best audio scripts sound natural when spoken. Always read your script aloud before finalizing—if you stumble over a sentence, your narrator will too. Revise for the ear, not the eye.

Files

4 total
Select a file
Select a file to preview.

Comments

Loading comments…