Music Analysis

v3.0.2

Analyze music/audio files locally without external APIs. Extract tempo, pocket/groove feel, pulse stability, swing proxy, section/repetition structure, key c...

0· 565· 7 versions· 5 current· 5 all-time· Updated 1d ago· MIT-0

Install

openclaw skills install music-analysis

Music Analysis (Local, No External APIs)

Primary tool: a full listen that combines snapshot analysis, structure, groove, harmonic tension, temporal mood mapping, and optional Whisper lyric alignment into one report.

1. Full Listen — primary / recommended

python3 skills/music-analysis/scripts/listen.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/listen.py track.mp3 --json
python3 skills/music-analysis/scripts/listen.py track.mp3 --out report.txt
python3 skills/music-analysis/scripts/listen.py track.mp3 --json --out report.json

What it does in one pass:

  1. Snapshot analysis: tempo, pulse stability, swing proxy, key clarity, harmonic tension, timbre, structure
  2. Whisper lyric transcription and filtering first — keep only real lyric text, drop artifact tags like [MUSIC]
  3. Temporal listen: windowed energy / mood / tension journey
  4. Synthesis layer that aligns lyrics with peak / tension / quiet windows and lets the lyric layer override the final vibe when confidence is high

Human-readable output structure

  • SNAPSHOT
    • groove/pocket
    • structure summary + repeated sections
    • harmony (key clarity + tension)
    • timbre descriptor tags
  • INSTRUMENT READ
    • likely instrument palette (strong/likely/possible confidence)
    • per-section instrument entrances and exits
    • how instruments color the emotional feel
    • written as natural language, not clinical data
  • TEMPORAL JOURNEY
    • opening / middle / closing mood-energy-tension read
    • peak / quietest / tensest moments
    • mood journey and transition count
  • EMOTIONAL READ
    • explainable emotion summary based on measured features
  • LYRICS
    • Whisper segment count
    • excerpt or graceful skip note
  • SYNTHESIS
    • lyric-energy/tension alignment
    • peak / tension / quiet lyric moments
  • ALIGNED TIMELINE
    • per-window moments where transitions / lyrics / tension spikes occur

2. Snapshot Analysis — standalone

python3 skills/music-analysis/scripts/analyze_music.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/analyze_music.py track.mp3 --json

Reports:

  • tempo / pulse stability / pulse confidence / swing proxy / pocket
  • key estimate / key clarity / chroma entropy / harmonic change / tonal motion / tension
  • timbre descriptors (brightness, richness, low-end, contrast, dynamic range)
  • section labels (A/B/C...) and repeated material detection
  • explainable emotional read with reasons

3. Temporal Listen — standalone

python3 skills/music-analysis/scripts/temporal_listen.py /path/to/audio.mp3
python3 skills/music-analysis/scripts/temporal_listen.py track.mp3 --json

Reports:

  • sliding-window timeline (4s windows, 2s hops)
  • energy contour
  • mood labels
  • harmonic tension + tonal motion
  • transition types (drop hits, pulls back, tightens harmonically, shifts color, evolves)
  • narrative arc (mountain / ascending / descending / plateau / wave)

Interpretation rules

  • Structure labels are similarity labels, not verse/chorus claims.
  • Swing proxy is a feel estimate, not drummer-grade microtiming truth.
  • Emotion is explainable, derived from pulse + timbre + harmonic tension rather than a black-box mood guess.
  • Lyrics can override the final vibe when filtered Whisper text is confident and emotionally clear.

Audio sourcing

The tool needs a real audio file on disk.

  • Direct file (mp3, wav, flac, ogg, m4a — anything ffmpeg/librosa can read)
  • YouTube / supported URLs: yt-dlp -x --audio-format mp3 -o "output.mp3" "URL_OR_SEARCH"

Whisper lyrics transcription

listen.py uses:

  • CLI: /opt/homebrew/bin/whisper-cli
  • Model: ~/.local/share/whisper-cpp/ggml-large-v3-turbo.bin
  • Preprocess: convert input to mono 16kHz WAV via ffmpeg
  • Fallback: skip gracefully if Whisper is missing or errors

Dependencies

Python:

  • librosa
  • numpy

System:

  • ffmpeg
  • ffprobe

Workspace hygiene

  • Keep temporary audio files in a dedicated temp/output folder for the skill.
  • Avoid modifying unrelated project files while working on audio analysis tasks.

Version tags

latestvk97b8nzx6gjr9c2sm97cs5j5ys82zrqb