ReelTalk

v1.1.1

ReelTalk — drop any Instagram, TikTok, or YouTube Shorts URL and have a chat about it. Extracts audio or video frames, transcribes speech with Whisper, falls...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for ameshalexk/reeltalk.

Install the skill "ReelTalk" (ameshalexk/reeltalk) from ClawHub.
Skill page: https://clawhub.ai/ameshalexk/reeltalk
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required binaries: yt-dlp, whisper, tesseract, ffmpeg
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install reeltalk

ClawHub CLI

npx clawhub@latest install reeltalk

Security Scan

VirusTotal: Pending
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description request (Instagram/TikTok/YouTube Shorts transcription + OCR) lines up with the required binaries (yt-dlp, whisper, tesseract, ffmpeg) and the described workflow. No environment variables or unrelated services are requested.
Instruction Scope
The runtime steps stick to media download, audio extraction, Whisper transcription, frame extraction, and Tesseract OCR — all within the stated purpose. Two points to be aware of: (1) the notes suggest using yt-dlp's --cookies-from-browser for age-restricted or login-required content, which causes yt-dlp to access browser cookie stores (potentially sensitive browser profiles) even though no browser credentials are declared; (2) instructions optionally copy frames to $HOME/Desktop for OCR, which touches user home files. Both are plausible for the task but expand the agent's access beyond purely /tmp processing.
Install Mechanism
All install entries are Homebrew formulas (yt-dlp, whisper, tesseract, tesseract-lang). These are standard package installs rather than arbitrary URL downloads or extract actions, which is proportionate for the toolchain required.
Credentials
The skill requests no environment variables or external credentials (proportionate). However, use of --cookies-from-browser (optional) implicitly accesses browser data; this capability is not declared as a required config path or credential and should be considered when enabling the skill. Also the Whisper model will download (~461MB) and requires local disk space.
Persistence & Privilege
The skill is user-invocable, not always-on, and does not request persistent elevated privileges or modify other skill/system configurations. Normal installation behavior via brew is expected and limited.
Assessment
ReelTalk appears coherent for local transcription and OCR, but review these practical privacy and resource points before installing:

  • Browser cookies: The SKILL.md suggests using yt-dlp's --cookies-from-browser for gated content. That makes yt-dlp read your browser profile and cookies. Only use this if you trust the environment; where possible, supply a limited cookie file instead of allowing automatic browser extraction.
  • Local storage: Whisper will download a model (~461MB) on first run, and transcriptions and frames are written to disk (/tmp and optionally $HOME/Desktop). Ensure you have disk space and consider running in an environment where temporary files are safe.
  • File access: The skill may read and write /tmp and optionally your Desktop. If you need stricter isolation, run it in a sandbox or container, or change the paths to a dedicated working directory.
  • Copyright and terms: Downloading media may violate platform terms or copyright; ensure you have the right to process the content.
  • Keep tools up to date: yt-dlp, ffmpeg, and Whisper are security-relevant; install from official Homebrew packages and update regularly.

If you are comfortable with local disk use and, optionally, with allowing cookie access for gated content, the skill's behavior and requirements are proportionate to its stated purpose.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎬 Clawdis
OS: macOS · Linux
Bins: yt-dlp, whisper, tesseract, ffmpeg

Install

Install yt-dlp (brew)
Bins: yt-dlp
brew install yt-dlp
Install Whisper (brew)
Bins: whisper
brew install whisper
Install Tesseract OCR (brew)
Bins: tesseract
brew install tesseract
Install Hindi language pack for Tesseract (brew)
brew install tesseract-lang
latest: vk973g8b6nrgk227keaw24p4r3d85q915
Downloads: 79 · Stars: 0 · Versions: 4
Updated 7h ago
Version: v1.1.1
License: MIT-0
Platforms: macOS, Linux

ReelTalk

Accept any Instagram, TikTok, or YouTube Shorts URL and get a full transcription, plain-English summary, and the ability to keep asking follow-up questions about the content — all processed entirely locally.

What it does

  1. Receive any Instagram, TikTok, or YouTube Shorts URL from the user.
  2. Extract the audio track using yt-dlp and transcribe with Whisper.
  3. If no speech is detected (music-only, silent, or failed transcription), fall back to OCR: download the video, extract frames at 1 fps, run Tesseract OCR on each frame, and aggregate the text.
  4. Summarize what was said (or shown on screen) in plain English.
  5. Continue the conversation — the user can ask follow-ups, dig deeper, or discuss.

When to trigger

When the user shares a URL from any supported platform:

  • Instagram — reels, posts, stories, videos (instagram.com, instagr.am)
  • TikTok — videos (tiktok.com, vm.tiktok.com, vt.tiktok.com, tiktok.tv)
  • YouTube — Shorts (youtube.com/shorts/, youtu.be short links)

Also trigger when the user pastes a link without context (e.g., drops a bare URL into chat).
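
The trigger rule above could be sketched as a small host check. This is only an illustration, not the skill's actual matcher: the function name detect_platform and the PLATFORM_HOSTS table are hypothetical, and the host list simply mirrors the domains named above.

```python
from typing import Optional
from urllib.parse import urlparse

# Host suffixes copied from the platform list above (hypothetical table name).
PLATFORM_HOSTS = {
    "instagram": ("instagram.com", "instagr.am"),
    "tiktok": ("tiktok.com", "tiktok.tv"),
    "youtube": ("youtube.com", "youtu.be"),
}


def detect_platform(url: str) -> Optional[str]:
    """Return 'instagram', 'tiktok', 'youtube', or None for unsupported URLs."""
    url = url.strip()
    # Bare links pasted without a scheme still need one for urlparse to find the host.
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").lower()
    for platform, suffixes in PLATFORM_HOSTS.items():
        # Match the exact domain or any subdomain (www., vm., vt., m., ...).
        if any(host == s or host.endswith("." + s) for s in suffixes):
            if platform == "youtube" and host.endswith("youtube.com"):
                # On youtube.com only Shorts paths are in scope.
                return platform if parsed.path.startswith("/shorts/") else None
            return platform
    return None
```

Matching on the parsed hostname rather than a substring of the whole URL avoids triggering on links that merely mention a supported domain in their path.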

Workflow

Audio path (speech content)

  1. Run yt-dlp --list-formats <url> to find available formats.
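  2. For audio-only extraction: yt-dlp -f "bestaudio/best" -x --audio-format m4a -o "/tmp/reel_audio.%(ext)s" "<url>" (converting with -x --audio-format m4a guarantees the /tmp/reel_audio.m4a file that the next step expects, whatever container the platform serves)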
  3. Transcribe: whisper /tmp/reel_audio.m4a --model small --language en --task transcribe
  4. If transcription yields meaningful text → summarize and chat.
  5. Clean up: rm -f /tmp/reel_audio.*
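
Step 4 leaves "meaningful text" undefined. One possible heuristic is sketched below; it is an assumption, not the skill's documented logic. The function name needs_ocr_fallback and both thresholds (min_words, max_repeat_ratio) are hypothetical and would need tuning on real transcripts.

```python
import re
from collections import Counter


def needs_ocr_fallback(transcript: str,
                       min_words: int = 5,
                       max_repeat_ratio: float = 0.5) -> bool:
    """Guess whether a Whisper transcript is too thin or repetitive to trust."""
    # Keep only alphabetic tokens so music symbols and punctuation do not count.
    words = re.findall(r"[^\W\d_]+", transcript.lower())
    if len(words) < min_words:
        return True  # empty or very short output
    # Music is often transcribed as one word repeated over and over.
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words) > max_repeat_ratio
```

A transcript that passes this check goes on to summarization; one that fails is routed to the OCR path.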

OCR fallback path (text-on-screen / music-only)

If Whisper returns empty, very short, or clearly hallucinated output (e.g. music interpreted as words):

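  1. Download the highest-quality video format: yt-dlp -f "bv*+ba/b" --merge-output-format mp4 -o "/tmp/reel_video.mp4" "<url>" (with --merge-output-format mp4, separately downloaded video and audio streams are merged into an mp4 container, matching the file name used below)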
  2. Extract frames at 1 fps: ffmpeg -i /tmp/reel_video.mp4 -vf "fps=1" -vsync vfr -q:v 2 /tmp/reel_frame_%02d.jpg
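  3. OCR each frame with Tesseract (try English first, then Hindi if the hin language data is installed):
    • tesseract /tmp/reel_frame_XX.jpg stdout -l eng --psm 6
    • tesseract /tmp/reel_frame_XX.jpg stdout -l hin --psm 6
    • Alternatively, copy the frames to $HOME/Desktop/ first if the /tmp/ path causes issues.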
  4. Aggregate all OCR text, deduplicate similar frames, summarize the on-screen content.
  5. Clean up: rm -f /tmp/reel_video.mp4 /tmp/reel_frame_*.jpg
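
The aggregation in step 4 could look roughly like the sketch below. It assumes near-duplicate frames can be detected with a plain string-similarity ratio; the names ocr_frame and aggregate_ocr and the 0.9 threshold are hypothetical, and the Tesseract call simply wraps the command from step 3.

```python
import subprocess
from difflib import SequenceMatcher
from typing import Iterable, List


def ocr_frame(path: str, lang: str = "eng") -> str:
    """Run the Tesseract command from step 3 on one frame and return its text."""
    result = subprocess.run(
        ["tesseract", path, "stdout", "-l", lang, "--psm", "6"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def aggregate_ocr(frame_texts: Iterable[str], threshold: float = 0.9) -> List[str]:
    """Collapse whitespace, drop empty frames, and skip near-duplicate text."""
    kept: List[str] = []
    for raw in frame_texts:
        text = " ".join(raw.split())
        if not text:
            continue
        # Consecutive frames usually show the same caption with small OCR noise.
        if any(SequenceMatcher(None, text.lower(), prev.lower()).ratio() >= threshold
               for prev in kept):
            continue
        kept.append(text)
    return kept
```

Calling aggregate_ocr(ocr_frame(p) for p in sorted(frame_paths)) would yield the distinct on-screen captions in order, ready to summarize.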

TikTok-specific notes

  • TikTok blocks unauthenticated API access; yt-dlp handles extraction automatically.
  • TikTok videos often have watermarked and watermark-free formats; prefer h264_* formats.
  • Some TikTok URLs redirect (vt.tiktok.com, vm.tiktok.com); yt-dlp follows these automatically.

YouTube-specific notes

  • YouTube Shorts are served as standard video formats; no special handling needed.
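  • Use --cookies-from-browser <browser> (for example chrome or firefox) if age-restricted or login-required content fails. This lets yt-dlp read that browser's cookie store, so enable it only when needed.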
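
A cautious way to apply the cookie advice is to try the download without cookies first and retry with them only on failure. The sketch below assumes exactly that policy; build_args and download_audio are hypothetical names, the yt-dlp arguments mirror the audio command from the workflow, and the run parameter exists only so the logic can be exercised without calling yt-dlp.

```python
import subprocess
from typing import Callable, List, Optional


def build_args(url: str, cookies_browser: Optional[str] = None) -> List[str]:
    """Build the yt-dlp audio command, optionally adding browser cookies."""
    args = ["yt-dlp", "-f", "bestaudio", "-o", "/tmp/reel_audio.%(ext)s"]
    if cookies_browser:
        # Grants yt-dlp read access to that browser's cookie store.
        args += ["--cookies-from-browser", cookies_browser]
    args.append(url)
    return args


def download_audio(url: str,
                   cookies_browser: Optional[str] = None,
                   run: Callable = subprocess.run) -> bool:
    """Try without cookies first; retry with cookies only if that fails."""
    if run(build_args(url)).returncode == 0:
        return True
    if cookies_browser is None:
        return False  # the caller did not opt in to cookie access
    return run(build_args(url, cookies_browser)).returncode == 0
```

Keeping cookie access as an explicit opt-in argument means ordinary public links never touch the browser profile.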

Requirements

  • yt-dlp (Homebrew: brew install yt-dlp)
  • whisper (OpenAI Whisper; Homebrew: brew install whisper, or brew install openai-whisper if that formula name is not found; alternatively pip install openai-whisper)
  • tesseract (Homebrew: brew install tesseract)
  • ffmpeg (often already present alongside yt-dlp; otherwise brew install ffmpeg)
  • Optional: tesseract-lang for Hindi/other language OCR support (brew install tesseract-lang)

Notes

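  • All processing is local: nothing is sent to external APIs. The only network activity is the media download for the given URL and the one-time Whisper model download.
  • The small Whisper model, run here with --language en, balances speed and accuracy on CPU.
  • The first Whisper run downloads the model (~461MB); later runs use the cached copy.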
  • Text-on-screen reels with background music (common on Instagram/TikTok) will automatically fall through to OCR.
  • Copy frames to $HOME/Desktop/ for OCR if Tesseract has issues with /tmp/ paths (macOS extended attributes can interfere).
  • For long videos (>5 min), consider using the base Whisper model for speed, or extract and transcribe shorter segments.
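
The long-video advice above could be encoded as two small helpers. This is a sketch under assumptions: the 300-second cutoff is just the 5-minute figure from the note, and pick_whisper_model and segment_ranges are hypothetical names, not part of the skill.

```python
from typing import List, Tuple


def pick_whisper_model(duration_seconds: float,
                       long_video_threshold: float = 300.0) -> str:
    """Use the faster 'base' model past the threshold, otherwise 'small'."""
    return "base" if duration_seconds > long_video_threshold else "small"


def segment_ranges(duration_seconds: float,
                   segment_seconds: float = 300.0) -> List[Tuple[float, float]]:
    """Split a duration into consecutive (start, end) ranges in seconds."""
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    ranges = []
    start = 0.0
    while start < duration_seconds:
        end = min(start + segment_seconds, duration_seconds)
        ranges.append((start, end))
        start = end
    return ranges
```

Each (start, end) pair could then be cut out with ffmpeg and transcribed on its own, which keeps any single Whisper run short.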

Tags

instagram, tiktok, youtube, shorts, reel, audio, transcription, whisper, speech-to-text, ocr, tesseract, yt-dlp, ffmpeg, media, summarization, video, text-on-screen
