Whisper Transcribe
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.
MIT-0 · Free to use, modify, and redistribute. No attribution required.
by @JosunLP
Security Scan
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (Whisper transcription) matches the included script and SKILL.md. Required tools (whisper CLI, ffmpeg) are appropriate for the stated functionality; no unrelated binaries or credentials are requested.
Instruction Scope
SKILL.md and the script only instruct running the local wrapper against user-provided audio files and writing transcripts to the same or a specified output directory. The script does not read extraneous system files, environment variables, or attempt network exfiltration itself.
Install Mechanism
This is an instruction-only skill with no install spec. SKILL.md recommends installing the openai-whisper package via pip, which downloads packages from the network, and the whisper runtime may download model files (megabytes to gigabytes, depending on the model) on first run. This network activity and package installation are expected for this purpose, but they are the primary point where external code and data are fetched. Run in a virtualenv or isolated environment if you want to limit risk.
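The isolation advice above can be sketched as a small shell function. This is a minimal sketch and not part of the skill: the function name `setup_whisper_venv` and the default directory `.venv-whisper` are hypothetical, and it assumes a Unix-like system where `python3` ships the `venv` module.

```shell
# Hypothetical helper: install openai-whisper into a dedicated virtualenv
# so the pip download stays out of the system Python.
setup_whisper_venv() {
    venv_dir="${1:-.venv-whisper}"   # assumed default location
    python3 -m venv "$venv_dir" || return 1
    # Use the venv's own pip so nothing is installed globally.
    "$venv_dir/bin/pip" install openai-whisper || return 1
    echo "Activate with: . $venv_dir/bin/activate"
}
```

Calling `setup_whisper_venv` and then sourcing the printed activate script should put the `whisper` CLI on PATH for the current shell only; ffmpeg still has to be installed separately.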
Credentials
No environment variables, credentials, or config paths are requested. The script does not access secrets or unrelated system configurations.
Persistence & Privilege
The skill sets always: false and has no install-time persistence. It does not modify other skills or system-wide settings and requests no elevated privileges.
Assessment
This skill appears to do what it claims: wrap the local 'whisper' CLI to transcribe audio. Before installing or using it: (1) ensure you trust the 'openai-whisper' pip package source and install it in a virtualenv to limit install-time risks, (2) install ffmpeg separately (the script assumes it exists), (3) be aware that models are downloaded at first run, may be large, and require network access and disk space, and (4) avoid running the tool on sensitive audio in untrusted environments. If you want extra assurance, inspect the pip package source before installing or run the transcription inside a container/VM.
Current version: v1.0.0
Tags: audio, latest, speech-to-text, srt, subtitles, transcription, whisper
SKILL.md
Whisper Transcribe
Transcribe audio with scripts/transcribe.sh:
# Basic (auto-detect language, base model)
scripts/transcribe.sh recording.mp3
# German, small model, SRT subtitles
scripts/transcribe.sh --model small --language de --format srt lecture.wav
# Batch process, all formats
scripts/transcribe.sh --format all --output-dir ./transcripts/ *.mp3
# Word-level timestamps
scripts/transcribe.sh --timestamps interview.m4a
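The contents of scripts/transcribe.sh are not shown here, so the following is only a sketch of how the wrapper options above might map onto the underlying whisper CLI. The function name `build_whisper_cmd`, the defaults, and the mapping itself are assumptions; the whisper flags used (`--model`, `--language`, `--output_format`, `--output_dir`, `--word_timestamps`) are real options of the openai-whisper CLI. The function prints the command rather than running it, and it does not quote file names that contain spaces.

```shell
# Hypothetical mapping from the wrapper's options to the whisper CLI.
# Prints the command instead of running it, so it can be inspected first.
build_whisper_cmd() {
    model="base"; language=""; format="txt"; outdir="."; words=""
    while [ $# -gt 0 ]; do
        case "$1" in
            --model)      model="$2";    shift 2 ;;
            --language)   language="$2"; shift 2 ;;
            --format)     format="$2";   shift 2 ;;
            --output-dir) outdir="$2";   shift 2 ;;
            --timestamps) words="yes";   shift ;;
            *) break ;;   # remaining arguments are audio files
        esac
    done
    cmd="whisper --model $model --output_format $format --output_dir $outdir"
    # Omitting --language lets whisper auto-detect the language.
    if [ -n "$language" ]; then cmd="$cmd --language $language"; fi
    if [ -n "$words" ]; then cmd="$cmd --word_timestamps True"; fi
    echo "$cmd $*"
}

build_whisper_cmd --model small --language de --format srt lecture.wav
# prints: whisper --model small --output_format srt --output_dir . --language de lecture.wav
```

Printing the command first is a deliberate choice for a sketch: it makes the assumed flag mapping easy to compare against what the real wrapper does.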
Models
| Model | RAM | Speed | Accuracy | Best for |
|---|---|---|---|---|
| tiny | ~1GB | ⚡⚡⚡ | ★★ | Quick drafts, known language |
| base | ~1GB | ⚡⚡ | ★★★ | General use (default) |
| small | ~2GB | ⚡ | ★★★★ | Good accuracy |
| medium | ~5GB | 🐢 | ★★★★★ | High accuracy |
| large | ~10GB | 🐌 | ★★★★★ | Best accuracy (slow on Pi) |
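One way to use the table above is to pick the largest model whose approximate memory need fits the machine. The helper below is a rough heuristic, not part of the skill: `pick_model` is a hypothetical name, it takes whole gigabytes as input, and it treats the table's approximate figures as hard thresholds with no headroom.

```shell
# Hypothetical helper: pick the largest model whose approximate memory
# need (from the table above) fits in the given number of whole gigabytes.
pick_model() {
    mem_gb="$1"
    if   [ "$mem_gb" -ge 10 ]; then echo large
    elif [ "$mem_gb" -ge 5 ];  then echo medium
    elif [ "$mem_gb" -ge 2 ];  then echo small
    elif [ "$mem_gb" -ge 1 ];  then echo base
    else echo tiny
    fi
}

pick_model 4   # prints "small"
```

The result can be passed straight to the wrapper, for example `scripts/transcribe.sh --model "$(pick_model 4)" recording.mp3`.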
Output Formats
- txt — Plain text transcript
- srt — SubRip subtitles (for video)
- vtt — WebVTT subtitles
- json — Detailed JSON with timestamps and confidence
- all — Generate all formats at once
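For batch jobs it can help to know which files to expect afterwards. The sketch below assumes the wrapper keeps the whisper CLI's naming convention, where each output is the audio file's base name plus the format extension, and that `all` means the four formats listed above. Both points are assumptions about the wrapper, and `expected_outputs` is a hypothetical name.

```shell
# Hypothetical helper: list the transcript files expected for one input.
# Assumes outputs are named <audio basename>.<format> in the output directory.
expected_outputs() {
    outdir="${1%/}"; audio="$2"; format="$3"
    name="$(basename "$audio")"
    stem="${name%.*}"                 # strip the last extension
    if [ "$format" = "all" ]; then format="txt srt vtt json"; fi
    for ext in $format; do
        echo "$outdir/$stem.$ext"
    done
}

expected_outputs ./transcripts/ lecture.wav srt
# prints: ./transcripts/lecture.srt
```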
Requirements
- whisper CLI (pip install openai-whisper)
- ffmpeg (for audio decoding)
- First run downloads the model (~150MB for base)
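Since the script is described as assuming these tools exist, a quick preflight check can catch a missing dependency before a long batch run. This is a minimal sketch; `missing_tools` is a hypothetical helper and not part of the skill.

```shell
# Hypothetical preflight check: print every required tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools whisper ffmpeg)"
if [ -n "$missing" ]; then
    # Unquoted on purpose: joins the newline-separated names with spaces.
    echo "Missing required tools:" $missing >&2
fi
```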
