TG Voice Whisper Transcriber

v1.0.0

Automation skill for TG Voice Whisper Transcriber.

1· 3.5k·18 current·19 all-time
by RigdenDjapo @drones277
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name/purpose (transcribe Telegram .ogg with local Whisper) aligns with the required binaries (whisper, ffmpeg) and the instructions that process files in /root/.openclaw/media/inbound. No unrelated credentials or unrelated binaries are requested.
Instruction Scope
The runtime instructions operate on /root/.openclaw/media/inbound which is expected, but they contain ambiguous/buggy commands (use of the literal token PATH rather than an explicit filepath variable; the rm invocation 'rm PATH /tmp/whisper/*' appears incorrect and could fail to remove or remove the wrong files). The docs also instruct spawning a sub-agent/cron loop every 5s — this gives the skill persistent behavior and broad runtime impact. The instructions assume agent-side primitives like sessions_spawn/cron add exist and will be executed as written; they are permissive and not robustly specified.
Install Mechanism
Install spec uses apt to install ffmpeg (reasonable) and pip to install openai-whisper. openai-whisper will pull additional heavy dependencies (e.g., torch) and will download model weights on first run. The pip flag '--break-system-packages' is aggressive and can interfere with system-managed Python packages — this is a risky install step and should be justified or avoided. The SKILL.md understates dependency size and network activity (it claims 'fully offline after install' but model and package downloads occur during install/run).
Credentials
No environment variables, credentials, or unrelated config paths are requested; this is proportionate to the stated local transcription purpose.
Persistence & Privilege
Although always:false, the instructions explicitly tell the agent to spawn background sessions or add a cron job that runs every 5s. That creates persistent autonomous behavior and a continuous processing loop — a legitimate feature for auto-transcription but a notable privilege (background execution and recurring tasks). Users should confirm they want such persistent tasks and verify the exact commands before enabling them.
What to consider before installing
This skill is coherent with its stated purpose but has several concerning or ambiguous points you should resolve before installing:

1) Fix the command placeholders: the SKILL.md uses 'PATH' literally; ensure file path substitution is safe (e.g., use "$FILEPATH").
2) Correct the cleanup command: 'rm PATH /tmp/whisper/*' looks wrong and may not delete the intended file(s); use 'rm "$FILEPATH" /tmp/whisper/*' or explicit paths.
3) Reconsider '--break-system-packages' in pip install; prefer a virtualenv or container to avoid damaging system Python packages.
4) Expect large Python dependencies (torch) and model downloads during first run despite the 'offline after install' claim; ensure network access and disk space are acceptable.
5) The skill proposes spawning persistent background sessions or cron jobs every 5s: verify you want continuous background processing and reduce the frequency if needed.
6) Run this in a sandboxed environment (container or VM) if you are not comfortable with pip/system changes or persistent background agents.

If the author can provide corrected example commands, a safer install path (virtualenv), and an explanation for the --break-system-packages flag, confidence would increase.

Like a lobster shell, security has layers — review code before you run it.

latest vk971mfsvzm9jdms3wpx4sf1kgs80ts92
3.5k downloads
1 star
1 version
Updated 1mo ago
MIT-0

name: tg-voice-whisper
description: Auto-transcribe Telegram voice messages (.ogg Opus) to text using local OpenAI Whisper (tiny model). Reply with transcription + auto-delete file for privacy. No API keys, fully offline after install.
emoji: 🎙️🔊

requires:
  bins: [whisper, ffmpeg]

install:

  • kind: apt
    packages: [ffmpeg]
  • kind: pip
    packages: [openai-whisper]
    flags: [--break-system-packages]
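
As an alternative to the `--break-system-packages` flag, the pip step could target a dedicated virtual environment. The following is a minimal sketch, not the skill's own install path: `install_whisper_venv` is a hypothetical helper name, the default venv location is an assumed example, and on Debian-based systems `python3 -m venv` typically requires the `python3-venv` package.

```shell
# Sketch: install openai-whisper into its own virtual environment so that
# system-managed Python packages are left untouched.
# install_whisper_venv and the default path are assumptions for illustration.
install_whisper_venv() {
    local venv="${1:-$HOME/.venvs/whisper}"

    # Create the environment, then install with the environment's own pip.
    python3 -m venv "$venv" || return 1
    "$venv/bin/pip" install openai-whisper || return 1

    # Print where the whisper CLI should now live, so the caller can add it
    # to PATH or symlink it.
    echo "$venv/bin/whisper"
}
```

Calling `install_whisper_venv /opt/whisper-venv` would print `/opt/whisper-venv/bin/whisper` on success; the skill's `requires: bins` check would then need that directory on `PATH`.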

min-server: 4GB RAM recommended for fast, stable operation (with the model cached, transcription takes <1s on 1 vCPU).

Safe: local-only processing, files auto-removed after transcription, no external APIs, no stored credentials.

Usage (agent): When voice .ogg arrives in /root/.openclaw/media/inbound/:

exec 'whisper PATH --model tiny --language ru --output_format txt --output_dir /tmp/whisper; text=$(cat /tmp/whisper/*.txt); message action=send message="🔊 [text]" reply_to_current; rm PATH /tmp/whisper/*'
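
The one-liner above can be made more robust by passing the file path as a quoted variable and writing each transcript to a per-run scratch directory. This is a minimal sketch under those assumptions: `transcribe_voice` is a hypothetical helper name, the whisper flags are the ones used above, and sending the reply is left to the agent's own `message` primitive.

```shell
# Sketch: transcribe one voice file, print the text, then clean up.
# transcribe_voice is a hypothetical helper, not part of the skill.
transcribe_voice() {
    local file="$1"
    local lang="${2:-ru}"
    local outdir status

    # Refuse a missing file so rm never receives an empty or bogus path.
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }

    # A private scratch directory keeps concurrent runs from mixing transcripts.
    outdir=$(mktemp -d) || return 1

    # whisper writes <basename>.txt into the output directory.
    if whisper "$file" --model tiny --language "$lang" \
            --output_format txt --output_dir "$outdir" >/dev/null \
        && cat "$outdir"/*.txt; then
        status=0
    else
        status=1
    fi

    # Always remove the scratch directory; remove the audio only on success.
    rm -rf -- "$outdir"
    [ "$status" -eq 0 ] && rm -f -- "$file"
    return "$status"
}
```

With this helper, `text=$(transcribe_voice "$FILEPATH")` captures the transcript for the reply. Keeping the audio file when transcription fails is a design choice that allows a retry; it trades a little privacy for recoverability.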

Auto-setup (non-blocking): spawn a sub-agent or add a cron job that runs every 5s:

sessions_spawn task="LOOP: find /root/.openclaw/media/inbound/*.ogg -mmin -1 → background whisper ru tiny → message reply text → rm" label="voice-auto" cleanup="keep"

Or cron:

cron add job={"name":"voice-auto","schedule":{"kind":"every","everyMs":5000},"payload":{"kind":"systemEvent","text":"🔊 VOICE_CHECK"},"sessionTarget":"main"}
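
The scan step that both variants rely on (find recent `.ogg` files, then hand each one to a transcriber) could look like the following in plain shell. This is only a sketch: `process_inbound` and its two arguments are hypothetical names rather than OpenClaw primitives, and it assumes file names contain no newlines.

```shell
# Sketch: run a handler once for each .ogg file modified within the last minute.
# process_inbound is a hypothetical helper; the handler is any command that
# accepts a file path as its only argument.
process_inbound() {
    local dir="$1"
    local handler="$2"

    # -maxdepth 1 limits the scan to the inbound directory itself;
    # -mmin -1 matches files modified less than one minute ago.
    find "$dir" -maxdepth 1 -type f -name '*.ogg' -mmin -1 |
    while IFS= read -r file; do
        # </dev/null stops the handler from consuming the list of paths.
        "$handler" "$file" </dev/null || echo "failed: $file" >&2
    done
}
```

A polling loop would then be `while true; do process_inbound /root/.openclaw/media/inbound some_handler; sleep 5; done`, where `some_handler` stands in for whatever command performs the transcription. A longer sleep reduces background load at the cost of slower replies.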

Test: whisper /path.ogg --model tiny --language ru

Notes:

  • First run: ~15s model download (72MB, cached at ~/.cache/whisper/tiny.pt).
  • Cached: <1s on 1vCPU/4GB.
  • Languages: ru/en work best; omit --language to let Whisper auto-detect the language.
  • Accuracy: tiny gives 85-95% on ru speech; upgrade to base or small for better accuracy.
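
Because the model is downloaded on first use, the weights can be fetched ahead of time so that later transcriptions do not need network access. A minimal sketch, assuming the `whisper` Python package is importable by `python3`; `prefetch_whisper_model` is a hypothetical helper name.

```shell
# Sketch: download a Whisper model in advance.
# whisper.load_model fetches the weights into the cache directory
# (by default ~/.cache/whisper) if they are not already present.
prefetch_whisper_model() {
    local model="${1:-tiny}"
    python3 -c 'import sys, whisper; whisper.load_model(sys.argv[1])' "$model"
}
```

Running `prefetch_whisper_model` once after installation (or `prefetch_whisper_model base` for a larger model) moves the download out of the first transcription.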
