Voice Recognition

v1.0.0

Local speech-to-text with the OpenAI Whisper CLI. Supports Chinese, English, and 100+ other languages, with translation and summarization.

3 stars · 1.9k downloads · 6 current · 8 all-time

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for gykdly/voice-recognition.

Prompt Preview: Install & Setup
Install the skill "Voice Recognition" (gykdly/voice-recognition) from ClawHub.
Skill page: https://clawhub.ai/gykdly/voice-recognition
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install voice-recognition

ClawHub CLI

Package manager switcher

npx clawhub@latest install voice-recognition
Security Scan
VirusTotal
Benign
OpenClaw
Benign (high confidence)
Purpose & Capability
The name and description (local Whisper-based speech-to-text) match the included Python script and the SKILL.md. The README asks you to install openai-whisper via Homebrew and to use Python 3.10+, which is appropriate. Minor oddity: usage examples in SKILL.md hard-code an absolute path (/Users/liyi/.openclaw/workspace/...) pointing to a specific user's workspace; this is inconsistent with distributing the script and should be updated to relative or generic paths.
Instruction Scope
Runtime instructions simply run the included Python script which calls the external 'whisper' CLI (no shell=True). The script reads an audio file, writes a .txt transcript beside that file, and can generate a simple local summary. It does not read unrelated system files or environment variables, nor does it post data to remote endpoints. Note: first run will download model weights to ~/.cache/whisper (network and disk usage).
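For context, a wrapper matching that description typically builds an argument list and hands it to `subprocess.run` without `shell=True`, so the audio path is never shell-interpreted. The sketch below is hypothetical (not the skill's actual code) and assumes the standard openai-whisper CLI flags (`--task`, `--language`, `--output_format`, `--output_dir`):

```python
import subprocess
from pathlib import Path
from typing import Optional

def build_whisper_cmd(audio_path: str, language: Optional[str] = None,
                      task: str = "transcribe") -> list:
    """Build an argument list for the external `whisper` CLI.

    Passing a list (rather than a string with shell=True) means the
    audio path reaches the binary verbatim, never shell-expanded.
    """
    cmd = ["whisper", str(Path(audio_path)), "--task", task,
           "--output_format", "txt",
           "--output_dir", str(Path(audio_path).parent)]
    if language:
        cmd += ["--language", language]
    return cmd

def transcribe(audio_path: str, language: Optional[str] = None) -> Path:
    """Run whisper; the transcript is written beside the input file."""
    subprocess.run(build_whisper_cmd(audio_path, language), check=True)
    return Path(audio_path).with_suffix(".txt")
```

The list-based invocation is what "no shell=True" buys you: a filename like `foo; rm -rf ~.m4a` stays an ordinary argument.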
Install Mechanism
There is no install spec (instruction-only skill). The SKILL.md recommends 'brew install openai-whisper' which is a reasonable, low-risk installation path for the Whisper CLI.
Credentials
The skill requests no environment variables, no credentials, and no config paths. The behavior (invoking a local 'whisper' binary) is proportionate to the stated function. Reminder: because it calls an external binary by name, it depends on the 'whisper' in PATH being the expected implementation.
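Because the script calls `whisper` by name, you can confirm which binary would actually run before trusting it. `shutil.which` resolves a name against PATH the same way the subprocess call would; a small check, sketched here as a suggestion rather than part of the skill:

```python
import shutil

def resolve_binary(name: str) -> str:
    """Return the absolute path of `name` on PATH, or raise if absent."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"{name!r} not found on PATH")
    return path

# e.g. resolve_binary("whisper") might return /opt/homebrew/bin/whisper
```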
Persistence & Privilege
The skill does not request permanent/always inclusion, does not modify other skills, and contains no code that attempts to change system-wide agent settings. It only suggests an optional shell alias for convenience (user action).
Assessment
This skill appears to do what it says: a small Python wrapper that invokes the local OpenAI Whisper CLI and writes transcripts locally. Before installing/use: (1) install openai-whisper from a trusted source (Homebrew tap) so the 'whisper' binary on your PATH is legitimate; (2) be aware the first run will download model weights to ~/.cache/whisper (large download and disk usage); (3) update the SKILL.md usage examples to point to the script location on your system instead of the hard-coded /Users/liyi/... path, and only create the suggested alias if you trust the script location; (4) transcripts are written next to the input audio file — check permissions and disk location; (5) if you want to reduce risk, run the script in an isolated environment (container or VM) until you confirm behavior. No signs of credential exfiltration or remote endpoints were found in the included files.

Like a lobster shell, security has layers — review code before you run it.

latest: vk975qyjn3qvv9krmhcs0kw396n80zw0z
1.9k downloads
3 stars
1 version
Updated 2mo ago
v1.0.0
MIT-0

Voice Recognition (Whisper)

Local speech-to-text with OpenAI Whisper CLI.

Features

  • Local processing - No API key needed, free
  • Multi-language - Chinese, English, 100+ languages
  • Translation - Translate to English
  • Summarization - Generate quick summary

Usage

Basic

# Chinese recognition
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a

# Force Chinese
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --zh

# English recognition  
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --en

# Translate to English
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --translate

# With summary
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --summarize
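The security scan describes `--summarize` as producing a "simple local summary". The skill's exact method isn't documented; a naive extractive approach (keep the first few sentences) is one plausible shape, sketched here purely as an assumption:

```python
import re

def naive_summary(text: str, max_sentences: int = 3) -> str:
    """Hypothetical extractive 'summary': return the first few sentences.

    Splits on sentence-ending punctuation, including the CJK marks
    used in Chinese transcripts (。！？).
    """
    sentences = re.split(r"(?<=[.!?。！？])\s*", text.strip())
    return " ".join(s for s in sentences[:max_sentences] if s)
```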

Quick Command (add to ~/.zshrc)

alias voice="python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py"

Then use:

voice ~/Downloads/audio.m4a --zh

Requirements

  • OpenAI Whisper CLI: brew install openai-whisper
  • Python 3.10+

Files

  • scripts/voice识别_升级版.py - Main script
  • scripts/voice_tool_README.md - Documentation

Supported Formats

  • MP3, M4A, WAV, OGG, FLAC, WebM

Language Support

100+ languages including:

  • Chinese (zh)
  • English (en)
  • Japanese (ja)
  • Korean (ko)
  • And more...

Notes

  • Default model: medium (balance of speed and accuracy)
  • First run downloads model to ~/.cache/whisper
  • Processing time varies by audio length and model size
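If you want to see how much disk the downloaded weights are using, a quick check against the default cache location noted above (assuming whisper's default `~/.cache/whisper` and `.pt` weight files):

```python
import os
from pathlib import Path
from typing import Optional

def whisper_cache_dir() -> Path:
    """Default location where openai-whisper stores model weights."""
    return Path(os.path.expanduser("~/.cache/whisper"))

def cache_size_mb(cache: Optional[Path] = None) -> float:
    """Total size of downloaded model files in MB (0 if none yet)."""
    cache = cache or whisper_cache_dir()
    if not cache.exists():
        return 0.0
    return sum(f.stat().st_size for f in cache.glob("*.pt")) / 1e6
```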
