Voice Transcribe

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).

MIT-0 · Free to use, modify, and redistribute. No attribution required.
12 · 4.5k · 25 current installs · 27 all-time installs
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The skill name/description (voice transcription via OpenAI) is reasonable, but the SKILL.md asks the user to put OPENAI_API_KEY in a hardcoded path (/Users/darin/.../.env) and to run 'uv run /Users/darin/clawd/skills/voice-transcribe/transcribe'. The package metadata declares no required env vars and includes no executable named 'transcribe'. That mismatch (hardcoded user path + undeclared credential + missing executable) is inconsistent with the stated purpose and deployment model.
Instruction Scope
The instructions tell humans/agents to run a 'transcribe' command at an absolute path and to store an OpenAI API key in a specific file — actions outside the skill bundle. They also mention caching and post-processing replacements. Because there is no included code or executable, the instructions are ambiguous and assume local artifacts and secrets that the skill metadata does not disclose.
Install Mechanism
There is no install spec (instruction-only), which is lower risk in itself. However, the absence of an install spec combined with references to running an external 'transcribe' binary means the runtime will rely on external tooling (uv and an executable/script) that is not provided; verify where that code comes from before running.
Credentials
Metadata claims no required env vars or primary credential, but SKILL.md explicitly instructs placing OPENAI_API_KEY into a local .env file. That is a direct mismatch: the skill needs an API key to function but does not declare it. Also the instructions encourage storing the key in a hardcoded, user-specific path, which is a poor and potentially unsafe practice.
Persistence & Privilege
The skill does not request always:true and does not declare persistent system-wide modifications. Autonomous invocation is allowed by default (normal). There is no evidence the skill attempts to change other skills or system settings.
What to consider before installing
Do not install or run this skill until the author clarifies and fixes these issues:
(1) The SKILL.md requires an OPENAI_API_KEY but the metadata lists none. Ask the author to declare required env vars (and prefer platform secret storage rather than a hardcoded .env file).
(2) The instructions require running a 'transcribe' executable at /Users/darin/... but no executable or install steps are provided. Ask where that binary comes from and request an install spec or included code.
(3) Confirm the role of 'uv' (astral.sh) and ensure you trust that runtime.
(4) Avoid placing API keys in arbitrary files; if you test, use a throwaway key and inspect the actual code that will run.
If the author cannot supply the missing files or a credible install source (e.g., a GitHub release or vetted package), treat this skill as unreliable and do not give it secrets or run it with sensitive audio.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.1
latestvk9798hzptmc339c8be36gfs6cx7ymmn7

SKILL.md

voice-transcribe

transcribe audio files using openai's gpt-4o-mini-transcribe model.

when to use

when receiving voice memos (especially via whatsapp), just run:

uv run /Users/darin/clawd/skills/voice-transcribe/transcribe <audio-file>

then respond based on the transcribed content.

fixing transcription errors

if darin says a word was transcribed wrong, add it to vocab.txt (for hints) or replacements.txt (for guaranteed fix). see sections below.

supported formats

  • mp3, mp4, mpeg, mpga, m4a, wav, webm, ogg, opus
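
a minimal sketch of checking a file against the list above before sending it for transcription. the function name `is_supported` is hypothetical; nothing here is taken from the actual transcribe script, which is not included in the bundle.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm", "ogg", "opus"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive).

    Hypothetical helper: it only inspects the file name, not the audio content.
    """
    extension = Path(path).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("/tmp/voice-memo.ogg", "notes.txt"):
        print(name, is_supported(name))
```

checking by extension alone is a cheap pre-filter; a file with the right extension can still contain audio the api rejects.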

examples

# transcribe a voice memo
transcribe /tmp/voice-memo.ogg

# pipe to other tools
transcribe /tmp/memo.ogg | pbcopy

setup

  1. add your openai api key to /Users/darin/clawd/skills/voice-transcribe/.env:
    OPENAI_API_KEY=sk-...
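
a sketch of how a script could pick up the key, assuming it prefers an already-exported OPENAI_API_KEY and falls back to a .env file whose path the caller supplies. the helper names `load_env_file` and `get_api_key` are hypothetical, and the real script may read the file differently.

```python
import os
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(env_path: Path) -> str:
    """Assumed lookup order: process environment first, then the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(env_path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(f"OPENAI_API_KEY not set and not found in {env_path}")
    return key
```

passing the .env location in as an argument (for example, the directory containing the script) avoids baking a user-specific absolute path into the code.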
    

custom vocabulary

add words to vocab.txt (one per line) to help the model recognize names/jargon:

Clawdis
Clawdbot
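
a sketch of turning vocab.txt into a hint string. this skill.md does not say how the script feeds hints to the model; one plausible route, assumed here, is the transcription api's `prompt` parameter. `load_vocab` and `build_prompt` are hypothetical names.

```python
from pathlib import Path


def load_vocab(path: Path) -> list[str]:
    """Read one term per line, skipping blanks and duplicates, keeping file order."""
    if not path.exists():
        return []
    words: list[str] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        word = line.strip()
        if word and word not in words:
            words.append(word)
    return words


def build_prompt(words: list[str]) -> str:
    """Join terms into a single hint string (assumed format: comma-separated)."""
    return ", ".join(words)
```

with the two example entries above, `build_prompt` yields `Clawdis, Clawdbot`. hints only nudge the model, which is why the replacements file exists as a guaranteed fallback.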

text replacements

if the model still gets something wrong, add a replacement to replacements.txt:

wrong spelling -> correct spelling
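
a sketch of parsing and applying the `wrong -> correct` format shown above. it assumes literal, case-sensitive substring replacement applied in file order; the actual script may match differently (for example on word boundaries). both function names are hypothetical.

```python
def parse_replacements(text: str) -> list[tuple[str, str]]:
    """Parse 'wrong -> correct' lines; lines without '->' are ignored."""
    pairs: list[tuple[str, str]] = []
    for line in text.splitlines():
        if "->" not in line:
            continue
        wrong, _, correct = line.partition("->")
        wrong, correct = wrong.strip(), correct.strip()
        if wrong:  # an empty left-hand side would match everywhere, so skip it
            pairs.append((wrong, correct))
    return pairs


def apply_replacements(transcript: str, pairs: list[tuple[str, str]]) -> str:
    """Apply each replacement in order as a literal substring substitution."""
    for wrong, correct in pairs:
        transcript = transcript.replace(wrong, correct)
    return transcript
```

because this runs on the transcript text after the model returns, a matching entry always takes effect, unlike a vocabulary hint.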

notes

  • assumes english (no language detection)
  • uses gpt-4o-mini-transcribe model specifically
  • caches by sha256 of audio file
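
a sketch of what caching by sha256 of the audio file could look like. the cache layout (one `<digest>.txt` file per recording) and the names `audio_digest`, `cached_transcribe`, and `transcribe_fn` are assumptions for illustration; the skill's real cache location and format are not documented here.

```python
import hashlib
from pathlib import Path
from typing import Callable


def audio_digest(path: Path) -> str:
    """Return the hex SHA-256 of the file contents, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_transcribe(
    audio: Path,
    cache_dir: Path,
    transcribe_fn: Callable[[Path], str],
) -> str:
    """Return a cached transcript if one exists for this content, else compute and store it.

    transcribe_fn is a stand-in for whatever actually calls the model.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{audio_digest(audio)}.txt"
    if entry.exists():
        return entry.read_text(encoding="utf-8")
    text = transcribe_fn(audio)
    entry.write_text(text, encoding="utf-8")
    return text
```

keying on the content hash rather than the file name means a renamed or re-downloaded copy of the same memo hits the cache, while an edited file does not.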

Files

3 total