Local Whisper
Pass
Audited by ClawScan on May 1, 2026.
Overview
This skill appears coherent for local speech-to-text, but it requires user-run setup scripts that download and build third-party components and persistently configure OpenClaw to transcribe incoming audio.
This looks reasonable for local Whisper speech-to-text. Before installing, make sure you trust the upstream whisper.cpp repository and Hugging Face model downloads, and understand that the setup will persistently change OpenClaw's audio configuration to run the local transcriber for incoming audio.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Installing depends on whatever code is available from the upstream repository at setup time.
The install script fetches and builds whisper.cpp from the live upstream repository rather than a pinned commit or verified release.
git clone https://github.com/ggerganov/whisper.cpp "$REPO" ... git -C "$REPO" pull --ff-only
Install only if you trust the upstream whisper.cpp source; consider pinning a known commit or release if you need reproducible installs.
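One way to follow the pinning recommendation can be sketched as below. This is a minimal illustration, not the skill's actual install script: the function name clone_pinned and its arguments are hypothetical, and the commit hash is a value you would have to choose and review yourself.

```shell
# Hypothetical helper: clone_pinned REPO_URL DEST COMMIT
# Clones REPO_URL into DEST if it is not there yet, then checks out COMMIT
# in detached-HEAD state instead of following the live upstream branch.
# COMMIT must be a full 40-character commit hash that you have reviewed.
clone_pinned() {
  repo_url=$1
  dest=$2
  commit=$3

  if [ ! -d "$dest/.git" ]; then
    git clone --quiet "$repo_url" "$dest" || return 1
  else
    # Fetch new objects without moving the working tree (no "pull").
    git -C "$dest" fetch --quiet origin || return 1
  fi

  git -C "$dest" checkout --quiet --detach "$commit" || return 1

  # Confirm that HEAD resolves to exactly the requested commit.
  actual=$(git -C "$dest" rev-parse HEAD) || return 1
  if [ "$actual" != "$commit" ]; then
    echo "HEAD is $actual, expected $commit" >&2
    return 1
  fi
}

# Example call (the hash is a placeholder, not a recommended version):
#   clone_pinned https://github.com/ggerganov/whisper.cpp "$REPO" "<full-commit-hash>"
```

Because the final comparison is against the full hash, passing a branch name or an abbreviated hash fails the check, which keeps the pin unambiguous.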
Model files are downloaded from an external provider and trusted locally.
The model downloader retrieves model files from Hugging Face without checksum verification.
BASE_URL="https://huggingface.co/ggerganov/whisper.cpp/resolve/main" ... curl -L --fail --retry 3 --retry-delay 2 -o "$out" "$url"
Use trusted network sources and verify model checksums if integrity is important in your environment.
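The checksum recommendation could look roughly like the sketch below. It assumes you have obtained the expected SHA-256 value from a source you trust; the review does not say where such a value is published, and the function name verify_sha256 is hypothetical.

```shell
# Hypothetical helper: verify_sha256 FILE EXPECTED_HEX
# Returns 0 only if the SHA-256 digest of FILE equals EXPECTED_HEX
# (lowercase hex, as printed by sha256sum or shasum).
verify_sha256() {
  file=$1
  expected=$2

  # Use sha256sum where available (GNU coreutils), else shasum (macOS, Perl).
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi

  # A missing or unreadable file yields an empty digest and fails here too.
  if [ "$actual" != "$expected" ]; then
    echo "checksum mismatch for $file" >&2
    return 1
  fi
}

# Example call after a download (the expected value is a placeholder):
#   verify_sha256 "$out" "<expected-sha256>" || rm -f "$out"
```

Deleting the file on mismatch, as in the example call, avoids leaving an unverified model where the transcriber would later pick it up.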
After setup, inbound audio can trigger the local transcription command automatically as part of OpenClaw audio handling.
The script configures OpenClaw to invoke a local CLI wrapper for audio media paths.
openclaw config set --strict-json tools.media.audio.models "[ ... \"command\": \"$WRAPPER_PATH\", \"args\": [\"{{MediaPath}}\"], \"timeoutSeconds\": 120 ... ]"
Review the configured wrapper path and only enable this if you want OpenClaw to process inbound audio with the local command.
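As one possible aid for reviewing the wrapper path, the sketch below checks that the file behind it is a regular executable that other accounts cannot modify. The function name check_wrapper is hypothetical and the checks are only an illustration of what a review might cover; they are not part of the skill or of OpenClaw.

```shell
# Hypothetical helper: check_wrapper PATH
# Returns 0 if PATH is a regular, executable file that is neither
# group-writable nor world-writable; prints its listing for manual review.
check_wrapper() {
  wrapper=$1

  if [ ! -f "$wrapper" ]; then
    echo "not a regular file: $wrapper" >&2
    return 1
  fi
  if [ ! -x "$wrapper" ]; then
    echo "not executable: $wrapper" >&2
    return 1
  fi

  # find prints the path only if the group-write or other-write bit is set.
  loose=$(find "$wrapper" \( -perm -020 -o -perm -002 \) -print 2>/dev/null)
  if [ -n "$loose" ]; then
    echo "writable by group or others: $wrapper" >&2
    return 1
  fi

  # Show owner and mode so a human can confirm they look as expected.
  ls -l "$wrapper"
}

# Example call:
#   check_wrapper "$WRAPPER_PATH"
```

A passing check says nothing about what the wrapper does, so reading its contents remains part of the review.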
The installation creates a local executable that OpenClaw can later run for transcription.
The user-directed setup compiles and installs a native executable from the downloaded whisper.cpp source.
cmake -S "$REPO" -B "$BUILD" -DCMAKE_BUILD_TYPE=Release
cmake --build "$BUILD" -j"$(nproc)"
install -m 755 "$BUILD/bin/whisper-cli" "$HOME/.local/bin/whisper-cli"
Run the setup only on a machine where you are comfortable building and installing local native tools under your user account.
