Local Whisper

Pass. Audited by ClawScan on May 1, 2026.

Overview

This skill appears coherent for local speech-to-text, but it requires user-run setup scripts that download and build third-party components and persistently configure OpenClaw to transcribe incoming audio.

The skill looks reasonable for local Whisper speech-to-text. Before installing, confirm that you trust the upstream whisper.cpp repository and the Hugging Face model downloads. Setup also persistently changes OpenClaw's audio configuration so that incoming audio is passed to the local transcriber.

Findings (4)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Note (High Confidence)

ASI04: Agentic Supply Chain Vulnerabilities

What this means

Installation uses whatever code the upstream repository contains at setup time.

Why it was flagged

The install script fetches and builds whisper.cpp from the live upstream repository rather than a pinned commit or verified release.

Skill content

git clone https://github.com/ggerganov/whisper.cpp "$REPO" ... git -C "$REPO" pull --ff-only

Recommendation

Install only if you trust the upstream whisper.cpp source; consider pinning a known commit or release if you need reproducible installs.
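
One way to make the clone reproducible is to check out a full commit ID right after cloning. The following is a minimal sketch, not the skill's actual install script: `is_full_sha` and `clone_pinned` are hypothetical helper names, and the commit ID passed in must be one you have reviewed yourself.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the skill.

# Return 0 if $1 looks like a full 40-character lowercase hex commit ID
# (the SHA-1 object format).
is_full_sha() {
  case "$1" in
    *[!0123456789abcdef]*) return 1 ;;   # reject any non-hex character
  esac
  [ "${#1}" -eq 40 ]
}

# Clone the repository into $1 and check out exactly the commit in $2.
clone_pinned() {
  repo_dir="$1"
  commit="$2"
  is_full_sha "$commit" || { echo "refusing unpinned ref: $commit" >&2; return 1; }
  git clone https://github.com/ggerganov/whisper.cpp "$repo_dir" || return 1
  git -C "$repo_dir" checkout --detach "$commit"
}
```

Calling `clone_pinned "$REPO" "$PINNED_COMMIT"` in place of the clone-and-pull step would fail early when given a branch or tag name, which can move, instead of a commit ID. `PINNED_COMMIT` is a placeholder variable, not something the skill defines.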

Note (High Confidence)

ASI04: Agentic Supply Chain Vulnerabilities

What this means

Model files are downloaded from an external provider and trusted locally.

Why it was flagged

The model downloader retrieves model files from Hugging Face without checksum verification.

Skill content

BASE_URL="https://huggingface.co/ggerganov/whisper.cpp/resolve/main" ... curl -L --fail --retry 3 --retry-delay 2 -o "$out" "$url"

Recommendation

Use trusted network sources and verify model checksums if integrity is important in your environment.
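
Checksum verification could be added as a small step after each download. This is a sketch under stated assumptions: `sha256_of` and `verify_sha256` are hypothetical helpers, and the expected digest has to come from a source you trust, since the skill does not ship one.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the skill.

# Print the SHA-256 digest of file $1 using whichever common tool is present.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Return 0 only if file $1 exists and its digest equals $2.
verify_sha256() {
  [ -f "$1" ] || return 1
  actual="$(sha256_of "$1")" || return 1
  [ "$actual" = "$2" ]
}
```

After the `curl` download, a line such as `verify_sha256 "$out" "$EXPECTED_SHA256" || rm -f "$out"` would discard a file whose digest does not match. `EXPECTED_SHA256` is a placeholder; record its value independently of the download itself.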

Note (High Confidence)

ASI02: Tool Misuse and Exploitation

What this means

After setup, inbound audio can trigger the local transcription command automatically as part of OpenClaw audio handling.

Why it was flagged

The script configures OpenClaw to invoke a local CLI wrapper for audio media paths.

Skill content

openclaw config set --strict-json tools.media.audio.models "[ ... \"command\": \"$WRAPPER_PATH\", \"args\": [\"{{MediaPath}}\"], \"timeoutSeconds\": 120 ... ]"

Recommendation

Review the configured wrapper path and only enable this if you want OpenClaw to process inbound audio with the local command.
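
Part of that review can be scripted. The sketch below assumes a hypothetical helper, `check_wrapper`, that accepts a wrapper only if it is an absolute path to an executable regular file that group and other users cannot modify. It does not inspect what the wrapper does, so reading the script itself is still necessary.

```shell
#!/bin/sh
# Sketch only: check_wrapper is a hypothetical helper, not part of the skill.

# Return 0 if $1 is an absolute path to an executable regular file that
# neither group nor other users can write to.
check_wrapper() {
  case "$1" in
    /*) ;;                 # absolute path: continue
    *) return 1 ;;         # relative paths depend on the working directory
  esac
  [ -f "$1" ] && [ -x "$1" ] || return 1
  # find prints the path only if a group- or world-write bit is set.
  [ -z "$(find "$1" -prune \( -perm -020 -o -perm -002 \) -print)" ]
}
```

Running `check_wrapper "$WRAPPER_PATH"` before the `openclaw config set` call would stop setup from registering a wrapper that other local accounts could replace.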

Note (High Confidence)

ASI05: Unexpected Code Execution

What this means

The installation creates a local executable that OpenClaw can later run for transcription.

Why it was flagged

The user-directed setup compiles and installs a native executable from the downloaded whisper.cpp source.

Skill content

cmake -S "$REPO" -B "$BUILD" -DCMAKE_BUILD_TYPE=Release
cmake --build "$BUILD" -j"$(nproc)"
install -m 755 "$BUILD/bin/whisper-cli" "$HOME/.local/bin/whisper-cli"

Recommendation

Run the setup only on a machine where you are comfortable building and installing local native tools under your user account.
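
After installation, it can be worth confirming that the name `whisper-cli` resolves to the binary that was just installed and not to another copy earlier on `PATH`. A minimal sketch, using a hypothetical helper named `resolves_to`:

```shell
#!/bin/sh
# Sketch only: resolves_to is a hypothetical helper, not part of the skill.

# Return 0 if command name $1 resolves on PATH to exactly the file $2.
resolves_to() {
  found="$(command -v "$1")" || return 1
  [ "$found" = "$2" ]
}
```

For the install step quoted above, the check would be `resolves_to whisper-cli "$HOME/.local/bin/whisper-cli"`. A failure means either that `$HOME/.local/bin` is not on `PATH` or that a different `whisper-cli` takes precedence.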