# Local Whisper

Verdict: **Warn**. Audited by ClawScan on May 10, 2026.
## Overview
This skill is mostly a local transcription tool, but review is needed because it can fall back to OpenAI/Groq cloud transcription despite strong local-privacy claims, and one helper script has unsafe path handling.
Install only if you are comfortable reviewing and constraining it. For a truly local setup:

- Force the backend to MLX/local.
- Avoid exposing `OPENAI_API_KEY` or `GROQ_API_KEY` to the daemon.
- Run it in a virtual environment.
- Do not use `transcribe_large.sh` on untrusted filenames until it is fixed.
- Inspect any LaunchAgent before enabling auto-start.
## Findings (7)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A user may install this expecting all audio to stay local, while some environments could send voice messages to a cloud transcription provider. The user-facing description makes an unconditional local-only, no-API-key privacy promise, but the included transcriber code supports cloud OpenAI/Groq backends and can select them automatically.

> **Transcribe voice messages for free on Telegram and WhatsApp.** No API keys. No costs. Runs on your Mac. ... ✅ Private (audio never leaves your Mac)

**Recommendation:** Change the documentation to clearly disclose cloud backends, or remove/disable them by default and fail closed unless the user explicitly opts into a cloud backend.
Private voice messages may be transmitted to OpenAI or Groq under fallback conditions instead of staying on the Mac. The default auto backend can choose a cloud provider when MLX is unavailable and provider credentials exist, creating an external audio data flow that is not clearly described in the main skill instructions.

> Supports multiple backends:
> - MLX (Apple Silicon) - fastest local option
> - OpenAI Whisper API - cloud
> - Groq API - fast & cheap cloud

```python
elif OPENAI_AVAILABLE and os.getenv('OPENAI_API_KEY'):
    backend = "openai"
elif GROQ_AVAILABLE and os.getenv('GROQ_API_KEY'):
    backend = "groq"
```

**Recommendation:** Make local MLX the only default backend for this skill; require an explicit command-line flag or config setting before any provider API can receive audio.
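A fail-closed selection policy takes only a few lines. The sketch below is illustrative, not the skill's actual API: `choose_backend` and the `allow_cloud` opt-in flag are hypothetical names.

```python
import os

def choose_backend(requested=None, allow_cloud=False):
    """Pick a transcription backend without silently going to the cloud.

    Hypothetical sketch: a cloud backend is used only when the caller both
    names it and passes an explicit opt-in flag. Mere presence of
    OPENAI_API_KEY / GROQ_API_KEY in the environment never triggers it.
    """
    cloud = {"openai", "groq"}
    if requested in cloud and not allow_cloud:
        raise RuntimeError(f"backend '{requested}' sends audio off-device; "
                           "re-run with an explicit cloud opt-in")
    if requested:
        return requested
    return "mlx"  # local default, even if provider credentials exist

os.environ["OPENAI_API_KEY"] = "sk-example"   # credentials alone change nothing
print(choose_backend())                        # -> mlx
print(choose_backend("openai", allow_cloud=True))  # -> openai
```

The key property is that the fallback chain never reaches a cloud provider on its own; an unavailable MLX backend becomes a visible error rather than a silent upload.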
The skill could use a user's existing API account and incur provider usage without the registry-level credential contract making that clear. The code reads existing OpenAI/Groq credentials from the environment even though the registry metadata declares no required credentials or environment variables.

```python
self.api_key = api_key or os.getenv('OPENAI_API_KEY')
self.client = OpenAI(api_key=self.api_key)
...
self.api_key = api_key or os.getenv('GROQ_API_KEY')
self.client = Groq(api_key=self.api_key)
```

**Recommendation:** Declare optional credentials in metadata and require explicit user opt-in before using `OPENAI_API_KEY` or `GROQ_API_KEY`.
A maliciously named local audio file could alter the Python code executed by this helper script. The shell script interpolates the user-supplied audio path directly into generated Python source without Python string escaping.

```bash
/usr/bin/python3 << EOF
...
result = t.transcribe("$AUDIO_FILE")
print(result)
EOF
```

**Recommendation:** Pass the audio path as an argv parameter or environment variable and read it from Python safely, rather than embedding it directly into a here-document.
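The fix can be demonstrated directly: hand the filename to the interpreter as an argument and read it from `sys.argv`, so a hostile name stays inert data instead of becoming source text. This is a stand-alone sketch, not the skill's actual script.

```python
import subprocess
import sys

# The generated program reads the path from argv instead of having it
# spliced into its own source code.
snippet = (
    "import sys\n"
    "audio_path = sys.argv[1]\n"
    "print('would transcribe:', audio_path)\n"
)

# A filename crafted to escape a quoted Python string literal:
hostile = 'voice") and __import__("os").system("id") or ("'

result = subprocess.run(
    [sys.executable, "-c", snippet, hostile],
    capture_output=True, text=True, check=True,
)
print(result.stdout, end="")  # the hostile name is printed, never executed
```

In the shell wrapper itself, the equivalent change is along the lines of `/usr/bin/python3 - "$AUDIO_FILE" <<'PY' ... PY`: the path arrives as `sys.argv[1]`, and quoting the here-document delimiter additionally stops shell expansion inside the script body.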
Another local process could potentially obtain transcripts of audio files if it knows their paths. The daemon is bound to localhost, but any local process that can reach it can request transcription of an arbitrary file path readable by the user's account.

```python
audio_path = data.get('file') or data.get('path')
...
text = transcriber.transcribe(audio_path, language=language)
...
server = HTTPServer(('127.0.0.1', args.port), WhisperHandler)
```

**Recommendation:** Keep the service bound to 127.0.0.1, consider adding a local token or path allowlist, and avoid leaving the daemon running when not needed.
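A minimal shared-secret check can be sketched with the standard library alone. The token-distribution scheme here (generated at startup, written to a user-only file, echoed back by clients) is an assumption for illustration, not something the skill implements.

```python
import hmac
import secrets

# Generated once at daemon startup and written to a file readable only by
# the owning user (e.g. mode 0o600); clients send it with each request.
SERVICE_TOKEN = secrets.token_hex(16)

def authorized(presented):
    """Accept a request only if it carries the daemon's startup token."""
    if presented is None:
        return False
    # compare_digest avoids leaking the token through timing differences
    return hmac.compare_digest(presented, SERVICE_TOKEN)

print(authorized(SERVICE_TOKEN))  # True
print(authorized("wrong-token"))  # False
print(authorized(None))           # False
```

The handler would call `authorized()` before touching `audio_path`; a path allowlist (for example, restricting requests to a single download directory) would complement rather than replace the token.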
Installed package contents may change over time, and the extra cloud-provider libraries increase the installed attack surface. The dependency list uses unpinned minimum versions and includes cloud-provider clients in a skill advertised as local-only.

```text
openai>=1.12.0
groq>=0.4.0
faster-whisper>=1.0.0
lightning-whisper-mlx>=0.0.10; sys_platform == "darwin" and platform_machine == "arm64"
```

**Recommendation:** Pin reviewed dependency versions, preferably with hashes or a lockfile, and separate optional cloud dependencies from the local-only install path.
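One way to keep the cloud clients out of the default install is an extras split. The fragment below is an illustrative packaging sketch: the package name is hypothetical, and the pins simply reuse the versions named in the requirements file rather than independently reviewed releases.

```toml
[project]
name = "local-whisper"   # hypothetical package name
dependencies = [
    "faster-whisper==1.0.0",   # pinned to the reviewed version (illustrative)
]

[project.optional-dependencies]
# The default install stays local-only; cloud clients require an explicit
# `pip install "local-whisper[cloud]"`.
cloud = [
    "openai==1.12.0",
    "groq==0.4.0",
]
```

Combined with hash-checked requirements files or a lockfile, this makes "audio never leaves your Mac" a property of the default install rather than of runtime luck.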
The transcription daemon may keep running after login and continue accepting localhost transcription requests. The skill documents optional login persistence for the daemon, which is purpose-aligned but should remain an explicit user choice.

````markdown
## Auto-Start on Login

```bash
cp com.local-whisper.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.local-whisper.plist
```
````

**Recommendation:** Only enable auto-start if needed, inspect the LaunchAgent plist before loading it, and unload it when you no longer use the skill.
