Voice Recognition
Suspicious
Audited by ClawScan on May 10, 2026.
Overview
The skill mostly matches its local speech-to-text purpose, but it has a risky runtime import fallback that can load Python code from /tmp and an installer path that can alter system Python packages.
Review before installing. Prefer running it in an isolated virtual environment or container, do not run the installer as an administrator, and remove or disable the /tmp/whisper-venv import fallback before using it with sensitive audio. Expect network downloads for dependencies and Whisper models during setup or first use.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A malicious or accidental package placed under /tmp/whisper-venv could run code when the user transcribes audio, outside the documented skill environment.
The script prepends a /tmp-based site-packages directory before importing Whisper and other libraries. If that directory exists, Python package code from that location can execute during import.
_VENV_PATHS = ['/tmp/whisper-venv/lib/python3.12/site-packages', ...]; ... sys.path.insert(0, p); ... import whisper
Remove the /tmp import fallback or replace it with a verified per-skill virtual environment path with ownership and permission checks.
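One way the ownership and permission checks could look is sketched below. It is not taken from the skill: the helper names `is_trusted_dir` and `add_site_packages` are hypothetical, and the sketch assumes a POSIX system where `os.getuid` is available. It refuses any directory that is a symlink, is owned by another user, or is writable by group or others.

```python
import os
import stat
import sys
from pathlib import Path


def is_trusted_dir(path: Path) -> bool:
    """Return True only for a real directory that the current user owns
    and that group/others cannot write to (hypothetical helper)."""
    try:
        st = path.lstat()  # lstat: do not follow a symlink planted at this path
    except OSError:
        return False  # missing or unreadable path is never trusted
    if not stat.S_ISDIR(st.st_mode):
        return False  # rejects symlinks and regular files
    if st.st_uid != os.getuid():
        return False  # owned by someone else
    if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        return False  # group- or world-writable
    return True


def add_site_packages(path: Path) -> bool:
    """Prepend path to sys.path only if it passes the checks above."""
    if not is_trusted_dir(path):
        return False
    entry = str(path)
    if entry not in sys.path:
        sys.path.insert(0, entry)
    return True
```

A stricter version would apply the same check to every parent directory as well, since a writable parent lets another user replace the directory itself. Keeping the environment outside /tmp avoids most of that problem.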
Running the installer could modify the user's system Python environment and cause package conflicts beyond this skill.
If virtualenv setup or pip installation fails, the installer can fall back to the system pip and retry with --break-system-packages, which overrides the externally-managed-environment protection (PEP 668) without a separate approval step.
venv_dir = None ... pip_path = 'pip3' ... [pip_path, 'install', '--break-system-packages', pkg]
Keep installation inside a dedicated virtual environment, avoid --break-system-packages by default, and ask the user before making system-level package changes.
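A minimal sketch of an installer that stays inside a dedicated virtual environment is shown below. The function names (`venv_python`, `pip_command`, `install`) are illustrative and not part of the skill. It uses only the standard-library `venv` module and the environment's own interpreter, and it lets failures propagate instead of falling back to the system pip.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Path to the interpreter inside the virtual environment."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def pip_command(venv_dir: Path, packages: list) -> list:
    """Build the pip invocation; it never adds --break-system-packages."""
    return [str(venv_python(venv_dir)), "-m", "pip", "install", *packages]


def install(venv_dir: Path, packages: list) -> None:
    """Create the environment and install into it.

    Any failure raises, so the caller can stop and ask the user
    rather than silently modifying the system Python.
    """
    venv.EnvBuilder(with_pip=True).create(venv_dir)
    subprocess.run(pip_command(venv_dir, packages), check=True)
```

Because `check=True` raises `subprocess.CalledProcessError` on a failed install, any system-level fallback would have to be an explicit, user-approved step rather than an automatic retry.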
A later package update or supply-chain compromise could affect what code is installed.
The dependencies are expected for local transcription, but they are declared with lower bounds only and without hashes, so a future install may pull different code.
openai-whisper>=20231117 soundfile>=0.12.0 numpy>=1.21.0 torch>=2.0.0
Pin dependency versions and hashes, and document the expected package sources.
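As a rough illustration, a check like the following could flag requirement lines that are not pinned to an exact version. The function name `unpinned` is hypothetical, and the logic is a simple string test rather than a full requirement-specifier parser. It does not verify hashes; pip's `--require-hashes` option can enforce those at install time once they are recorded in the requirements file.

```python
def unpinned(requirements: str) -> list:
    """Return requirement lines that lack an exact '==' pin.

    Simplified on purpose: comments, blank lines, and pip option
    lines (starting with '-') are skipped; everything else must
    contain '=='.
    """
    flagged = []
    for raw in requirements.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            flagged.append(line)
    return flagged


# The specifiers quoted in the finding all use lower bounds only.
SPEC = """\
openai-whisper>=20231117
soundfile>=0.12.0
numpy>=1.21.0
torch>=2.0.0
"""

if __name__ == "__main__":
    for line in unpinned(SPEC):
        print("not pinned:", line)
```

Run against the four specifiers above, the check reports every line, which matches the finding.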
Users may be surprised by large downloads or assume the skill never needs network access.
The headline offline claim is partially contradicted by the disclosed first-run model download. There is no artifact evidence that audio is uploaded, but network access is still needed for setup/model retrieval.
No API keys. No internet required. 100% private. ... First run downloads the Whisper model (~139MB for base, ~461MB for small).
Clarify that transcription is local after dependencies and models are installed or cached.
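To make the "local after caching" condition concrete, a skill could check whether a model file is already on disk before claiming that no network access is needed. The sketch below is an assumption-laden illustration: it assumes the openai-whisper convention of storing models as `<name>.pt` under `$XDG_CACHE_HOME/whisper` (or `~/.cache/whisper`), and the helper names are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def default_cache_dir() -> Path:
    """Assumed default Whisper cache location (not confirmed by the skill)."""
    fallback = os.path.join(os.path.expanduser("~"), ".cache")
    return Path(os.getenv("XDG_CACHE_HOME", fallback)) / "whisper"


def model_is_cached(name: str, cache_dir: Optional[Path] = None) -> bool:
    """Return True if a non-empty '<name>.pt' file exists in the cache."""
    directory = cache_dir if cache_dir is not None else default_cache_dir()
    model_file = directory / f"{name}.pt"
    return model_file.is_file() and model_file.stat().st_size > 0


if __name__ == "__main__":
    if model_is_cached("base"):
        print("Model cached: transcription can run without network access.")
    else:
        print("Model not cached: first use will download it.")
```

A check like this only confirms that a file is present. It does not verify the file's integrity or that the dependencies themselves are installed.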
