Local STT (Nvidia Parakeet + Whisper Support)
Pass
Audited by VirusTotal on May 12, 2026.
Overview
Type: OpenClaw Skill
Name: local-stt
Version: 1.0.0
The skill is designed for local speech-to-text and includes an optional feature to send transcriptions to a Matrix room. This involves reading `MATRIX_HOMESERVER` and `MATRIX_ACCESS_TOKEN` from environment variables (potentially from `~/.openclaw/.env` or `~/.env`) and making an outbound network request to a Matrix homeserver. This behavior, including the use of `ffmpeg` for audio conversion, is explicitly documented in `SKILL.md` and the `scripts/local-stt.py` docstring, and is aligned with the skill's stated purpose. There is no evidence of intentional harmful behavior, such as exfiltrating unrelated sensitive data, establishing persistence, or malicious prompt injection.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
First use may fetch and run external packages needed for transcription, so the user relies on package registry provenance and current package versions.
The script is designed to run through uv and resolve Python dependencies at runtime, but the packages are not version-pinned in the artifact.
#!/usr/bin/env -S uv run --script
# dependencies = [
#   "onnx-asr",
#   "onnxruntime",
#   "huggingface_hub",
#   "click",
#   "requests",
# ]
Declare uv as a requirement, pin dependency versions or include a lockfile, and document that models/packages may be downloaded on first use.
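As a rough illustration of the pinning recommendation, the sketch below flags dependency specifiers that are not locked to one exact version. The helper name `unpinned` and the regular expression are assumptions made for this example; they are not part of the skill, and a real audit would more likely rely on a lockfile.

```python
import re

# Assumption for this sketch: "pinned" means an exact "name==version" specifier
# with no wildcard. Extras such as "pkg[extra]==1.0" are tolerated.
_PINNED = re.compile(r"^[A-Za-z0-9_.\-]+(\[[^\]]+\])?\s*==\s*[^\s*]+$")


def unpinned(dependencies):
    """Return the specifiers that are not pinned to one exact version."""
    return [dep for dep in dependencies if not _PINNED.match(dep.strip())]


# Dependency list as quoted in the scan excerpt above.
deps = ["onnx-asr", "onnxruntime", "huggingface_hub", "click", "requests"]
print(unpinned(deps))
```

Run against the quoted dependency list, every entry is reported, which matches the finding that nothing in the artifact is version-pinned.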
If configured, the skill can post transcriptions to Matrix rooms accessible to that token.
When Matrix delivery is used, the script uses a Matrix access token to act on the user's Matrix account, although registry metadata declares no primary credential or required environment variables.
homeserver = os.environ.get("MATRIX_HOMESERVER")
access_token = os.environ.get("MATRIX_ACCESS_TOKEN")
headers = {"Authorization": f"Bearer {access_token}"}
Document the optional Matrix credential contract clearly, use the least-privileged token available, and only provide --room-id for rooms where posting transcripts is intended.
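One way to make the credential contract explicit is to fail early with a clear message when Matrix delivery is requested but a variable is missing. The following is a minimal sketch under that assumption; `MatrixConfigError` and `load_matrix_credentials` are hypothetical names, and only the two environment variable names and the `Bearer` header come from the excerpt above.

```python
import os


class MatrixConfigError(RuntimeError):
    """Raised when Matrix delivery is requested but credentials are incomplete."""


def load_matrix_credentials(env=None):
    """Return (homeserver, headers) or raise naming the missing variables."""
    env = os.environ if env is None else env
    homeserver = (env.get("MATRIX_HOMESERVER") or "").strip().rstrip("/")
    access_token = (env.get("MATRIX_ACCESS_TOKEN") or "").strip()

    # Collect every missing variable so the user sees them all at once.
    missing = [
        name
        for name, value in (
            ("MATRIX_HOMESERVER", homeserver),
            ("MATRIX_ACCESS_TOKEN", access_token),
        )
        if not value
    ]
    if missing:
        raise MatrixConfigError("Matrix delivery needs: " + ", ".join(missing))

    return homeserver, {"Authorization": f"Bearer {access_token}"}
```

Calling this only when `--room-id` is supplied would keep the default, local-only path free of any credential handling.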
Audio content that may have been expected to stay local can leave the device and become visible in the selected Matrix room.
With --room-id, the transcribed audio text is sent to an external Matrix room via the Matrix REST API.
payload = {
    'msgtype': 'm.text',
    'body': f'🎙️ {text}',
    ...
}
resp = requests.put(url, headers=headers, json=payload, timeout=10)
Use Matrix sending only intentionally, avoid sending sensitive audio transcripts to shared rooms, and consider adding an explicit confirmation or clearer documentation for this mode.
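The explicit confirmation suggested above could look roughly like the sketch below. The excerpt does not show how the script builds `url`, so `build_send_url` assumes the standard Matrix Client-Server endpoint `PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}`; both function names are hypothetical and nothing here performs a network request.

```python
import uuid
from urllib.parse import quote


def build_send_url(homeserver, room_id, txn_id=None):
    """Build a send-message URL (assumed endpoint; not taken from the script)."""
    txn_id = txn_id or uuid.uuid4().hex
    # Room IDs contain '!' and ':', so they are percent-encoded in full.
    return (
        f"{homeserver.rstrip('/')}/_matrix/client/v3/rooms/"
        f"{quote(room_id, safe='')}/send/m.room.message/{quote(txn_id, safe='')}"
    )


def confirm_send(room_id, text, ask=input):
    """Ask before a transcript leaves the device; anything but y/yes declines."""
    preview = text if len(text) <= 60 else text[:57] + "..."
    answer = ask(f"Send transcript to {room_id}? [{preview}] (y/N) ")
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" means an accidental Enter keypress keeps the transcript local, and passing `ask` as a parameter lets non-interactive callers supply their own policy.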
