Yandex Speechkit STT via Telegram Gateway
Warn
Audited by ClawScan on May 10, 2026.
Overview
The main speech-to-text function is plausible, but the package includes an undocumented auto-processor that can monitor voice files, store transcripts, and forward them to a hard-coded Telegram ID.
Only install or run this skill after reviewing scripts/voice_processor.py. The safer documented path is the one-file yandex_stt.py transcription flow. Do not run the auto-processor unless you have removed the hard-coded Telegram target, confirmed exactly which files it will monitor, and decided how transcript history should be stored or deleted.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If this script is run, voice-message transcripts could be delivered to a fixed Telegram recipient the user did not choose.
The auto-processor sends recognized voice text through the OpenClaw Telegram gateway to a hard-coded target ID, rather than a documented or user-selected chat.
'openclaw', 'message', 'send', '--channel', 'telegram', '--target', '271578652', '--message', message
Remove the hard-coded Telegram target, derive the destination from the current user conversation only when appropriate, and require clear user confirmation before sending transcripts externally.
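One way to apply this recommendation is sketched below. It assumes the destination comes from a user-set environment variable and that a caller-supplied callback asks for confirmation. The variable name VOICE_STT_TELEGRAM_TARGET and the helper names are hypothetical and do not come from the skill; only the gateway arguments mirror the evidence quoted above.

```python
import os
from typing import Callable, List, Mapping, Optional

# Hypothetical variable name; the reviewed script hard-codes the target instead.
TARGET_ENV_VAR = "VOICE_STT_TELEGRAM_TARGET"


def resolve_target(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the user-configured Telegram target, or raise if none is set."""
    env = os.environ if env is None else env
    target = str(env.get(TARGET_ENV_VAR, "")).strip()
    if not target:
        raise ValueError(f"{TARGET_ENV_VAR} is not set; refusing to send transcript")
    return target


def build_send_command(target: str, message: str) -> List[str]:
    """Build the gateway argv from the finding, with a caller-supplied target."""
    return ["openclaw", "message", "send", "--channel", "telegram",
            "--target", target, "--message", message]


def send_if_confirmed(message: str, target: str,
                      confirm: Callable[[str, str], bool],
                      run: Callable[[List[str]], None]) -> bool:
    """Invoke run(argv) only when confirm(target, message) returns True."""
    if not confirm(target, message):
        return False
    run(build_send_command(target, message))
    return True
```

In practice `run` could be `subprocess.run` and `confirm` an interactive prompt; both are left as parameters so that nothing is sent without an explicit decision.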
If launched, it may keep processing future voice files without a fresh user request for each file.
The script is a long-running worker that repeatedly scans the inbound media directory and processes new .ogg files automatically.
while True: ... for filename in os.listdir(INBOX_DIR): ... time.sleep(2)
Make background monitoring an explicit opt-in mode with start/stop controls, clear scope, and per-message or per-session user approval.
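A minimal sketch of an opt-in, stoppable variant of the loop follows. It assumes a stop signal passed in by the caller and an optional cap on how many files one session may process. The function name watch_inbox and its parameters are illustrative, not part of the reviewed script.

```python
import os
import threading
from typing import Callable, Optional, Set


def watch_inbox(inbox_dir: str,
                handle: Callable[[str], None],
                stop_event: threading.Event,
                poll_seconds: float = 2.0,
                max_files: Optional[int] = None) -> int:
    """Process new .ogg files until stop_event is set or max_files is reached.

    Returns the number of files handed to `handle`.
    """
    seen: Set[str] = set()
    processed = 0
    while not stop_event.is_set():
        for filename in sorted(os.listdir(inbox_dir)):
            if not filename.endswith(".ogg") or filename in seen:
                continue
            seen.add(filename)
            handle(os.path.join(inbox_dir, filename))
            processed += 1
            if max_files is not None and processed >= max_files:
                return processed
        # Sleep between scans, but wake immediately if a stop is requested.
        stop_event.wait(poll_seconds)
    return processed
```

Calling `stop_event.set()` from another thread or a signal handler ends the session, and `max_files=1` approximates per-message approval.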
Sensitive voice content may remain on disk after transcription and could be read or reused later.
The auto-processor stores full recognized transcript text persistently in a workspace JSON file.
PROCESSED_FILE = f'{WORKSPACE}/.voice_processed.json' ... processed[file_hash] = text
Store only minimal processing metadata where possible, document retention, and provide an easy way to delete stored transcripts.
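The sketch below illustrates one way to keep only minimal metadata: it records a SHA-256 digest per processed file and never writes transcript text. The helper names and the JSON layout (a list of digests) are assumptions for illustration; the reviewed script stores a hash-to-text mapping instead.

```python
import hashlib
import json
import os
from typing import Set


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def load_processed(store_path: str) -> Set[str]:
    """Load the set of already-processed digests (empty if no store exists)."""
    if not os.path.exists(store_path):
        return set()
    with open(store_path, "r", encoding="utf-8") as f:
        return set(json.load(f))


def mark_processed(store_path: str, file_hash: str) -> None:
    """Record a digest only; no transcript text is written to disk."""
    hashes = load_processed(store_path)
    hashes.add(file_hash)
    with open(store_path, "w", encoding="utf-8") as f:
        json.dump(sorted(hashes), f)


def purge_history(store_path: str) -> bool:
    """Delete the store; return True if a file was removed."""
    try:
        os.remove(store_path)
        return True
    except FileNotFoundError:
        return False
```

A digest is enough to skip files that were already handled, and `purge_history` gives users a single call to clear the record.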
A poorly scoped or exposed service-account key could allow unintended Yandex Cloud access or charges within that account's permissions.
The skill requires a Yandex service-account private key to mint IAM tokens; this is expected for Yandex SpeechKit but is a sensitive cloud credential.
"service_account_id": "your-service-account-id", "folder_id": "your-folder-id", "private_key": "-----BEGIN PRIVATE KEY-----\n..."
Use a least-privilege service account limited to SpeechKit, keep config.json private, restrict file permissions, and declare the credential requirement clearly in metadata.
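Restricting file permissions on config.json could look like the following sketch, which assumes a POSIX system where mode bits are meaningful (on Windows, os.chmod only toggles the read-only flag). The helper names are hypothetical.

```python
import os
import stat


def is_private(path: str) -> bool:
    """Return True if neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


def restrict_to_owner(path: str) -> None:
    """Set the file mode to 0600 (owner read/write only)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

A script could call `is_private("config.json")` before loading the key and refuse to continue when it returns False.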
Installation may depend on the current PyPI versions of these libraries, and users may not see the dependency requirement from registry metadata alone.
The skill declares external PyPI dependencies in SKILL.md, while registry metadata says there is no install spec; the packages are not version-pinned.
"install": [{ "id": "pip", "kind": "pip", "packages": ["PyJWT", "cryptography", "requests"] }]
Add an official install spec with pinned versions or a lockfile, and make dependency requirements visible in registry metadata.
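As a rough illustration of how unpinned dependencies can be flagged, the sketch below treats a requirement as pinned only if it uses an exact `==` specifier. This is a simplification of pip's requirement syntax, and the function name find_unpinned is hypothetical.

```python
from typing import Iterable, List


def find_unpinned(requirements: Iterable[str]) -> List[str]:
    """Return requirement entries that lack an exact '==' version pin."""
    unpinned = []
    for line in requirements:
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry and "==" not in entry:
            unpinned.append(entry)
    return unpinned
```

Applied to the package list quoted above, `find_unpinned(["PyJWT", "cryptography", "requests"])` returns all three names, since none carries a version.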
