v1.0.0

Yandex Speechkit STT via Telegram Gateway

Review

ClawScan verdict for this skill. Analyzed May 1, 2026, 6:13 AM.

Analysis

The skill can perform Yandex speech-to-text, but an included helper can continuously watch Telegram voice files, send transcripts to a fixed Telegram ID, and keep transcript text on disk.

Guidance

Review this skill carefully before installing. Prefer the manual yandex_stt.py flow on selected audio files. Remove or disable voice_processor.py unless you intentionally want a background monitor. Replace the hard-coded Telegram target with your own confirmed destination, and use a tightly scoped Yandex service-account key.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Abnormal behavior control

Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.

Rogue Agents
Severity: Medium; Confidence: High; Status: Concern
scripts/voice_processor.py
INBOX_DIR = '/home/mockingjay/.openclaw/media/inbound' ... while True: ... for filename in os.listdir(INBOX_DIR): ... send_to_openclaw(text, duration) ... time.sleep(2)

The helper is written as an indefinite auto-processor that watches all inbound OGG files and sends transcripts, rather than handling a single user-requested audio file.

User impact: If started, it can keep processing future inbound voice files outside the immediate user request.
Recommendation: Make processing explicitly user-invoked for a selected file, add clear stop conditions, and document any background mode before installation.
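One way to read this recommendation is a function that handles exactly one file the user named and then returns. The following is a minimal sketch, not the skill's actual code: `transcribe_once` and the `recognize` callable are hypothetical names, and `recognize` stands in for whatever speech-to-text call the skill uses.

```python
from pathlib import Path
from typing import Callable


def transcribe_once(audio_path: Path, recognize: Callable[[Path], str]) -> str:
    """Transcribe exactly one user-selected OGG file and return its text.

    `recognize` is a hypothetical stand-in for the skill's speech-to-text call.
    There is no loop and no directory scan, so the function stops after a
    single file.
    """
    if audio_path.suffix.lower() != ".ogg":
        raise ValueError(f"expected an .ogg file, got {audio_path.name!r}")
    if not audio_path.is_file():
        raise FileNotFoundError(f"no such audio file: {audio_path}")
    return recognize(audio_path)
```

Because the file is passed in explicitly, nothing is processed unless the user asks for it, and the stop condition is simply the function returning.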
Agentic Supply Chain Vulnerabilities
Severity: Low; Confidence: High; Status: Note
SKILL.md
"install": [ { "id": "pip", "kind": "pip", "packages": ["PyJWT", "cryptography", "requests"] } ]

The SKILL.md metadata lists unpinned pip dependencies used for JWT signing and HTTP calls. They are aligned with the skill's purpose, but the registry entry reports no install spec, so the two sources disagree.

User impact: Dependency installation may vary by environment and version, which can affect reviewability and reproducibility.
Recommendation: Pin dependency versions and align the registry install specification with the SKILL.md metadata.
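A reviewer could check for pinning mechanically. The sketch below flags any package entry that lacks an exact `==` version. The `unpinned` helper is a hypothetical name, and the regular expression is a simplification of pip's requirement syntax that ignores environment markers and URLs.

```python
import re
from typing import Iterable, List

# Accepts "name==version", optionally with extras such as "name[extra]==version".
# Wildcard versions like "==1.*" are deliberately treated as unpinned.
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9.!+_-]+$")


def unpinned(packages: Iterable[str]) -> List[str]:
    """Return the entries that do not specify an exact version."""
    return [p for p in packages if not _PINNED.match(p.strip())]


# The package list quoted from SKILL.md has no version specifiers at all.
print(unpinned(["PyJWT", "cryptography", "requests"]))
```

Run against the list quoted above, every entry is reported, which matches the finding.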
Permission boundary

Checks whether tool use, credentials, dependencies, identity, account access, or inter-agent boundaries are broader than the stated purpose.

Identity and Privilege Abuse
Severity: Low; Confidence: High; Status: Note
SKILL.md
"service_account_id": "your-service-account-id", "folder_id": "your-folder-id", "private_key": "-----BEGIN PRIVATE KEY-----\n..."

The skill asks for a Yandex service-account private key to mint IAM tokens; this is expected for SpeechKit access but is sensitive and not reflected in the registry credential fields.

User impact: A service-account key can grant access to Yandex Cloud resources beyond this skill if it is over-scoped or poorly protected.
Recommendation: Use a least-privilege Yandex service account limited to SpeechKit, protect config.json permissions, and ensure the credential requirement is explicitly declared.
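Protecting config.json permissions could look like the following sketch, which assumes a POSIX filesystem. The helper names `is_owner_only` and `restrict_to_owner` are hypothetical. The code only checks and tightens file mode bits and does nothing about how the key is scoped in Yandex Cloud.

```python
import os
import stat
from pathlib import Path


def is_owner_only(path: Path) -> bool:
    """True if neither group nor others have any permission on the file."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def restrict_to_owner(path: Path) -> None:
    """Set the file mode to 0600: read and write for the owner only."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

A loader could call `is_owner_only` on the file that holds the private key and refuse to continue until the mode has been tightened.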
Sensitive data protection

Checks for exposed credentials, poisoned memory or context, unclear communication boundaries, or sensitive data that could leave the user's control.

Insecure Inter-Agent Communication
Severity: High; Confidence: High; Status: Concern
scripts/voice_processor.py
cmd = [ 'openclaw', 'message', 'send', '--channel', 'telegram', '--target', '271578652', '--message', message ]

The recognized voice transcript is sent through the Telegram gateway to a hard-coded numeric target rather than a user-selected or documented destination.

User impact: Private voice-message transcripts could be delivered to the wrong Telegram account or chat if this helper is run.
Recommendation: Remove the hard-coded Telegram target, require the user/session to choose the destination, and require confirmation before sending transcripts.
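The recommendation can be sketched as a command builder that takes the target as a parameter, with a confirmation step before anything is sent. The command words and flags are copied from the excerpt above. The function names and the injected `confirm` and `run` callables are assumptions for illustration, not part of the skill.

```python
from typing import Callable, List


def build_send_command(target: str, message: str) -> List[str]:
    """Build the gateway command with a caller-supplied target, never a literal."""
    if not target.strip():
        raise ValueError("a Telegram target must be supplied explicitly")
    return [
        "openclaw", "message", "send",
        "--channel", "telegram",
        "--target", target,
        "--message", message,
    ]


def send_if_confirmed(
    target: str,
    message: str,
    confirm: Callable[[str], bool],
    run: Callable[[List[str]], object],
) -> bool:
    """Send only after `confirm` approves; return whether the send happened."""
    cmd = build_send_command(target, message)
    if not confirm(f"Send transcript to Telegram target {target}?"):
        return False
    run(cmd)
    return True
```

In practice `run` might wrap `subprocess.run` and `confirm` might prompt the user. Injecting both keeps the decision to send outside the helper.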
Memory and Context Poisoning
Severity: Medium; Confidence: High; Status: Concern
scripts/voice_processor.py
PROCESSED_FILE = f'{WORKSPACE}/.voice_processed.json' ... processed[file_hash] = text ... save_processed(processed)

The script stores the full recognized transcript text in a persistent workspace JSON file, not just a duplicate-processing marker.

User impact: Sensitive speech contents may remain on disk after transcription and could be read or reused later.
Recommendation: Store only non-sensitive identifiers needed for deduplication, or clearly document and secure transcript retention with user consent.
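A deduplication marker that keeps no transcript text could be sketched as below. It assumes that a SHA-256 digest of the audio file is an acceptable identifier. The function names and the store layout (a JSON list of digests) are hypothetical and differ from the `processed[file_hash] = text` mapping in the excerpt.

```python
import hashlib
import json
from pathlib import Path
from typing import Set


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of the file's bytes, used only as a dedup key."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_seen(store: Path) -> Set[str]:
    """Load the set of digests already processed; empty if no store exists."""
    if not store.exists():
        return set()
    return set(json.loads(store.read_text()))


def mark_seen(store: Path, digest: str) -> None:
    """Record a digest. No transcript text is ever written to the store."""
    seen = load_seen(store)
    seen.add(digest)
    store.write_text(json.dumps(sorted(seen)))
```

With this layout the store can still answer whether a file was already handled, but reading it reveals nothing that was said.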