Assembly Large Audio Transcriber

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real AssemblyAI transcription skill, but it handles API keys and sensitive transcripts in ways users should review carefully before installing.

Install only if you are comfortable sending the audio to AssemblyAI and storing transcript outputs locally. Configure ASSEMBLYAI_API_KEY yourself through environment variables or a secrets mechanism, do not paste the key into chat, and avoid confidential, regulated, or privileged recordings unless local retention and third-party processing are approved.
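As a minimal sketch of the recommendation above, the key can be read from the environment at run time so it never passes through chat. The helper names `load_api_key` and `redact` are hypothetical and not part of the skill. Only the variable name `ASSEMBLYAI_API_KEY` comes from this report.

```python
import os

ENV_VAR = "ASSEMBLYAI_API_KEY"  # variable name taken from the report


def load_api_key(environ=None):
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if environ is None else environ
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{ENV_VAR} is not set; export it in your shell or secrets manager "
            "instead of pasting it into chat."
        )
    return key


def redact(key):
    """Mask the key for log output, keeping at most the last four characters."""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

Failing fast when the variable is missing avoids the fallback of asking the user to type the key into a prompt. Logging only the output of `redact` keeps the full key out of logs.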

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (9)

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The documentation tells users they can provide their AssemblyAI API key to the agent for configuration, which encourages direct credential disclosure to the skill/operator. This expands the trust boundary unnecessarily and creates risk of secret retention, misuse, or leakage through logs, prompts, or memory.

Description-Behavior Mismatch

Medium

Confidence: 75%
Finding: The workflow claims it will write raw transcripts to /workspace/memory, but the provided code examples do not implement that behavior. This discrepancy is security-relevant because reviewers and users cannot accurately assess when sensitive meeting content will be persisted.

Intent-Code Divergence

Medium

Confidence: 78%
Finding: The inline workflow explicitly instructs persistent transcript archival to /workspace/memory/meetings even though no example code shows how or when that persistence occurs. Hidden or underspecified storage of meeting transcripts increases the risk of unexpected retention of sensitive conversations.

Description-Behavior Mismatch

Low

Confidence: 93%
Finding: The script silently writes the full raw transcription response to a local JSON file, even though the skill description does not disclose this retention behavior. Because transcripts may contain sensitive audio-derived content, metadata, and speaker diarization details, this creates an avoidable data-at-rest exposure on the local system.
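One way to address this finding is to make local retention opt-in and to restrict file permissions. The following is a sketch under assumptions, not the skill's actual code: the function name `save_transcript` and the `keep_local` flag are hypothetical, and the owner-only mode applies on POSIX systems.

```python
import json
import os


def save_transcript(response, path, keep_local=False):
    """Write the transcript JSON only when the caller opts in; return the path or None."""
    if not keep_local:
        return None
    # O_EXCL refuses to overwrite an existing file; 0o600 limits access to the owner (POSIX).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(response, handle, ensure_ascii=False, indent=2)
    return path
```

With retention off by default, a transcript is stored only when the user passes `keep_local=True`, which makes the persistence visible in the calling code.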

Missing User Warnings

High

Confidence: 95%
Finding: Instructing users to share an API key with the agent, without a strong warning that it is a secret, normalizes unsafe credential handling. Users may expose paid-service credentials that could later be reused, leaked, or abused for unauthorized billing.

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill centers on uploading local audio files to AssemblyAI over HTTP, but it does not clearly warn users that their audio content is transmitted to an external service. For large recordings and meetings, this may expose highly sensitive personal or business information without informed consent.
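The missing warning could take the form of an explicit confirmation step before any upload. The sketch below is illustrative only: `confirm_external_upload` is a hypothetical helper, and it deliberately makes no AssemblyAI calls, so the upload itself is left to the caller.

```python
import os


def confirm_external_upload(path, ask=input):
    """Return True only if the user explicitly agrees to send the file off-machine."""
    size_mb = os.path.getsize(path) / (1024 * 1024)
    prompt = (
        f"{os.path.basename(path)} ({size_mb:.1f} MB) will be uploaded to AssemblyAI, "
        "a third-party service. Type 'yes' to continue: "
    )
    # Anything other than an explicit "yes" is treated as a refusal.
    return ask(prompt).strip().lower() == "yes"
```

The `ask` parameter defaults to `input` and can be swapped out, for example in automated runs where consent is recorded elsewhere.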

Missing User Warnings

Medium

Confidence: 84%
Finding: The workflow says transcripts will be written to workspace storage but does not warn users about persistent local retention. Raw transcripts of meetings often contain confidential, personal, or privileged information that should not be stored by default.

Ssd 3

Medium

Confidence: 94%
Finding: The skill explicitly asks users to disclose their AssemblyAI API key to the agent, which is unsafe secret-handling guidance. Credential collection is especially dangerous in agent contexts because prompts, memory, logs, and downstream tools may all increase exposure surfaces.

Ssd 3

Medium

Confidence: 86%
Finding: The workflow instructs archival of raw meeting transcripts into persistent memory storage, which can retain sensitive spoken content long after processing is complete. In agent environments, workspace memory may be accessible to other components, users, or future sessions if not carefully isolated.

VirusTotal

64/64 vendors flagged this skill as clean.
