Assembly Large Audio Transcriber

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real AssemblyAI transcription skill, but it handles API keys and sensitive transcripts in ways users should review carefully before installing.

Install only if you are comfortable sending audio to AssemblyAI and storing transcript outputs locally. Set ASSEMBLYAI_API_KEY yourself through an environment variable or a secrets mechanism, and do not paste the key into chat. Avoid confidential, regulated, or privileged recordings unless local retention and third-party processing are approved.
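As an illustration of the environment-variable approach, here is a minimal Python sketch. It is not taken from the skill; the helper names `load_api_key` and `redact` are hypothetical, and only the variable name ASSEMBLYAI_API_KEY comes from this report. The idea is that the key is read from the process environment and is never echoed in full.

```python
import os

ENV_VAR = "ASSEMBLYAI_API_KEY"  # variable name as given in this report


def load_api_key(env=None):
    """Return the API key from the environment, or raise if it is missing.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{ENV_VAR} is not set; export it in your shell or secrets manager."
        )
    return key


def redact(secret, visible=4):
    """Mask a secret for log output, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

With this pattern a log line can include `redact(load_api_key())` instead of the key itself, so the full secret does not end up in prompts, memory, or logs.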

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (9)

Context-Inappropriate Capability

Severity: Medium
Confidence: 93%
Finding
The documentation tells users they can provide their AssemblyAI API key to the agent for configuration, which encourages direct credential disclosure to the skill/operator. This expands the trust boundary unnecessarily and creates risk of secret retention, misuse, or leakage through logs, prompts, or memory.

Description-Behavior Mismatch

Severity: Medium
Confidence: 75%
Finding
The workflow claims it will write raw transcripts to /workspace/memory, but the provided code examples do not implement that behavior. This discrepancy is security-relevant because reviewers and users cannot accurately assess when sensitive meeting content will be persisted.

Intent-Code Divergence

Severity: Medium
Confidence: 78%
Finding
The inline workflow explicitly instructs persistent transcript archival to /workspace/memory/meetings even though no example code shows how or when that persistence occurs. Hidden or underspecified storage of meeting transcripts increases the risk of unexpected retention of sensitive conversations.

Description-Behavior Mismatch

Severity: Low
Confidence: 93%
Finding
The script silently writes the full raw transcription response to a local JSON file, even though the skill description does not disclose this retention behavior. Because transcripts may contain sensitive audio-derived content, metadata, and speaker diarization details, this creates an avoidable data-at-rest exposure on the local system.
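One way to avoid silent retention is to make persistence an explicit opt-in and to restrict file permissions. The sketch below is an assumption about how that could look, not the skill's actual script; `save_transcript` and its `allow_persist` flag are hypothetical names.

```python
import json
import os
from pathlib import Path


def save_transcript(response, path, allow_persist=False):
    """Write a transcript response to disk only when the caller opts in.

    Returns the path written, or None when persistence is not enabled.
    """
    if not allow_persist:
        return None
    path = Path(path)
    # Request owner-only permissions (0o600) when the file is created.
    # The mode applies only to newly created files and is largely
    # ignored on Windows.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(response, fh, ensure_ascii=False, indent=2)
    return path
```

Defaulting `allow_persist` to False means a caller has to request retention deliberately, which also gives the documentation a single place to disclose it.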

Missing User Warnings

Severity: High
Confidence: 95%
Finding
Instructing users to share an API key with the agent, without a strong warning that it is a secret, normalizes unsafe credential handling. Users may expose paid-service credentials that could later be reused, leaked, or abused for unauthorized billing.

Missing User Warnings

Severity: Medium
Confidence: 88%
Finding
The skill centers on uploading local audio files to AssemblyAI over HTTP, but it does not clearly warn users that their audio content is transmitted to an external service. For large recordings and meetings, this may expose highly sensitive personal or business information without informed consent.
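A simple mitigation is a consent prompt before any upload. The following Python sketch shows one possible shape for such a gate; `confirm_upload` is a hypothetical helper, and the upload itself is deliberately left out because the skill's code is not reproduced in this report.

```python
import os


def confirm_upload(audio_path, ask=input):
    """Ask the user to approve sending a local file to an external service.

    `ask` is any callable that takes a prompt and returns the user's reply;
    it defaults to input() and can be replaced in tests.
    """
    size_mb = os.path.getsize(audio_path) / (1024 * 1024)
    prompt = (
        f"{audio_path} ({size_mb:.1f} MB) will be sent to AssemblyAI, "
        "an external service. Proceed? [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    return ask(prompt).strip().lower() in ("y", "yes")
```

Treating an empty reply as "no" keeps the default on the safe side for long meeting recordings.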

Missing User Warnings

Severity: Medium
Confidence: 84%
Finding
The workflow says transcripts will be written to workspace storage but does not warn users about persistent local retention. Raw transcripts of meetings often contain confidential, personal, or privileged information that should not be stored by default.

Ssd 3

Severity: Medium
Confidence: 94%
Finding
The skill explicitly asks users to disclose their AssemblyAI API key to the agent, which is unsafe secret-handling guidance. Credential collection is especially dangerous in agent contexts because prompts, memory, logs, and downstream tools may all increase exposure surfaces.

Ssd 3

Severity: Medium
Confidence: 86%
Finding
The workflow instructs archival of raw meeting transcripts into persistent memory storage, which can retain sensitive spoken content long after processing is complete. In agent environments, workspace memory may be accessible to other components, users, or future sessions if not carefully isolated.
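If transcripts must be archived, a retention limit bounds how long they stay on disk. The sketch below assumes transcripts are stored as `.json` files in a single directory (for example the /workspace/memory/meetings path mentioned in this report) and that a 30-day window is acceptable; both are assumptions, and `purge_old_transcripts` is a hypothetical helper.

```python
import time
from pathlib import Path


def purge_old_transcripts(directory, max_age_days=30, now=None):
    """Delete .json files in `directory` last modified before the cutoff.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for path in Path(directory).glob("*.json"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Running a purge like this at the start of each session would keep older meeting content from lingering for future sessions or other components to read.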

VirusTotal

64/64 vendors flagged this skill as clean.
