audio-transcribe-summarize

Security checks across malware telemetry and agentic risk

Overview

This skill does what it advertises: it uploads user-chosen audio or video to SenseAudio for transcription, then saves transcript outputs locally.

Install only if you are comfortable sending selected recordings to SenseAudio under that provider's terms. Avoid confidential or regulated recordings unless you have approval. Use a revocable API key, and expect local transcript and raw JSON files to be written next to the source or at the chosen output path.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (5)

Tainted flow: 'files' from open (line 148, file read) → requests.post (network output)

High

Category: Data Flow
Content: else: data["timestamp_granularities[]"] = args.timestamps response = requests.post(API_URL, headers=headers, files=files, data=data, timeout=300) if response.status_code != 200: print(f"API Error ({response.status_code}): {response.text}")
Confidence: 96%
Finding: response = requests.post(API_URL, headers=headers, files=files, data=data, timeout=300)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 95%
Finding: The skill invokes capabilities that materially affect user security and privacy (reading environment secrets, writing files, making network requests to a third-party API, and using shell tooling) without declaring them. This reduces transparency and prevents informed consent or policy enforcement around sensitive operations, especially because audio content and API keys are involved.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 91%
Finding: The documented behavior understates what the skill can do and omits important processing such as translation, diarization, sentiment analysis, and raw response storage. This mismatch can mislead users and reviewers about the scope of data processing and retention, creating privacy and governance risks rather than a direct exploit primitive.

Missing User Warnings

Medium

Confidence: 98%
Finding: The skill does not clearly warn that user audio/video content is sent to an external service for processing. Because recordings may contain sensitive personal, business, or regulated information, lack of disclosure undermines user consent and can lead to privacy, compliance, and data-handling violations.

Missing User Warnings

Medium

Confidence: 91%
Finding: The code uploads audio to a remote service without presenting a just-in-time privacy warning at the moment of transmission. Because this tool is specifically for transcribing recordings, users may reasonably invoke it on meetings, interviews, or lectures containing sensitive speech, making the lack of an explicit runtime warning more significant than in a clearly cloud-only workflow.
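
One way to address this finding is a confirmation gate that runs immediately before the upload. The following is a minimal sketch, not the skill's actual code: the function name confirm_upload, its parameters, and the prompt wording are all hypothetical, and the default of refusing on anything other than an explicit "y" or "yes" is an assumed design choice.

```python
from typing import Callable


def confirm_upload(path: str, host: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only if the user explicitly approves sending `path` to `host`.

    Hypothetical helper: `ask` defaults to the built-in input() and can be
    replaced in tests or non-interactive runs.
    """
    prompt = (
        f"About to upload '{path}' to {host} for transcription.\n"
        "The recording will leave this machine. Continue? [y/N] "
    )
    answer = ask(prompt).strip().lower()
    # Anything other than an explicit yes (including an empty reply) is a refusal.
    return answer in ("y", "yes")
```

A caller would invoke this just before the requests.post call reported in the first finding and skip the upload when it returns False, so the warning appears at the moment of transmission rather than only in the documentation.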

VirusTotal

66/66 vendors flagged this skill as clean.
