Security audit

Web Video Transcribe DOCX

Security checks across malware telemetry and agentic risk

Overview

This is a coherent media transcription skill that downloads user-supplied or page-discovered media, transcribes it locally, and writes local transcript/DOCX outputs.

Install only if you are comfortable with a Python workflow that may install packages, load web pages in a local browser, download media/model files, and write transcripts and manifests to disk. Use it only for public or authorized media, avoid authenticated/private pages, and avoid passing Cookie or Authorization headers manually.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (5)

Lp3

Medium
Category
MCP Least Privilege
Confidence
94% confidence
Finding
The skill clearly instructs execution of Python scripts that use network, filesystem, environment inspection, and shell-adjacent capabilities, but it does not declare corresponding permissions in metadata. This creates a transparency and policy-enforcement gap: an agent or marketplace may not warn users appropriately before the skill downloads content, writes files, or invokes local tooling.

Missing User Warnings

Low
Confidence
83% confidence
Finding
The workflow downloads media to disk and generates transcript and DOCX artifacts that persist after the run, but the description does not explicitly warn that files will be written to disk. This is a real safety/usability issue because users may unknowingly process sensitive media or leave residual data on shared systems.
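One way to limit residual data is to keep intermediate downloads in a scratch directory that is deleted when the run ends, and to announce every file that is kept. The following is a minimal sketch, not the skill's actual code: `run_with_scratch_dir` and the `produce` callback are hypothetical names introduced here for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def run_with_scratch_dir(
    output_dir: Path,
    produce: Callable[[Path], Iterable[Path]],
) -> list[Path]:
    """Run `produce` in a temporary directory and keep only what it returns.

    `produce` (hypothetical) receives the scratch path, may download media
    there, and returns the artifacts worth keeping (e.g. transcripts).
    Everything else in the scratch directory is deleted on exit.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    kept: list[Path] = []
    with tempfile.TemporaryDirectory(prefix="transcribe-") as scratch:
        for artifact in produce(Path(scratch)):
            target = output_dir / artifact.name
            shutil.move(str(artifact), str(target))
            # Tell the user explicitly which files now exist on disk.
            print(f"Wrote {target}")
            kept.append(target)
    return kept
```

With this shape, downloaded media never outlives the run unless the caller returns it on purpose, and the printed paths serve as the disk-write notice the finding asks for.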

Vague Triggers

Medium
Confidence
90% confidence
Finding
The default prompt is very broad and can cause the skill to be invoked for a wide range of web pages or media URLs without clear user-confirmed boundaries. In a security-sensitive workflow that downloads remote media and processes arbitrary web content, vague trigger conditions increase the chance of unsafe or unintended use, including fetching untrusted resources or over-applying the skill.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The script hooks Playwright request/response events and stores request headers associated with captured media URLs. Even though headers are passed through a sanitizer, this still collects potentially sensitive session-linked metadata such as Referer, Origin, or other identifying headers from arbitrary pages without any explicit notice, minimization policy, or consent prompt. Because the tool is specifically meant to extract playable media from arbitrary web pages, the chance of processing authenticated or private media endpoints is higher.
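A minimization policy could take the form of an allow-list, where only headers known to be needed are stored and everything else is dropped by default. The sketch below is an assumption about how such a filter might look, not the skill's existing sanitizer; `ALLOWED_HEADERS` and `minimize_headers` are hypothetical names, and the chosen header set is illustrative only.

```python
from typing import Mapping

# Hypothetical allow-list: headers assumed safe to persist with a media URL.
# Cookie, Authorization, Referer, and Origin are dropped because they are
# simply not listed here.
ALLOWED_HEADERS = frozenset({"accept", "range", "user-agent"})


def minimize_headers(headers: Mapping[str, str]) -> dict[str, str]:
    """Return only allow-listed headers, matching names case-insensitively."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() in ALLOWED_HEADERS
    }


captured = {
    "Accept": "*/*",
    "Cookie": "session=abc123",
    "Referer": "https://example.com/private/page",
}
print(minimize_headers(captured))  # {'Accept': '*/*'}
```

An allow-list fails closed: an unfamiliar header is discarded rather than stored. Some media hosts do require a Referer to serve content, so a real policy might need to re-add it per request rather than persist it.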

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The code downloads a model archive from a remote URL and extracts it onto the local filesystem without verifying a checksum, signature, or pinned artifact digest. Although it performs a path traversal check during tar extraction, a compromised upstream release, or a man-in-the-middle attacker on a less trusted network, could deliver a malicious model archive and persist attacker-controlled files in the cache.
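Pinning a digest is one common mitigation: the expected SHA-256 of the archive is recorded alongside the download URL and checked before anything is extracted. The sketch below assumes a digest published by the model's maintainers is available; `sha256_of` and `verify_archive` are hypothetical helper names, not functions from the audited skill.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large archives need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: Path, expected_sha256: str) -> None:
    """Raise ValueError unless the file matches the pinned hex digest."""
    actual = sha256_of(path)
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"SHA-256 mismatch for {path}: got {actual}")
```

A caller would run `verify_archive(archive_path, PINNED_DIGEST)` right after the download and before tar extraction, where `PINNED_DIGEST` is a placeholder for the published value. This only helps if the digest comes from a channel the attacker cannot also modify.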

VirusTotal

67/67 vendors flagged this skill as clean.