
Security audit

Video Fetch — YouTube & Bilibili

Security checks across malware telemetry and agentic risk

Overview

The skill does what it claims: it fetches video subtitles or audio for YouTube/Bilibili and may use ElevenLabs or local Whisper to transcribe when subtitles are unavailable.

Install only if you are comfortable with the skill making network requests to video platforms and, when subtitles are missing, potentially sending downloaded audio to ElevenLabs for transcription. Use `--stt whisper` or `--stt none` for sensitive videos, store Bilibili cookies and API keys in protected files or environment variables, and use only proxies you trust.
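One way to follow the credential advice above is to read secrets from environment variables at run time and never print them in full. The sketch below is illustrative only: the variable names `ELEVENLABS_API_KEY` and `BILIBILI_COOKIE` and the helpers `load_secret` and `redact` are hypothetical and are not taken from the skill's code.

```python
import os
from typing import Mapping, Optional


def load_secret(name: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the named secret from the environment, or None if unset or blank."""
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    return value or None


def redact(secret: Optional[str]) -> str:
    """Return a log-safe form of a secret that never reveals the full value."""
    if not secret:
        return "<unset>"
    # Show a two-character prefix only for longer secrets; hide short ones entirely.
    return secret[:2] + "***" if len(secret) > 8 else "***"


if __name__ == "__main__":
    # Hypothetical variable names, chosen for illustration.
    api_key = load_secret("ELEVENLABS_API_KEY")
    cookie = load_secret("BILIBILI_COOKIE")
    print("ElevenLabs key:", redact(api_key))
    print("Bilibili cookie:", redact(cookie))
```

Keeping the lookup in one helper makes it easy to confirm that no secret is hardcoded and that log output only ever contains the redacted form.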

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (2)

Tainted flow: 'audio_url' from requests.get (line 107, network input) → requests.get (network output)

Severity: Medium
Category: Data Flow
Content:
print(f"INFO: Downloading audio stream from Bilibili...", file=sys.stderr)
dl_headers = _bili_headers(cookie)
dl_headers["Referer"] = f"https://www.bilibili.com/video/{bvid}"
r = requests.get(audio_url, headers=dl_headers, proxies=proxies, stream=True, timeout=(15, 300))
r.raise_for_status()
with open(m4s_path, "wb") as f:
    for chunk in r.iter_content(chunk_size=1024 * 64):
Confidence: 78%
Finding:
r = requests.get(audio_url, headers=dl_headers, proxies=proxies, stream=True, timeout=(15, 300))
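
A common mitigation for a flow like this is to check the URL returned by the platform API against an allowlist before requesting it. The following is a minimal sketch, not the skill's actual code: the helper `is_allowed_media_url` is hypothetical, and the host suffix in `ALLOWED_HOST_SUFFIXES` is a placeholder assumption that would need to be replaced with the CDN hosts actually observed for the platform.

```python
from urllib.parse import urlsplit

# Assumption: placeholder allowlist; replace with the hosts you actually expect.
ALLOWED_HOST_SUFFIXES = ("bilivideo.com",)


def is_allowed_media_url(url, allowed_suffixes=ALLOWED_HOST_SUFFIXES):
    """Return True only for https URLs whose host matches an allowed suffix."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Match the suffix itself or any subdomain of it, never a bare substring.
    return any(host == s or host.endswith("." + s) for s in allowed_suffixes)


if __name__ == "__main__":
    for candidate in (
        "https://cdn.bilivideo.com/audio.m4s",
        "http://cdn.bilivideo.com/audio.m4s",
        "https://bilivideo.com.evil.example/audio.m4s",
    ):
        print(candidate, "->", is_allowed_media_url(candidate))
```

A caller would run this check on `audio_url` and abort the download when it returns False, so that a manipulated API response cannot redirect the request (and the cookie-bearing headers) to an arbitrary host.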

Missing User Warnings

Severity: Medium
Confidence: 96%
Finding:
The default behavior uploads audio content to ElevenLabs without an explicit interactive warning or affirmative opt-in at the point of use. In an agent skill context, this increases privacy and data-governance risk because users may believe the tool only fetches transcripts/metadata while it can transmit full media-derived content to a third party.
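
The kind of point-of-use opt-in this finding asks for could look roughly like the sketch below. It is an assumption-laden illustration, not the skill's implementation: the function `confirm_third_party_upload` and its `assume_yes` parameter are hypothetical names, and the default answer is deliberately "no".

```python
import sys


def confirm_third_party_upload(provider, assume_yes=False, ask=input):
    """Warn on stderr and return True only after an affirmative opt-in."""
    print(
        f"WARNING: audio from this video will be uploaded to {provider} "
        "for transcription.",
        file=sys.stderr,
    )
    if assume_yes:
        # The caller has already opted in explicitly (for example via a flag).
        return True
    try:
        answer = ask("Continue with upload? [y/N] ")
    except EOFError:
        # No interactive input is available, so treat it as a refusal.
        return False
    return answer.strip().lower() in ("y", "yes")


if __name__ == "__main__":
    if confirm_third_party_upload("ElevenLabs"):
        print("Proceeding with third-party transcription.")
    else:
        print("Skipping upload; use local transcription or none instead.")
```

Defaulting to refusal when input is empty or unavailable means an agent running the tool non-interactively would not upload audio unless the opt-in was passed in deliberately.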

VirusTotal

65/65 vendors flagged this skill as clean.
