Security audit

Bilibili Transcriber

Security checks across malware telemetry and agentic risk

Overview

This is a genuine Bilibili transcription skill, but it needs review for two reasons: its documentation contradicts itself on whether audio stays local, and its cloud mode can send media, and possibly an unrelated API key, to Alibaba DashScope.

Install only if you are comfortable managing cloud transcription deliberately. Use local Whisper mode for sensitive or private videos, avoid exposing unrelated OPENAI_API_KEY credentials in the environment, and review any run that lacks official subtitles before allowing audio to be uploaded to DashScope.
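One way to keep an unrelated credential out of the skill's reach is to launch it with a filtered copy of the environment. The following is a minimal sketch, not part of the audited skill: the helper name `scrubbed_env` and the default blocked list are assumptions chosen for illustration.

```python
import os


def scrubbed_env(env=None, blocked=("OPENAI_API_KEY",)):
    """Return a copy of the environment without the blocked variable names.

    `blocked` is an assumed default; extend it with any other secrets
    that the transcription run does not need.
    """
    source = dict(os.environ if env is None else env)
    return {name: value for name, value in source.items() if name not in blocked}


if __name__ == "__main__":
    sample = {"PATH": "/usr/bin", "OPENAI_API_KEY": "sk-example"}
    print(sorted(scrubbed_env(sample)))
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when starting the skill, so the child process never sees the removed variables.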

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (7)

Tainted flow: 'request_body' from requests.get (line 148, network input) → requests.post (network output)

Medium
Category
Data Flow
Content
}

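    # Log message: "Submitting transcription task (URL mode, model: {model})..."
    print(f"[CloudASR] 提交转写任务 (URL模式, 模型: {model})...")
    resp = requests.post(submit_url, headers=headers, json=request_body, timeout=30)

    if resp.status_code != 200:
        # Error message: "Failed to submit task (HTTP {status_code}): {response text}"
        raise Exception(f"提交任务失败 (HTTP {resp.status_code}): {resp.text}")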
Confidence
83% confidence
Finding
resp = requests.post(submit_url, headers=headers, json=request_body, timeout=30)

Tp4

High
Category
MCP Tool Poisoning
Confidence
91% confidence
Finding
The documented purpose is narrower than the behavior described in the skill: it can fetch subtitles, send audio to a cloud provider, handle arbitrary audio inputs, and download/cache models from third-party mirrors, while claiming structured summarization that is not actually implemented. This mismatch weakens informed consent and security review because users and administrators may approve a seemingly limited Bilibili summarizer while actually granting broader data access and network behavior.

Intent-Code Divergence

Medium
Confidence
93% confidence
Finding
The README makes strong 'all local' and 'no external transcription service' claims near the top, but later documents an Alibaba DashScope/Paraformer cloud-transcription path. This is a security-relevant documentation inconsistency because users may make trust and data-handling decisions based on the false assumption that no audio/content ever leaves the host.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
The skill metadata says it handles Bilibili video input, but this function accepts arbitrary public audio URLs and causes remote fetching/transcription outside that scope. That broadens the attack surface and can enable misuse of the skill as a generic URL-processing bridge to a third-party service.
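A scope check on the input URL would narrow this surface. The sketch below is illustrative only and is not taken from the skill: the function name `is_bilibili_url` and the host allowlist are assumptions, and the real set of hosts the skill must reach may differ.

```python
from urllib.parse import urlparse

# Assumed allowlist for illustration; adjust to the hosts the skill really needs.
ALLOWED_HOSTS = {"bilibili.com", "www.bilibili.com", "b23.tv"}


def is_bilibili_url(url: str) -> bool:
    """Accept only HTTPS URLs whose host is an allowlisted Bilibili domain."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    host = (parsed.hostname or "").lower()
    # The leading dot in the suffix check rejects look-alikes such as
    # "evilbilibili.com" and "bilibili.com.evil.example".
    return host in ALLOWED_HOSTS or host.endswith(".bilibili.com")
```

Rejecting anything that fails this check before a request body is built would keep arbitrary audio URLs from being forwarded to the cloud service.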

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The cloud-transcription section explains how to configure API keys but does not prominently warn that audio or transcript-relevant content will be sent to a third-party service when that mode is used. In a transcription skill, this omission can lead to accidental disclosure of sensitive media, especially if users rely on earlier 'local processing' claims.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The cloud transcription path uploads downloaded audio to a remote service but does not clearly warn users that media content leaves the local environment. This creates privacy, confidentiality, and compliance risk, especially if the audio contains personal data, copyrighted material, or sensitive internal content.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
This module uploads local audio files to Alibaba Cloud and later retrieves transcript data, but the code provides no explicit consent or warning at the point of transmission. In a transcription skill, user content may contain sensitive information, so silent off-device transfer creates a meaningful privacy and compliance risk.
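A consent prompt at the point of transmission is one way to address the three warnings above. This is a minimal sketch under stated assumptions: `confirm_upload` is a hypothetical helper, not a function in the audited module, and the prompt wording is illustrative.

```python
def confirm_upload(path, provider="Alibaba DashScope", ask=input):
    """Ask before audio leaves the machine; default to refusing.

    `ask` is injectable so the prompt can be replaced in tests or
    in non-interactive runs.
    """
    answer = ask(f"Upload {path} to {provider} for transcription? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

Calling such a check immediately before the upload, and falling back to local Whisper mode when it returns `False`, keeps the off-device transfer an explicit choice.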

VirusTotal

66/66 vendors reported this skill as clean.

Static analysis

No suspicious patterns detected.