feishu-whisper-voice

Security checks across malware telemetry and agentic risk

Overview

This skill is a coherent Feishu voice transcription and TTS helper, but users should treat audio transcripts, package installation, and local files as sensitive.

Install this only where the Feishu bot or authorized user should access voice messages. Prefer an isolated virtual environment, preinstall pinned dependencies instead of relying on runtime pip installs, avoid third-party mirrors unless you trust them, and delete local transcript files or caches after processing sensitive audio.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
Findings (10)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
except ImportError:
        print("❌ faster-whisper is not installed, attempting to install...")
        import subprocess
        subprocess.check_call([sys.executable, "-m", "pip", "install", "faster-whisper"])
        
        # Retry
        from faster_whisper import WhisperModel
Confidence
93% confidence
Finding
subprocess.check_call([sys.executable, "-m", "pip", "install", "faster-whisper"])

subprocess module call

Medium
Category
Dangerous Code Execution
Content
try:
    # Install the dependency (if not already installed)
    subprocess.run([sys.executable, "-m", "pip", "install", "faster-whisper", "--quiet"], check=True)
    
    from faster_whisper import WhisperModel
Confidence
94% confidence
Finding
subprocess.run([sys.executable, "-m", "pip", "install", "faster-whisper", "--quiet"], check=True)

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
A transcription utility should primarily read audio and produce text, but this code also installs software dynamically when imports fail. That added execution path increases attack surface because package installation can trigger network access and execution of dependency code, which is risky in an agent skill context where users may not expect environment modification.

Context-Inappropriate Capability

Medium
Confidence
97% confidence
Finding
A transcription utility should not need to gain package-install capability during normal operation, because that enables retrieval and execution of unreviewed code from external package sources. In this context, the runtime pip install is especially risky because the script immediately imports and executes the newly installed dependency, turning a simple offline processing task into a supply-chain execution path.
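
One way to remove the runtime install path described in the findings above is to fail fast with a clear message when the dependency is absent. The following is a minimal sketch, not the skill's actual code; `require_module` and its `install_hint` parameter are hypothetical names introduced for illustration.

```python
import importlib.util


def require_module(module_name: str, install_hint: str) -> None:
    """Raise a clear error if a top-level module is missing.

    Nothing is installed here: the caller is expected to provision
    dependencies ahead of time (for example from a pinned requirements file).
    """
    if importlib.util.find_spec(module_name) is None:
        raise RuntimeError(
            f"Missing dependency '{module_name}'. Install it before running, "
            f"for example: {install_hint}"
        )


# Hypothetical usage before loading the model:
# require_module("faster_whisper", "pip install -r requirements.txt")
```

With this shape the script never calls pip itself, so a missing package shows up as an explicit error rather than a silent network fetch followed by an import.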

Vague Triggers

Medium
Confidence
90% confidence
Finding
The trigger conditions are overly broad and include generic conversational cues like talking/speaking-related terms, which can cause the skill to activate on unintended user messages. In a voice-processing skill, accidental invocation is risky because it may lead to downloading, transcribing, and replying to user audio without a clearly intentional opt-in for that specific capability.
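
A narrower trigger could require both an audio attachment and an explicit command, rather than matching generic conversational words. The sketch below assumes a hypothetical command list and a hypothetical `should_activate` helper; the skill's real trigger configuration is not shown in this report.

```python
# Hypothetical explicit commands, chosen only for illustration.
EXPLICIT_COMMANDS = ("/transcribe", "/voice-reply")


def should_activate(message_text: str, has_audio_attachment: bool) -> bool:
    """Return True only for an explicit command sent together with audio."""
    if not has_audio_attachment:
        return False
    text = message_text.strip().lower()
    # Match the command exactly or followed by a space, so that
    # "/transcribed" or ordinary chat about "talking" does not activate it.
    return any(text == cmd or text.startswith(cmd + " ") for cmd in EXPLICIT_COMMANDS)
```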

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The skill describes a workflow that downloads user audio, performs speech-to-text, and generates a spoken reply, but it does not disclose retention, transmission, or processing boundaries for potentially sensitive voice data. Voice content can contain biometric, personal, or confidential information, so lack of notice and data-handling constraints creates privacy and compliance risk.

Missing User Warnings

Low
Confidence
86% confidence
Finding
The documentation recommends setting HF_ENDPOINT to a third-party mirror without warning that model download traffic and related metadata will be redirected to an external service. This can expose environment/network metadata and introduces supply-chain and trust concerns if the mirror is compromised or not approved.
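
If a mirror is used at all, the endpoint can be checked against an approved list before any model download starts. This is a sketch under stated assumptions: `APPROVED_HOSTS` and `resolve_hf_endpoint` are hypothetical names, and the default of allowing only `huggingface.co` is an example policy, not something the skill documents.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse

# Assumption: only the official host is approved unless an operator adds more.
APPROVED_HOSTS = {"huggingface.co"}


def resolve_hf_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured HF_ENDPOINT, rejecting unapproved hosts."""
    env = os.environ if env is None else env
    endpoint = env.get("HF_ENDPOINT", "https://huggingface.co")
    parsed = urlparse(endpoint)
    if parsed.scheme != "https" or parsed.hostname not in APPROVED_HOSTS:
        raise ValueError(
            f"HF_ENDPOINT {endpoint!r} is not an approved HTTPS host; "
            "refusing to download models through it."
        )
    return endpoint
```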

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The script silently appends export lines to `~/.bashrc` and `~/.zshrc`, creating persistent changes to the user's shell configuration. Even though it writes only a placeholder value here, modifying startup files without a clear warning and informed consent is risky behavior because it normalizes persistence mechanisms and can break user environments or be repurposed for more harmful payloads.
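
A consent-gated alternative would print the line for the user to add manually and touch the startup file only when explicitly asked. The sketch below is illustrative; `add_export_line` and its `user_consented` flag are hypothetical and do not come from the skill.

```python
from pathlib import Path


def add_export_line(rc_file: Path, line: str, user_consented: bool) -> bool:
    """Append `line` to `rc_file` only with consent; return True if written."""
    if not user_consented:
        # Tell the user what would change instead of changing it silently.
        print(f"Not modifying {rc_file}. To apply manually, add:\n  {line}")
        return False
    existing = rc_file.read_text(encoding="utf-8") if rc_file.exists() else ""
    if line in existing.splitlines():
        return False  # Already present, avoid duplicate entries.
    with rc_file.open("a", encoding="utf-8") as fh:
        if existing and not existing.endswith("\n"):
            fh.write("\n")
        fh.write(line + "\n")
    return True
```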

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The code automatically installs a package without prior confirmation, making a security-sensitive environment change on behalf of the user. In an agent or automation setting, this is dangerous because it may bypass organizational controls, create non-reproducible environments, and perform unexpected network operations without informed consent.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The script writes transcribed speech content to /tmp/voice_result.txt without warning or consent, which can expose sensitive data because transcriptions often contain private conversations, credentials, or personal information. Temporary directories are commonly shared, monitored, or left behind longer than expected, increasing the chance of unintended disclosure to other local users or processes.
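
Instead of a fixed path such as /tmp/voice_result.txt, the transcript could go to a uniquely named file created with `tempfile.mkstemp`, which on POSIX systems creates the file readable and writable only by the creating user, and then be deleted once it has been read. The helper names below are hypothetical and this is a sketch, not the skill's implementation.

```python
import os
import tempfile


def write_transcript_privately(text: str) -> str:
    """Write `text` to a new, unpredictably named temp file and return its path."""
    fd, path = tempfile.mkstemp(prefix="voice_result_", suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(text)
    return path


def read_and_delete(path: str) -> str:
    """Read the transcript back, then remove the file even if reading fails."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    finally:
        os.remove(path)
```

The caller passes the returned path to whatever consumes the transcript and calls `read_and_delete` as the last step, so no transcript is left behind in a shared directory.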

VirusTotal

61/61 vendors flagged this skill as clean.
