feishu-whisper-voice

Security checks across malware telemetry and agentic risk

Overview

This skill is a coherent Feishu voice transcription and TTS helper, but users should treat audio transcripts, package installation, and local files as sensitive.

Install this only where the Feishu bot or authorized user should access voice messages. Prefer an isolated virtual environment, preinstall pinned dependencies instead of relying on runtime pip installs, avoid third-party mirrors unless you trust them, and delete local transcript files or caches after processing sensitive audio.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import

Findings (10)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: except ImportError: print("❌ faster-whisper 未安装，正在尝试安装...") import subprocess subprocess.check_call([sys.executable, "-m", "pip", "install", "faster-whisper"]) # 重试 from faster_whisper import WhisperModel
Confidence: 93%
Finding: subprocess.check_call([sys.executable, "-m", "pip", "install", "faster-whisper"])

subprocess module call

Medium

Category: Dangerous Code Execution
Content: try: # 安装依赖（如果未安装） subprocess.run([sys.executable, "-m", "pip", "install", "faster-whisper", "--quiet"], check=True) from faster_whisper import WhisperModel
Confidence: 94%
Finding: subprocess.run([sys.executable, "-m", "pip", "install", "faster-whisper", "--quiet"], check=True)

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: A transcription utility should primarily read audio and produce text, but this code also installs software dynamically when imports fail. That added execution path increases attack surface because package installation can trigger network access and execution of dependency code, which is risky in an agent skill context where users may not expect environment modification.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: A transcription utility should not need to gain package-install capability during normal operation, because that enables retrieval and execution of unreviewed code from external package sources. In this context, the runtime pip install is especially risky because the script immediately imports and executes the newly installed dependency, turning a simple offline processing task into a supply-chain execution path.
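One way to remove the runtime install path described in this finding is to fail fast when the dependency is missing and leave installation to the operator. The sketch below is illustrative only: `require_module` is a hypothetical helper, not part of the scanned skill, and the install hint is whatever pinned command the operator has approved.

```python
import importlib


def require_module(name: str, install_hint: str):
    """Import a module or raise a clear error. Never installs anything."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # Hypothetical policy: stop here and let the operator install a
        # reviewed, pinned version instead of calling pip at runtime.
        raise RuntimeError(
            f"Missing dependency '{name}'. Install it ahead of time, "
            f"for example: {install_hint}"
        ) from exc
```

A caller would use it as `require_module("faster_whisper", "pip install faster-whisper==<pinned version>")` in place of the try/except block that shells out to pip.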

Vague Triggers

Medium

Confidence: 90%
Finding: The trigger conditions are overly broad and include generic conversational cues like talking/speaking-related terms, which can cause the skill to activate on unintended user messages. In a voice-processing skill, accidental invocation is risky because it may lead to downloading, transcribing, and replying to user audio without a clearly intentional opt-in for that specific capability.
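A narrower trigger could require an explicit command rather than matching conversational keywords. The following is a minimal sketch under that assumption; `should_activate` and the `/transcribe` command are hypothetical names chosen for illustration and do not come from the skill.

```python
def should_activate(text: str, command: str = "/transcribe") -> bool:
    """Activate only when the message starts with an explicit command.

    Generic words such as "talk" or "speak" elsewhere in the message
    never trigger the skill.
    """
    tokens = text.strip().lower().split()
    return bool(tokens) and tokens[0] == command
```

With this rule, "/transcribe the last voice note" activates the skill, while "I was talking to him yesterday" does not.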

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill describes a workflow that downloads user audio, performs speech-to-text, and generates a spoken reply, but it does not disclose retention, transmission, or processing boundaries for potentially sensitive voice data. Voice content can contain biometric, personal, or confidential information, so lack of notice and data-handling constraints creates privacy and compliance risk.

Missing User Warnings

Low

Confidence: 86%
Finding: The documentation recommends setting HF_ENDPOINT to a third-party mirror without warning that model download traffic and related metadata will be redirected to an external service. This can expose environment/network metadata and introduces supply-chain and trust concerns if the mirror is compromised or not approved.
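A deployment that wants to guard against an unapproved mirror could validate HF_ENDPOINT before any model download starts. This is a sketch, not the skill's behavior: `check_hf_endpoint` and `APPROVED_HOSTS` are hypothetical, and the allowlist contents are an assumption each operator would set for themselves.

```python
import os
from urllib.parse import urlparse

# Assumption: only the official Hugging Face host is approved by default.
APPROVED_HOSTS = frozenset({"huggingface.co"})


def check_hf_endpoint(env=None, approved=APPROVED_HOSTS) -> str:
    """Return the host that downloads would use, or raise if unapproved."""
    env = os.environ if env is None else env
    endpoint = env.get("HF_ENDPOINT")
    if not endpoint:
        # No override set, so downloads go to the default host.
        return "huggingface.co"
    host = urlparse(endpoint).hostname  # None if the URL has no scheme
    if host not in approved:
        raise ValueError(
            f"HF_ENDPOINT points at {endpoint!r}, which is not an approved host"
        )
    return host
```

Calling `check_hf_endpoint()` at startup turns a silent redirect into an explicit error that the operator has to resolve by extending the allowlist.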

Missing User Warnings

Medium

Confidence: 88%
Finding: The script silently appends export lines to `~/.bashrc` and `~/.zshrc`, creating persistent changes to the user's shell configuration. Even though it writes only a placeholder value here, modifying startup files without a clear warning and informed consent is risky behavior because it normalizes persistence mechanisms and can break user environments or be repurposed for more harmful payloads.
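A less intrusive alternative is to print the export line for the user to add, and to write to a startup file only after explicit consent. The sketch below assumes that approach; `shell_export_line` and `append_export` are hypothetical helpers, and the variable name and value are supplied by the caller because the report does not name them.

```python
from pathlib import Path


def shell_export_line(name: str, value: str) -> str:
    """Build the export line that would be added to a shell startup file."""
    return f'export {name}="{value}"'


def append_export(rc_path: Path, name: str, value: str, consent: bool) -> bool:
    """Append an export line only with explicit consent.

    Returns True if the file was modified, False otherwise.
    """
    line = shell_export_line(name, value)
    if not consent:
        # Show the user what to add instead of changing their files.
        print(f"Not modifying {rc_path}. To set it yourself, add:\n  {line}")
        return False
    existing = rc_path.read_text() if rc_path.exists() else ""
    if line in existing:
        return False  # already present, avoid duplicate entries
    with rc_path.open("a") as fh:
        fh.write(f"\n{line}\n")
    return True
```

The `consent` flag would come from an interactive prompt or an explicit command-line option, so the default path leaves `~/.bashrc` and `~/.zshrc` untouched.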

Missing User Warnings

Medium

Confidence: 92%
Finding: The code automatically installs a package without prior confirmation, making a security-sensitive environment change on behalf of the user. In an agent or automation setting, this is dangerous because it may bypass organizational controls, create non-reproducible environments, and perform unexpected network operations without informed consent.
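To keep environments reproducible without installing anything automatically, a startup check can compare installed versions against the pins the operator approved and report mismatches. This is a sketch under that assumption: `check_pinned` is a hypothetical helper, and the version strings passed to it are placeholders chosen by the operator, not values taken from the skill.

```python
from importlib import metadata


def check_pinned(requirements: dict, get_version=metadata.version) -> list:
    """Compare installed package versions against expected pins.

    Returns a list of human-readable problems; an empty list means every
    package is installed at the expected version. Nothing is installed.
    """
    problems = []
    for name, wanted in requirements.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {wanted})")
            continue
        if found != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems
```

The `get_version` parameter exists only so the check can be exercised without touching the real environment; in normal use the default `importlib.metadata.version` lookup applies.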

Missing User Warnings

Medium

Confidence: 96%
Finding: The script writes transcribed speech content to /tmp/voice_result.txt without warning or consent, which can expose sensitive data because transcriptions often contain private conversations, credentials, or personal information. Temporary directories are commonly shared, monitored, or left behind longer than expected, increasing the chance of unintended disclosure to other local users or processes.
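Instead of a fixed, predictable path under /tmp, the transcript could go to a uniquely named file that only the current user can read and that is deleted once processing finishes. The sketch below illustrates this with the standard library; `private_transcript_file` is a hypothetical helper, not code from the skill.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def private_transcript_file(text: str):
    """Write text to a private temp file and delete it when the block exits."""
    # mkstemp creates a uniquely named file; on POSIX it is readable and
    # writable only by the creating user.
    fd, path = tempfile.mkstemp(prefix="transcript_", suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
        yield path
    finally:
        # Remove the transcript even if the consumer raised an exception.
        if os.path.exists(path):
            os.remove(path)
```

A consumer would wrap its processing in `with private_transcript_file(transcript) as path:` so the file never outlives the step that needs it.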

VirusTotal

61/61 vendors flagged this skill as clean.
