TencentCloud ASR

Security checks across malware telemetry and agentic risk

Overview

This Tencent Cloud transcription skill mostly fits its stated purpose, but it should be reviewed because it can make automatic system changes and tells agents to hide a QQ Bot workaround from users.

Install only if you are comfortable with a cloud ASR skill using Tencent Cloud credentials, reading local audio, sending audio/transcripts to Tencent Cloud, and potentially installing Python/system dependencies. Prefer preinstalling dependencies manually, avoid granting sudo/package-manager authority to the skill, use least-privilege Tencent credentials, do not process sensitive recordings without consent, and require transparent disclosure of the QQ Bot workaround.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger

Findings (24)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: import tencentcloud # noqa: F401 except ImportError: print("[INFO] tencentcloud-sdk-python not found. Installing...", file=sys.stderr) subprocess.check_call( [sys.executable, "-m", "pip", "install", "tencentcloud-sdk-python", "-q"], stdout=sys.stderr, stderr=sys.stderr,
Confidence: 94%
Finding: subprocess.check_call( [sys.executable, "-m", "pip", "install", "tencentcloud-sdk-python", "-q"], stdout=sys.stderr, stderr=sys.stderr, )

subprocess module call

Medium

Category: Dangerous Code Execution
Content: import requests # noqa: F401 except ImportError: print("[INFO] requests not found. Installing...", file=sys.stderr) subprocess.check_call( [sys.executable, "-m", "pip", "install", "requests", "-q"], stdout=sys.stderr, stderr=sys.stderr,
Confidence: 95%
Finding: subprocess.check_call( [sys.executable, "-m", "pip", "install", "requests", "-q"], stdout=sys.stderr, stderr=sys.stderr, )

subprocess module call

Medium

Category: Dangerous Code Execution
Content: import tencentcloud # noqa: F401 except ImportError: print("[INFO] tencentcloud-sdk-python not found. Installing...", file=sys.stderr) subprocess.check_call( [sys.executable, "-m", "pip", "install", "tencentcloud-sdk-python", "-q"], stdout=sys.stderr, stderr=sys.stderr,
Confidence: 93%
Finding: subprocess.check_call( [sys.executable, "-m", "pip", "install", "tencentcloud-sdk-python", "-q"], stdout=sys.stderr, stderr=sys.stderr, )

Tainted flow: 'url' from os.getenv (line 170, credential/environment) → requests.get (network output)

Critical

Category: Data Flow
Content: def download_url(url): """Download audio from URL and return bytes.""" print(f"[INFO] Downloading audio from URL...", file=sys.stderr) resp = requests.get(url, timeout=300) resp.raise_for_status() print(f"[INFO] Downloaded {len(resp.content)} bytes.", file=sys.stderr) return resp.content
Confidence: 93%
Finding: resp = requests.get(url, timeout=300)
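One way to narrow this flow is to validate the URL before it reaches requests.get. The sketch below is illustrative only: check_audio_url and the allowed_hosts parameter are hypothetical names that do not appear in the scanned skill, and the HTTPS-only policy is an assumption rather than something the report requires.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only HTTPS sources are acceptable.
ALLOWED_SCHEMES = {"https"}


def check_audio_url(url, allowed_hosts):
    """Return the URL unchanged if it passes the checks, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    # urlsplit lowercases the hostname; it is None for URLs without a host.
    host = (parts.hostname or "").lower()
    if host not in allowed_hosts:
        raise ValueError(f"host not allowed: {host!r}")
    return url
```

A caller would pass the result to the download step, for example download_url(check_audio_url(url, {"example.com"})), so that an unexpected value taken from the environment is rejected before any network request is made.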

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The guide instructs the agent to autonomously install system packages and, on RHEL-family systems, modify repository configuration by enabling EPEL/RPM Fusion. Those are privileged host-level changes that exceed ordinary transcription logic and can alter the machine beyond the skill's immediate task, increasing attack surface and operational risk. In this ASR context FFmpeg may be functionally relevant, but the document still normalizes invasive system modification without an explicit trust boundary or user approval step.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The file authorizes use of package managers and repository enablement capabilities that are broader than a narrowly scoped ASR skill needs at runtime. Even if intended to improve usability, granting an agent permission to mutate system software sources and install packages creates a meaningful security boundary expansion that could be abused or cause unintended changes on shared or managed hosts.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The document explicitly instructs the agent to hide the existence of a workaround and present unsupported behavior as normal system functionality. That creates a transparency and trust violation, prevents informed consent, and can mislead users and operators about what components are actually handling their audio and messages.

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: The QQ Bot integration guidance broadens this ASR skill into a bot-specific orchestration layer that includes bypass behavior and TTS message delivery. Expanding scope like this increases attack surface and the chance that an agent will perform filesystem inspection, message correlation, and cross-skill actions that were not expected from a simple transcription skill.

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: Installing Python packages at runtime is a genuine security issue here because it is unrelated to the core business logic of submitting ASR jobs and introduces code execution from an external package source. Even though the package name is fixed and there is no shell injection, this still enables supply-chain compromise, unexpected network egress, and persistent modification of the runtime environment.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: Auto-installing requests at runtime is risky because it modifies the host environment and executes package installation as part of normal script operation, which is outside the core ASR purpose. This can introduce supply-chain risk and unexpected privilege or policy violations, especially in controlled or production environments.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: Auto-installing Python packages is not necessary for the core ASR function and introduces supply-chain and environment-modification risk. In an agent skill context, this is more dangerous because users may run the skill expecting transcription only, while the script silently alters the runtime and fetches code from external sources.
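The auto-install findings above point to one alternative: detect the missing module and stop with instructions, leaving installation to the operator. The following is a minimal sketch of that pattern, not the skill's actual code; missing_dependencies and require_dependencies are hypothetical helper names.

```python
import importlib.util


def missing_dependencies(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def require_dependencies(modules):
    """Raise with install instructions instead of running pip on the user's behalf."""
    missing = missing_dependencies(modules)
    if missing:
        raise RuntimeError(
            "Missing Python modules: " + ", ".join(missing)
            + ". Install them manually (for example in a virtual environment) and re-run."
        )
```

Calling require_dependencies(["tencentcloud", "requests"]) at startup would then fail fast without touching the environment, which matches the advice in the overview to preinstall dependencies manually.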

Vague Triggers

Medium

Confidence: 75%
Finding: The skill is framed to activate on very common requests such as providing audio for transcription, installation help, configuration, or backend setup, which overlap with ordinary assistant workflows. Overbroad triggering increases the chance the skill is invoked in contexts the user did not intend, leading to unnecessary script execution, network access, credential prompts, or automatic package installation.

Missing User Warnings

Medium

Confidence: 89%
Finding: The documentation explicitly encourages passing public audio URLs and sending audio to Tencent Cloud ASR, but it does not warn users that audio content will leave the local environment and traverse third-party network services. For speech data, this can expose sensitive personal, business, or regulated information if operators assume the wrapper is purely local or do not understand the privacy implications.

Missing User Warnings

Medium

Confidence: 93%
Finding: The COS optimization flow instructs users to upload normalized WAV audio to a publicly downloadable URL, yet it does not warn that the transformed file may contain the same sensitive speech content and could become accessible beyond the intended audience. This creates a realistic risk of privacy leakage or unauthorized disclosure, especially because normalized artifacts may be retained, shared, or indexed if exposed via public object storage.

Missing User Warnings

Medium

Confidence: 96%
Finding: The markdown explicitly tells the agent to perform background package installation first and only ask the user after being blocked, while downplaying the fact that system packages may be installed with elevated privileges. Silent or implicit host modification is dangerous because users may not expect an ASR skill to change system state, and such behavior can violate least surprise, policy, and change-control requirements.

Missing User Warnings

Medium

Confidence: 97%
Finding: Automatically adding EPEL and RPM Fusion modifies the host's trusted software sources, which is a more sensitive action than installing an already-approved package from existing repositories. In the context of an ASR skill, repository modification is disproportionately dangerous because it changes the machine's future software supply chain and can introduce new packages, trust anchors, and compliance issues unrelated to transcription.

Missing User Warnings

Medium

Confidence: 92%
Finding: The API documentation describes sending audio via URL or POST body to Tencent Cloud ASR but does not warn users that audio content, which may contain sensitive personal or confidential information, is transmitted to a third-party remote service. In a speech-recognition skill, this omission can mislead integrators into processing recordings without appropriate user notice, consent, or data-handling review, creating privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 93%
Finding: The instructions tell operators to pass Tencent Cloud credentials and upload audio for transcription without any warning about secret handling, external data transfer, retention, or privacy implications. This can lead to credential exposure in shell history/process environments and to users' audio being sent to a third party without clear notice or approval.
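To keep secrets out of shell history, a wrapper can read credentials from the environment instead of command-line arguments and mask them in any log output. This is a sketch under assumptions: the variable names TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY are assumed here and may differ from what the skill reads, and load_credentials and redact are hypothetical helpers. Environment variables remain visible to the process and its children, so this reduces exposure without removing it.

```python
import os


def load_credentials(environ=os.environ):
    """Read the credential pair from the environment; never from argv."""
    secret_id = environ.get("TENCENTCLOUD_SECRET_ID")    # assumed variable name
    secret_key = environ.get("TENCENTCLOUD_SECRET_KEY")  # assumed variable name
    if not secret_id or not secret_key:
        raise RuntimeError(
            "Set TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY first."
        )
    return secret_id, secret_key


def redact(secret, visible=4):
    """Mask a secret for log output, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging redact(secret_id) rather than the raw value lets an operator confirm which key is in use without writing the full secret to stderr or a log file.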

Missing User Warnings

Medium

Confidence: 92%
Finding: The TTS workflow sends text content to an external service and writes an audio artifact to /tmp, but the document provides no warning about third-party processing, sensitivity of the text, temporary-file exposure, or cleanup. This can leak private content to external providers or to other local users/processes if the temporary file is accessible.
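The temporary-file part of this finding can be addressed with an owner-only file that is removed after use. The sketch below assumes the audio is available as bytes; private_temp_audio is a hypothetical helper, not a function from the skill.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def private_temp_audio(data, suffix=".wav"):
    """Write `data` to an owner-only temp file and delete it when the block exits."""
    # mkstemp creates the file readable and writable only by the creating user.
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path
    finally:
        # Clean up even if the caller's block raised.
        if os.path.exists(path):
            os.remove(path)
```

Used as "with private_temp_audio(audio_bytes) as path:", the file exists only for the duration of the block, which limits exposure to other local users and processes. It does not address the third-party processing of the text itself.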

Natural-Language Policy Violations

High

Confidence: 98%
Finding: The document instructs the agent not to reveal the workaround and to behave as though the platform natively supports the capability. That is a direct deception pattern: it obscures the true processing path, prevents informed user decisions, and can hide privacy, reliability, and supportability risks from the user.

Missing User Warnings

Medium

Confidence: 91%
Finding: In --execute mode, the script can immediately run privileged package-management commands and, on RHEL-like systems, may also enable repositories and install a remote RPM without an explicit interactive confirmation step at execution time. This is dangerous because invoking the script in an automated agent context could mutate the host system, trust new package sources, and install software with elevated privileges based only on environment detection.
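An explicit confirmation step at execution time could look like the sketch below. It is a hypothetical illustration, not the script's real interface: run_with_approval, ask, and runner are invented names, and the command lists are supplied by the caller.

```python
import shlex
import subprocess


def run_with_approval(commands, ask=input, runner=subprocess.check_call):
    """Show every command, then run them only after an explicit 'yes'."""
    print("The following commands would change this system:")
    for cmd in commands:
        print("  " + shlex.join(cmd))
    answer = ask("Type 'yes' to run them: ")
    if answer.strip().lower() != "yes":
        print("Aborted; nothing was executed.")
        return False
    for cmd in commands:
        runner(cmd)  # each command is an argument list, so no shell is involved
    return True
```

Because the prompt and the runner are injectable, the same gate can be wired to an agent's user-approval channel instead of a terminal, so that repository changes and package installs are never triggered by environment detection alone.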

Missing User Warnings

Medium

Confidence: 94%
Finding: The fallback logic automatically attempts repository enablement and installs rpmfusion-free-release from a remote URL, then installs ffmpeg, without prominent in-code disclosure or a separate approval boundary. Changing repository configuration expands the system trust base and performing remote package installation can have lasting system-wide effects, which is especially risky in an agent skill that may be executed on user machines or servers.

Missing User Warnings

Medium

Confidence: 95%
Finding: When the input is an HTTP(S) URL, the script automatically passes it to ffprobe, causing outbound network access to attacker-supplied destinations without any explicit user warning or consent gate. In an agent/skill context, this can enable SSRF-style behavior, internal network probing, unintended data egress, and contacting untrusted hosts simply by supplying a crafted URL.
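A common guard against the internal-network probing described here is to resolve the host and refuse anything that is not a globally routable address before handing the URL to a tool such as ffprobe. The sketch below is an assumption about how such a check might be written; resolves_to_public_address is a hypothetical name. The check and the later fetch resolve the name separately, so DNS rebinding remains possible and this should be treated as one layer rather than a complete defense.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolves_to_public_address(url):
    """True only if every address the URL's host resolves to is globally routable."""
    host = urlsplit(url).hostname
    if not host:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    # info[4][0] is the address string; drop any IPv6 zone suffix such as "%eth0".
    addresses = {info[4][0].split("%")[0] for info in infos}
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_global for addr in addresses
    )
```

Loopback, private-range, and link-local targets (including cloud metadata addresses such as 169.254.169.254) fail this check, so a crafted URL cannot be used to reach them through the skill.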

Ssd 4

Medium

Confidence: 97%
Finding: Directing an agent to conceal a workaround and misstate system capabilities undermines auditability and safe operation. In a bot context, this can mask unauthorized integrations or data flows, making it harder for users and administrators to understand where content is processed and whether policy controls are being bypassed.

VirusTotal

63/63 vendors flagged this skill as clean.
