Security audit

xiaomi-tts-chapters

Security checks across malware telemetry and agentic risk

Overview

The skill largely does what it discloses: it converts text to speech through Xiaomi MiMo. It still needs review because its shell wrapper can execute user-controlled command text, and because callers can redirect the API key and chapter content to an arbitrary API endpoint.

Install only if you are comfortable sending the selected chapter text to an external TTS provider and you can restrict use to trusted chapter directories and a trusted API endpoint. Prefer invoking the Python script directly instead of run.sh. Do not pass untrusted paths or option values, avoid custom --base-url values unless you control the server, and treat the API key as sensitive.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Findings (8)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
f.write(f"file '{safe_path}'\n")
            
            try:
                result = subprocess.run(
                    [
                        'ffmpeg', '-y', '-f', 'concat', '-safe', '0',
                        '-i', concat_list_path,
Confidence
88% confidence
Finding
result = subprocess.run( [ 'ffmpeg', '-y', '-f', 'concat', '-safe', '0', '-i', concat_list_path,

Lp3

Medium
Category
MCP Least Privilege
Confidence
90% confidence
Finding
The skill instructs users to read files, write MP3 outputs, access an API key from environment variables, and invoke shell commands such as pip, brew, and ffmpeg, but it does not declare corresponding permissions. This creates a transparency and governance gap: an agent may perform sensitive operations users or policy layers did not explicitly approve, increasing risk of unintended file access, command execution, or secret handling.

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
Allowing callers to override the API base URL means the user's API key and chapter content can be sent to any arbitrary endpoint, not just the expected Xiaomi MiMo service. In this skill context, that increases the risk of credential exfiltration and sensitive text leakage if a malicious or mistaken URL is supplied.
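
One way to narrow this risk is to validate the base URL against an allowlist before any request is made. The following is a minimal sketch, not the skill's own code: the function name validate_base_url and the host api.example.com are hypothetical placeholders, and the real Xiaomi MiMo host would need to be filled in by whoever deploys the skill.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: replace with the host(s) you actually trust.
ALLOWED_HOSTS = frozenset({"api.example.com"})


def validate_base_url(base_url: str, allowed_hosts=ALLOWED_HOSTS) -> str:
    """Return base_url unchanged if it is HTTPS and targets an allowed host.

    Raises ValueError otherwise, so the API key is never sent to an
    unexpected endpoint.
    """
    parts = urlsplit(base_url)
    if parts.scheme != "https":
        raise ValueError(f"base URL must use https: {base_url!r}")
    if parts.username or parts.password:
        # Reject URLs such as https://trusted@evil.host/ outright.
        raise ValueError("base URL must not embed credentials")
    host = (parts.hostname or "").lower()
    if host not in allowed_hosts:
        raise ValueError(f"host {host!r} is not in the allowlist")
    return base_url
```

Calling validate_base_url on the value of --base-url before constructing the API client would make a mistaken or malicious URL fail fast instead of receiving the key.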

Vague Triggers

Medium
Confidence
89% confidence
Finding
The skill advertises broad, everyday-language trigger phrases such as '文字转语音' ("text to speech") and 'TTS合成' ("TTS synthesis"), which can cause the agent to invoke this skill in situations beyond the narrow intended use of chapter-to-audiobook conversion. Because the skill handles files and API-backed synthesis, overbroad activation increases the chance of unintended execution, surprise file processing, or accidental transmission of user content to an external TTS provider.

Vague Triggers

Medium
Confidence
87% confidence
Finding
The trigger phrases include broad everyday requests such as '文字转语音' ("text to speech") and 'TTS合成' ("TTS synthesis"), which can match many benign conversations and cause the skill to activate unexpectedly. Unintended activation is risky here because the skill can read local files, handle API keys, and launch tooling, so accidental invocation could lead to unnecessary data processing or command execution.

Missing User Warnings

Medium
Confidence
85% confidence
Finding
This code sends user-provided chapter text to an external Xiaomi MiMo TTS API, but the file contains no explicit disclosure, consent flow, or data-handling notice at the point of transmission. For a skill that may process full novels or user-authored text, this creates a real privacy and data-governance risk because sensitive or copyrighted content could be transmitted off-device unexpectedly.
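
A disclosure at the point of transmission could be as small as a confirmation prompt. The sketch below assumes a hypothetical helper named confirm_upload that a script would call before each request; it is an illustration of the idea, not something the audited skill contains. The ask parameter defaults to input and exists so the prompt can be replaced in non-interactive runs.

```python
def confirm_upload(chapter: str, n_chars: int, endpoint: str, ask=input) -> bool:
    """Tell the user what is about to leave the machine and ask for consent.

    Returns True only on an explicit "y" or "yes"; anything else, including
    an empty answer, is treated as a refusal.
    """
    prompt = (
        f"About to send {n_chars} characters from '{chapter}' "
        f"to {endpoint}. Continue? [y/N] "
    )
    answer = ask(prompt)
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" means an accidental Enter keypress does not transmit a chapter.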

Unvalidated Output Injection

High
Category
Output Handling
Content
f.write(f"file '{safe_path}'\n")
            
            try:
                result = subprocess.run(
                    [
                        'ffmpeg', '-y', '-f', 'concat', '-safe', '0',
                        '-i', concat_list_path,
Confidence
86% confidence
Finding
subprocess.run( [ 'ffmpeg', '-y', '-f', 'concat', '-safe', '0', '-i', concat_list_path, '-c', 'copy', # 直接拷
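
The excerpt does not show how safe_path is derived, so whether the concat list is already escaped correctly cannot be confirmed from this report. As an illustration of what safe escaping might look like, the sketch below uses a hypothetical helper, concat_entry, that builds one line of an ffmpeg concat list. It rejects newlines, which could otherwise inject extra directives, and escapes single quotes using ffmpeg's quoting rule of closing the quote, adding an escaped quote, and reopening it.

```python
def concat_entry(path: str) -> str:
    """Build one `file '...'` line for an ffmpeg concat list.

    Hypothetical helper for illustration; not taken from the audited skill.
    """
    if "\n" in path or "\r" in path:
        # A newline would start a new directive in the list file.
        raise ValueError("path must not contain line breaks")
    # Inside single quotes ffmpeg treats everything literally, so a quote
    # is written as: close quote, backslash-escaped quote, reopen quote.
    escaped = path.replace("'", "'\\''")
    return f"file '{escaped}'\n"
```

Because the command passes -safe 0, ffmpeg's own path restrictions for the concat demuxer are turned off, which makes this kind of input check on the list file more important.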

Unpinned Dependencies

Low
Category
Supply Chain
Content
openai>=1.0.0
Confidence
93% confidence
Finding
openai>=1.0.0
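
A lower bound such as >= accepts any future release, whereas an exact == pin does not. The sketch below is a hypothetical check, is_pinned, that a review script could run over requirements lines to tell the two apart; the version 1.2.3 in the usage note is a placeholder, not a recommended release.

```python
import re

# Matches "name==version" (optionally with extras), followed only by
# whitespace, an environment marker (;) or a comment (#).
# Wildcards such as "==1.*" are deliberately not counted as pinned.
_PINNED = re.compile(
    r"^\s*[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^\s;#*]+(?=\s*(;|#|$))"
)


def is_pinned(requirement_line: str) -> bool:
    """Return True if the requirements line pins one exact version."""
    return _PINNED.match(requirement_line) is not None
```

With this check, is_pinned("openai>=1.0.0") is False, while is_pinned("openai==1.2.3") is True.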

VirusTotal

64/64 vendors flagged this skill as clean.