小野 Voice System

Security checks across malware telemetry and agentic risk

Overview

This is a coherent text-to-speech skill, but non-Chinese text is automatically sent to the Edge-TTS cloud service for synthesis.

Install only if you are comfortable with non-Chinese text being sent to Microsoft Edge-TTS for speech generation and generated audio remaining on disk under the OpenClaw outputs folder. Use it mainly for non-sensitive text, delete generated audio when no longer needed, and install ffmpeg/edge-tts from trusted sources.
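The advice to delete generated audio can be automated. The following is a minimal sketch, not part of the skill: the function name `delete_old_audio`, the suffix list, and the idea of passing the outputs folder as a parameter are all assumptions, since the report does not give the exact path or the audio format the skill writes.

```python
import time
from pathlib import Path
from typing import List, Optional

# Assumed audio extensions; the report does not say which format the skill writes.
AUDIO_SUFFIXES = {".mp3", ".wav", ".ogg"}


def delete_old_audio(outputs_dir: Path, max_age_seconds: float,
                     now: Optional[float] = None) -> List[Path]:
    """Delete audio files in outputs_dir older than max_age_seconds.

    Returns the list of removed paths. Only direct children of the
    folder are checked; non-audio files are left alone.
    """
    now = time.time() if now is None else now
    removed: List[Path] = []
    if not outputs_dir.is_dir():
        return removed
    for path in sorted(outputs_dir.iterdir()):
        if not path.is_file() or path.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        # Compare the file's last-modified time against the cutoff.
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

Calling `delete_old_audio(Path("/path/to/outputs"), 24 * 3600)` would remove audio older than one day from a folder of your choosing; the path here is a placeholder, not the skill's actual location.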

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep

Findings (4)

Missing User Warnings

Medium

Confidence: 96%
Finding: The README states that non-Chinese text is processed via Edge-TTS cloud synthesis, but it does not clearly warn users that their input text will be transmitted to a remote third-party service. In a caregiving/assistant context, users may submit sensitive personal or health-related content, so undisclosed remote processing creates a meaningful privacy and data-handling risk.

Missing User Warnings

Medium

Confidence: 94%
Finding: The documentation states that non-Chinese text uses Microsoft Edge TTS, but it does not clearly warn users that their input text will be transmitted to an external cloud service. In a voice skill context, users may submit sensitive or personal content, so lack of explicit disclosure creates a privacy and data-handling risk even if the feature is intentional.

Missing User Warnings

Medium

Confidence: 95%
Finding: For non-Chinese text, the skill sends user-provided content to the cloud Edge-TTS service automatically, but there is no clear user-facing disclosure or consent mechanism outside debug logging. This can expose sensitive text, secrets, or personal data to a third-party service unexpectedly, which is a real privacy and data-handling risk in a voice-generation skill.

Natural-Language Policy Violations

Medium

Confidence: 92%
Finding: The design hardcodes engine routing by language: Chinese is processed locally while other languages are automatically sent to the cloud service, with no user choice. This is dangerous because a user's privacy expectations may differ by content rather than language, and the code can silently route sensitive non-Chinese text off-device.
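One way to address this finding is to make cloud routing an explicit opt-in. The sketch below is illustrative only and is not the skill's code: the names `choose_engine`, `allow_cloud`, and `CloudConsentRequired` are hypothetical, and detecting Chinese by the presence of any character in the basic CJK Unified Ideographs block is an assumption about how the skill classifies text.

```python
import re

# Assumed heuristic: any character in the basic CJK Unified Ideographs block
# marks the text as Chinese. The skill's real detection logic is not shown
# in this report.
_CJK = re.compile(r"[\u4e00-\u9fff]")


class CloudConsentRequired(Exception):
    """Raised when text would be sent off-device without explicit opt-in."""


def contains_chinese(text: str) -> bool:
    return bool(_CJK.search(text))


def choose_engine(text: str, allow_cloud: bool = False) -> str:
    """Return "local" or "cloud" for the given text.

    Chinese text stays on the local engine. Any other text goes to the
    cloud engine only when the caller has explicitly passed
    allow_cloud=True; otherwise the function refuses instead of
    silently routing the text off-device.
    """
    if contains_chinese(text):
        return "local"
    if not allow_cloud:
        raise CloudConsentRequired(
            "Non-Chinese text would be sent to a cloud TTS service; "
            "pass allow_cloud=True to confirm."
        )
    return "cloud"
```

With this shape, the default is to fail closed: a caller handling sensitive non-Chinese text gets an error it can surface to the user rather than an unannounced network request.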

VirusTotal

63/63 vendors flagged this skill as clean.
