Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Voice (Edge TTS)

v1.10.0

Convert text to speech using Microsoft Edge TTS with real-time streaming, customizable voice settings, and support for multiple languages including Chinese a...

2 stars · 837 downloads · 5 current · 5 all-time
by zhaov (@zhaov1976)
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The skill's files and docs align with a Microsoft Edge TTS streaming tool (requires edge-tts and ffmpeg). However, package.json/lock also list an npm 'edge-tts' dependency while the implementation calls the Python CLI (pip edge-tts). This mismatch is odd but plausibly a packaging oversight rather than malicious.
Instruction Scope
SKILL.md repeatedly asserts 'no shell execution' and a strict voice whitelist, but the code contradicts this: index.js builds and runs a concatenated command string via execAsync for general TTS and for installation, and does not apply the voice whitelist for the 'tts' action (whitelist only enforced for 'stream'). The 'play' action calls PowerShell -c with an interpolated file path string, which can be abused if a user-controlled filePath is provided. These inconsistencies increase the chance of command injection or unexpected execution.
Install Mechanism
There is no platform install spec, but the skill contains an installDependencies method that runs 'pip3 install edge-tts' at runtime (network fetch). package-lock.json shows an npm package resolved from a non-default mirror. Runtime installation and mixed packaging (Python CLI expected + npm dependency present) are moderate risk and should be reviewed.
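
One alternative to installing at runtime is to check that the dependencies are already present and report what is missing. The following is a minimal sketch under stated assumptions, not the skill's code: `commandAvailable` and `missingDependencies` are hypothetical helper names, and the probe arguments (`--help` for edge-tts, `-version` for ffmpeg) are assumptions about what each tool accepts.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: true if `command` can be started and exits with status 0.
// Arguments are passed as an array, so no shell parses them.
function commandAvailable(command, args = []) {
  const result = spawnSync(command, args, {
    shell: false,
    stdio: 'ignore',
    timeout: 10000,
  });
  return !result.error && result.status === 0;
}

// Hypothetical helper: names of the probed tools that could not be run.
function missingDependencies(probes) {
  return probes
    .filter(({ command, args }) => !commandAvailable(command, args))
    .map(({ command }) => command);
}

// Assumed probes for the two dependencies the skill documents.
const dependencyProbes = [
  { command: 'edge-tts', args: ['--help'] },
  { command: 'ffmpeg', args: ['-version'] },
];

// Example: const missing = missingDependencies(dependencyProbes);
```

A caller could refuse to run and print the missing names, leaving the actual installation to the operator in a controlled environment.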
Credentials
The skill requests no environment variables or credentials, which is proportional to a local TTS/playback tool. There are no declared secrets, though runtime pip/network access will occur if install is invoked.
Persistence & Privilege
The skill is not 'always: true' and does not request elevated platform persistence. It does create and clean a local temp directory under a relative path, which is expected behavior for temporary audio files.
Scan Findings in Context
[use_of_exec_with_concatenated_command] unexpected: index.js uses execAsync() with a command string built by joining arguments (edge-tts --text "..." ...). SKILL.md claims command injection protection and use of spawn instead — this is a direct contradiction and increases injection risk.
[unvalidated_user_input_in_cmd] unexpected: The textToSpeech path accepts user text and voice and inserts them into a shell command string without applying the documented voice whitelist (whitelist is only used for 'stream'). User-controlled text/voice could affect the constructed command.
[powershell_command_execution_with_interpolated_string] unexpected: playAudio uses spawn('powershell', ['-c', `(New-Object Media.SoundPlayer "${filePath}").PlaySync();`]) — passing an interpolated string to PowerShell -c can be dangerous if filePath is attacker-controlled (the 'play' action accepts a filePath parameter).
[hardcoded_ffmpeg_path_in_python_script] unexpected: stream_speak.py hardcodes FFMPEG_PATH to 'E:\tools\ffmpeg\bin\ffplay'. Hardcoded absolute paths reduce portability and may mask behavior on systems without that path; it's not necessary for legitimate cross-platform skill behavior.
[runtime_package_install_via_pip] expected: The skill runs 'pip3 install edge-tts' when installDependencies is invoked; network fetch of the edge-tts Python package is expected for this skill but it increases runtime risk and should be done deliberately in a controlled environment.
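
For the `play` finding above, one way to reduce the risk is to accept only paths that resolve inside the skill's temp directory and to keep the PowerShell script text constant, handing the path over in an environment variable instead of interpolating it. This is a sketch under assumptions: `isInsideDirectory`, `buildPlayInvocation`, and the variable name `VOICE_SKILL_FILE` are hypothetical, and the check does not resolve symlinks.

```javascript
const path = require('node:path');

// Hypothetical helper: true only if filePath resolves strictly inside baseDir.
function isInsideDirectory(filePath, baseDir) {
  if (typeof filePath !== 'string' || filePath.length === 0) return false;
  const base = path.resolve(baseDir);
  const target = path.resolve(base, filePath);
  const relative = path.relative(base, target);
  if (relative === '' || path.isAbsolute(relative)) return false;
  return relative !== '..' && !relative.startsWith('..' + path.sep);
}

// Hypothetical helper: describe a PowerShell call whose script is a constant.
// The path is read from an environment variable (name is an assumption),
// so it is never parsed as PowerShell source.
function buildPlayInvocation(filePath, baseDir) {
  if (!isInsideDirectory(filePath, baseDir)) {
    throw new Error('filePath must be inside the skill temp directory');
  }
  return {
    command: 'powershell',
    args: [
      '-NoProfile',
      '-Command',
      '(New-Object Media.SoundPlayer $env:VOICE_SKILL_FILE).PlaySync();',
    ],
    options: {
      shell: false,
      env: { ...process.env, VOICE_SKILL_FILE: path.resolve(baseDir, filePath) },
    },
  };
}
```

The returned object can be passed to `spawn(command, args, options)`; a file name containing quotes or semicolons then stays data rather than becoming part of the script.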
What to consider before installing
This skill appears to be a legitimate Edge TTS tool, but the implementation contradicts its security claims. Before installing or enabling it:

1. Do not run it in a sensitive environment until the code is audited.
2. Fix the command-execution issues: replace execAsync(string) with spawn/execFile and consistently apply the voice whitelist for all actions.
3. Sanitize and/or restrict inputs used in any command or PowerShell -c invocation (the 'play' action accepts an arbitrary filePath).
4. Remove or correct the hardcoded ffplay path in stream_speak.py and ensure ffmpeg/ffplay usage is documented and optional.
5. Prefer pre-installing Python deps (pip install edge-tts) in a controlled environment rather than allowing runtime pip installs, and verify the source of any npm/pip packages (the package-lock references a non-default mirror).

If you are not comfortable reviewing or changing code, avoid installing this skill or run it in an isolated sandbox.


Tags: chinese, latest, security, streaming, tts, voice
837 downloads · 2 stars · 2 versions
Updated 14h ago
License: MIT-0

Voice Skill (Edge TTS)

A text-to-speech skill that uses the Microsoft Edge TTS engine and supports real-time streaming playback.

Features

  • Edge TTS Engine - High-quality text-to-speech using Microsoft Edge
  • Streaming Playback - Real-time audio streaming (audio plays while it is being generated)
  • Multiple Voices - Support for Chinese, English, Japanese, and Korean voices
  • Customizable - Adjust rate, volume, and pitch
  • Secure Implementation - No command injection vulnerabilities

Installation

1. Install Python dependencies

pip install edge-tts

2. Install ffmpeg (required for streaming)

Windows: Download a build from https://github.com/GyanD/codexffmpeg/releases, extract it, and add the bin folder to PATH.

macOS:

brew install ffmpeg

Linux:

sudo apt install ffmpeg

Usage

Streaming Playback (Recommended)

Real-time audio generation and playback:

// Basic usage
await skill.execute({
  action: 'stream',
  text: '你好,我是小九' // "Hello, I am Xiaojiu"
});

// With custom voice
await skill.execute({
  action: 'stream',
  text: 'Hello, how are you?',
  options: {
    voice: 'en-US-Standard-D', // must appear in the voice whitelist
    rate: '+10%',
    volume: '+0%',
    pitch: '+0Hz'
  }
});

Text-to-Speech to File

await skill.execute({
  action: 'tts',
  text: 'Hello, how are you today?',
  options: {
    voice: 'zh-CN-XiaoxiaoNeural'
  }
});
// Returns: { success: true, media: 'MEDIA: /path/to/file.mp3' }
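
The `media` field in the example result carries the file path behind a `MEDIA:` prefix. A caller that needs the bare path could strip the prefix. This is a minimal sketch that assumes the result has exactly the shape shown in the comment above; `extractMediaPath` is a hypothetical helper, not part of the skill.

```javascript
// Hypothetical helper: pull the file path out of a result shaped like
// { success: true, media: 'MEDIA: /path/to/file.mp3' }.
// Returns null when the call failed or the field is not in the assumed format.
function extractMediaPath(result) {
  const prefix = 'MEDIA:';
  if (!result || result.success !== true || typeof result.media !== 'string') {
    return null;
  }
  if (!result.media.startsWith(prefix)) return null;
  const filePath = result.media.slice(prefix.length).trim();
  return filePath.length > 0 ? filePath : null;
}
```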

Direct Speak

await skill.execute({
  action: 'speak',
  text: 'Hello!'
});

List Available Voices

await skill.execute({
  action: 'voices'
});

Available Voices

| Language | Voice ID |
| --- | --- |
| Chinese (Female) | zh-CN-XiaoxiaoNeural |
| Chinese (Male) | zh-CN-YunxiNeural |
| Chinese (Male) | zh-CN-YunyangNeural |
| English (US Female) | en-US-Standard-A |
| English (US Male) | en-US-Standard-D |
| English (UK) | en-GB-Standard-A |
| Japanese | ja-JP-NanamiNeural |
| Korean | ko-KR-SunHiNeural |

Options

| Option | Default | Description |
| --- | --- | --- |
| voice | zh-CN-XiaoxiaoNeural | Voice ID |
| rate | +0% | Speech rate (-50% to +100%) |
| volume | +0% | Volume adjustment (-50% to +50%) |
| pitch | +0Hz | Pitch adjustment |
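
The formats and ranges in the table can be checked before any value reaches a command line. The sketch below is one possible validator, not the skill's code: `validateOptions` is a hypothetical name, the ranges come from the table above, and because no pitch range is documented, only the pitch format (a signed number followed by `Hz`) is checked.

```javascript
// Signed percentage such as '+10%' or '-50%'.
const PERCENT = /^([+-])(\d{1,3})%$/;
// Signed hertz value such as '+0Hz'; the range is not documented, so only the format is checked.
const HERTZ = /^[+-]\d{1,3}Hz$/;

// True if `value` is a signed percentage within [min, max].
function percentInRange(value, min, max) {
  const match = PERCENT.exec(value);
  if (!match) return false;
  const amount = Number(match[2]) * (match[1] === '-' ? -1 : 1);
  return amount >= min && amount <= max;
}

// Hypothetical validator: returns a list of error messages, empty when all options are valid.
function validateOptions({ rate = '+0%', volume = '+0%', pitch = '+0Hz' } = {}) {
  const errors = [];
  if (!percentInRange(rate, -50, 100)) {
    errors.push(`rate must be between -50% and +100%, got ${rate}`);
  }
  if (!percentInRange(volume, -50, 50)) {
    errors.push(`volume must be between -50% and +50%, got ${volume}`);
  }
  if (!HERTZ.test(pitch)) {
    errors.push(`pitch must look like +0Hz, got ${pitch}`);
  }
  return errors;
}
```

Because the patterns are anchored at both ends, a value such as `+10%; rm -rf /` is rejected outright instead of being passed along.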

Security

The skill's documentation lists the following security practices:

🛡️ Security Features

| Feature | Implementation |
| --- | --- |
| Input Validation | Voice parameter whitelist validation - only allowed voices can be used |
| No Shell Execution | Uses spawn() with array arguments instead of shell command concatenation |
| Command Injection Prevention | All user inputs are properly validated and escaped |
| Path Safety | Fixed script path prevents path traversal |

Security Details

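const { exec, spawn } = require('child_process');

// ❌ UNSAFE - exec runs the string through a shell, so quotes or `;` in user input can inject commands
exec(`py script.py "${userText}" --voice ${userVoice}`);

// ✅ SAFE - spawn passes each array element as a separate argument; no shell parses them
spawn('py', [scriptPath, text, '--voice', voice], { shell: false });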
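
The same pattern can be sketched for a direct call to the edge-tts command-line tool. This is an illustration under assumptions, not the skill's implementation: `buildEdgeTtsArgs` and `synthesizeToFile` are hypothetical names, and the flags used (`--text`, `--voice`, `--rate`, `--volume`, `--pitch`, `--write-media`) should be checked against `edge-tts --help` for the installed version. The `--flag=value` form is used so that a value beginning with `-` (for example a rate of `-50%`) is not mistaken for another option.

```javascript
const { execFile } = require('node:child_process');

// Hypothetical helper: build the argument array for the edge-tts CLI.
// Each flag and its value form one array element, so quotes or semicolons
// in `text` reach the program as literal characters.
function buildEdgeTtsArgs({ text, voice, outputPath, rate = '+0%', volume = '+0%', pitch = '+0Hz' }) {
  if (typeof text !== 'string' || text.length === 0) {
    throw new TypeError('text must be a non-empty string');
  }
  return [
    `--voice=${voice}`,
    `--rate=${rate}`,
    `--volume=${volume}`,
    `--pitch=${pitch}`,
    `--write-media=${outputPath}`,
    `--text=${text}`,
  ];
}

// Hypothetical wrapper: run the CLI without a shell and resolve with the output path.
function synthesizeToFile(params) {
  return new Promise((resolve, reject) => {
    execFile('edge-tts', buildEdgeTtsArgs(params), { shell: false }, (error) => {
      if (error) reject(error);
      else resolve(params.outputPath);
    });
  });
}
```

Keeping the argument construction in a pure function makes it easy to unit-test with hostile input without starting a process.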

Voice Whitelist

Only these voices are allowed:

const allowedVoices = [
  'zh-CN-XiaoxiaoNeural', 'zh-CN-YunxiNeural', 'zh-CN-YunyangNeural',
  'zh-CN-YunyouNeural', 'zh-CN-XiaomoNeural',
  'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Wavenet-F',
  'en-GB-Standard-A', 'en-GB-Wavenet-A',
  'ja-JP-NanamiNeural', 'ko-KR-SunHiNeural'
];

Any invalid voice parameter will be rejected and replaced with the default voice.
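
The fallback rule can be sketched as one function applied before dispatching on `action`, so that `stream`, `tts`, and `speak` all receive the same validated voice. The names `resolveVoice` and `normalizeRequest` are hypothetical; the list is copied from the whitelist above, and the default is the one given in the Options section.

```javascript
const DEFAULT_VOICE = 'zh-CN-XiaoxiaoNeural';

// Copied from the documented whitelist.
const ALLOWED_VOICES = new Set([
  'zh-CN-XiaoxiaoNeural', 'zh-CN-YunxiNeural', 'zh-CN-YunyangNeural',
  'zh-CN-YunyouNeural', 'zh-CN-XiaomoNeural',
  'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Wavenet-F',
  'en-GB-Standard-A', 'en-GB-Wavenet-A',
  'ja-JP-NanamiNeural', 'ko-KR-SunHiNeural',
]);

// Return the requested voice if it is whitelisted, otherwise the default.
function resolveVoice(requested) {
  return typeof requested === 'string' && ALLOWED_VOICES.has(requested)
    ? requested
    : DEFAULT_VOICE;
}

// Hypothetical: normalize a request once, regardless of which action it names.
function normalizeRequest(request) {
  const options = { ...(request.options || {}) };
  options.voice = resolveVoice(options.voice);
  return { ...request, options };
}
```

With this in place, a request such as `{ action: 'tts', options: { voice: 'en-US-Standard-A' } }` falls back to the default voice, because that ID is not in the list.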

Changelog

v1.10.0 (2026-02-24)

  • Enterprise-grade security - Full command injection protection
  • Voice whitelist validation
  • Replaced exec with spawn for secure process execution
  • Input sanitization for all parameters

v1.1.0

  • Add streaming playback support (audio plays while it is being generated)
  • Add ffmpeg dependency
  • Fix command injection vulnerability
  • Add voice whitelist validation

v1.0.0

  • Initial release with basic TTS support
