Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Xiaomi MiMo TTS

Text-to-speech using Xiaomi MiMo TTS API. Generates WAV audio files. Triggers when user says "send voice message", "voice reply", "read to me", "use clip voi...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 110 · 2 current installs · 2 all-time installs
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The skill's name, description, SKILL.md examples, and scripts all match a Xiaomi MiMo TTS integration (calls a mimo-v2-tts model at api.xiaomimimo.com). That functionality is coherent with the stated purpose.
Instruction Scope
Runtime instructions and examples are within the TTS use case. However, the runtime script attempts to read ~/.openclaw/config.json to find an API key if the MIMO_API_KEY env var is not set. This file access is not declared in the SKILL metadata, and it broadens the instruction scope to local configuration data.
Install Mechanism
No install spec or external downloads; the skill is instruction-only plus a small helper script. No installer or archive URLs are present.
Credentials
The manifest lists no required environment variables but the code clearly requires MIMO_API_KEY (and also looks for mimo_api_key / MIMO_API_KEY inside ~/.openclaw/config.json). Requiring the API key is reasonable for a TTS skill, but omitting it from the declared requirements and silently reading a local config file is disproportionate and should be documented.
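
The lookup order described in this finding can be sketched as follows. This is a minimal illustration and not the skill's actual code: the function name `resolve_api_key` and its parameters are hypothetical, and only the env var name, the config path, and the two key names come from the scan report.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Path and key names are taken from the scan report; everything else
# (function name, parameters, return convention) is a hypothetical sketch.
DEFAULT_CONFIG = Path.home() / ".openclaw" / "config.json"


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    config_path: Path = DEFAULT_CONFIG,
) -> Optional[str]:
    """Return the MiMo API key: environment variable first, then the config file."""
    env = os.environ if env is None else env
    key = env.get("MIMO_API_KEY")
    if key:
        return key
    try:
        config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None  # no readable config file; the caller decides what to do
    if not isinstance(config, dict):
        return None
    # Only these two fields are consulted after the file is parsed.
    return config.get("mimo_api_key") or config.get("MIMO_API_KEY")
```

Passing the environment and path as parameters makes the fallback visible and easy to disable, which is the behavior the scan asks the publisher to document.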
Persistence & Privilege
The skill does not request always:true, does not modify other skills or system-wide settings, and does not install persistent components. It will make outbound requests to the declared API endpoint as part of normal operation.
What to consider before installing
This package appears to implement the described MiMo TTS functionality, but there are two mismatches to consider before installing:

  • The script requires an API key (MIMO_API_KEY), but the skill metadata declares no required env vars. Ask the publisher to add MIMO_API_KEY to requires.env so the requirement is explicit.
  • If the env var is not set, the script tries to read ~/.openclaw/config.json to find the key. Verify that you are comfortable with the skill reading that config file (it loads the whole JSON).

Additional checks: confirm that the API endpoint (https://api.xiaomimimo.com) is legitimate for the provider, inspect the code yourself (it is short) to ensure it sends only the text you provide and the key, and prefer providing the API key explicitly in a controlled place rather than leaving it in shared config. If the publisher updates the manifest to declare the required env var and documents the config-file lookup, and you verify that the remote endpoint is legitimate, the mismatch concern would be resolved.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.2
latest vk97dwaz2tfcr2bqfypsf6qe01n836s37

SKILL.md

Xiaomi MiMo TTS

Quick Usage

Just say "send voice" + what you want me to say, or describe the voice style you want.

Default Config

  • Default Voice: default_zh (Chinese female)
  • Default Style: <style>夹子音</style> (cute/clip voice, used when no style specified)

Available Voices

| Voice Name | voice parameter |
| --- | --- |
| MiMo-Default | mimo_default |
| MiMo-Chinese-Female | default_zh |
| MiMo-English-Female | default_eh |

Style Control

Overall Style (at the beginning of text)

| Style Type | Examples |
| --- | --- |
| Speed Control | 变快 (faster) / 变慢 (slower) |
| Emotion | 开心 (happy) / 悲伤 (sad) / 生气 (angry) |
| Character | 孙悟空 (Wukong) / 林黛玉 (Lin Daiyu) |
| Style Variations | 悄悄话 (whisper) / 夹子音 (clip voice) / 台湾腔 (Taiwanese accent) |
| Dialect | 东北话 (Northeast) / 四川话 (Sichuan) / 河南话 (Henan) / 粤语 (Cantonese) |

Format: <style>style1 style2</style>text to synthesize
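
The format above can be produced with a small helper. This is a sketch only: `with_style` is a hypothetical name, not part of the skill or the MiMo API, and it assumes style names are separated by a single space, as the format line shows.

```python
from typing import Iterable


def with_style(text: str, styles: Iterable[str] = ()) -> str:
    """Prefix text with a <style> tag; style names are joined by a single space."""
    names = [s.strip() for s in styles if s.strip()]
    if not names:
        return text  # no style requested: leave the text untouched
    return f"<style>{' '.join(names)}</style>{text}"


print(with_style("你好", ["开心", "变快"]))  # <style>开心 变快</style>你好
```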

Audio Tags (Fine-grained Control)

Use parentheses () to annotate emotion, speed, pauses, breathing, and so on:

| Tag | Description | Example |
| --- | --- | --- |
| (紧张,深呼吸) | Multi-emotion combo | (紧张,深呼吸)呼……冷静,冷静 |
| (语速加快) | Speed change | (语速加快,碎碎念) |
| (小声) | Volume control | (小声)哎呀,领带歪没歪? |
| (长叹一口气) | Sigh | (长叹一口气) |
| (咳嗽) | Cough | (咳嗽)简直能把人骨头冻透了 |
| (沉默片刻) | Pause | (沉默片刻) |
| (苦笑) | Bitter smile | (苦笑)呵,没如果了 |
| (提高音量喊话) | Loud shout | (提高音量喊话)大姐!这鱼新鲜着呢! |
| (极其疲惫,有气无力) | Exhausted | 师傅……到地方了叫我一声…… |
| (寒冷导致的急促呼吸) | Environmental | 呼——呼——这、这大兴安岭的雪…… |
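
As an illustration of how such annotations might be attached to text programmatically, here is a minimal sketch. The helper name `with_cues` is hypothetical, and joining multiple cues with a comma is an assumption taken from the combined examples in the table, not from API documentation.

```python
def with_cues(text: str, *cues: str) -> str:
    """Prefix text with one parenthesized annotation; cues are comma-separated."""
    cleaned = [c.strip() for c in cues if c.strip()]
    if not cleaned:
        return text  # nothing to annotate
    return f"({','.join(cleaned)}){text}"


print(with_cues("呼……冷静,冷静", "紧张", "深呼吸"))  # (紧张,深呼吸)呼……冷静,冷静
```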

Synthesis Example:

import os
import base64
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("MIMO_API_KEY"),
    base_url="https://api.xiaomimimo.com/v1"
)

# Clip voice style
text = "<style>夹子音</style>主人~我来啦!今天有什么需要帮忙的吗~"

completion = client.chat.completions.create(
    model="mimo-v2-tts",
    messages=[
        {"role": "user", "content": "你好"},
        {"role": "assistant", "content": text}
    ],
    audio={"format": "wav", "voice": "default_zh"}
)

audio_bytes = base64.b64decode(completion.choices[0].message.audio.data)
with open("output.wav", "wb") as f:
    f.write(audio_bytes)

Notes

  • Target text must be in the assistant role message, not in the user message
  • <style> tag must be at the beginning of target text
  • For singing: <style>唱歌</style>target text
  • Returns base64-encoded WAV audio
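
The notes above can be turned into two small checks. This is a sketch under stated assumptions: `build_messages` and `decode_wav` are hypothetical helper names, the default user prompt is simply the one used in the synthesis example, and the RIFF/WAVE header test reflects the general WAV container layout rather than anything the MiMo documentation guarantees.

```python
import base64
from typing import Dict, List


def build_messages(target_text: str, user_prompt: str = "你好") -> List[Dict[str, str]]:
    """Place the text to synthesize in the assistant message, as the notes require."""
    if "<style>" in target_text and not target_text.startswith("<style>"):
        raise ValueError("<style> tag must be at the beginning of the target text")
    return [
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": target_text},
    ]


def decode_wav(audio_b64: str) -> bytes:
    """Decode the base64 payload and check for a RIFF/WAVE container header."""
    data = base64.b64decode(audio_b64)
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("decoded payload does not look like a WAV file")
    return data
```

Validating the header before writing the file catches the case where an error body, rather than audio, was returned and saved as output.wav.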

Script Usage

Use scripts/mimo_tts.py for speech synthesis:

MIMO_API_KEY=your_api_key python3 scripts/mimo_tts.py "text to synthesize" --voice default_zh --style "夹子音" --output output.wav

Note: Set the MIMO_API_KEY environment variable, or configure the key in OpenClaw settings.
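
For readers who want to see how a command line like the one above could be parsed, here is a hedged sketch using `argparse`. It is not the contents of scripts/mimo_tts.py: the flag names come from the usage line, the voice and style defaults come from the Default Config section, and the `output.wav` default is an assumption.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse a command line shaped like the usage example (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Synthesize speech with MiMo TTS")
    parser.add_argument("text", help="text to synthesize")
    parser.add_argument("--voice", default="default_zh", help="voice parameter")
    parser.add_argument("--style", default="夹子音", help="overall style name")
    # Assumed default; the listing does not state what the real script uses.
    parser.add_argument("--output", default="output.wav", help="WAV file to write")
    return parser.parse_args(argv)


args = parse_args(["你好", "--voice", "mimo_default"])
print(args.voice, args.style, args.output)  # mimo_default 夹子音 output.wav
```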

Files

2 total