Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Mimo Tts Asr

v2.5.4

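Xiaomi MiMo V2.5 TTS + ASR all-in-one voice skill. Supports high-quality Chinese and English speech synthesis (TTS) and speech recognition (ASR). TTS: three models (built-in premium voices / VoiceDesign voice design / VoiceClone voice cloning), dialect support, emotion control, multi-format output. ASR: audio-to-text, multilingual recognition, dialects, Code...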


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for aaroncxxx/mimo-tts-asr.

Install the skill "Mimo Tts Asr" (aaroncxxx/mimo-tts-asr) from ClawHub.
Skill page: https://clawhub.ai/aaroncxxx/mimo-tts-asr
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install mimo-tts-asr

ClawHub CLI

npx clawhub@latest install mimo-tts-asr
Security Scan
Capability signals
Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name/description match the included scripts and behavior: TTS and ASR functionality is implemented, with support for voice-design and voice-clone. However the registry lists no required credentials while both SKILL.md and the scripts clearly require API keys (MIMO_API_KEY / MIMO_ASR_KEY). Also the package author/owner in metadata (aaroncxxx / kn75hrty...) does not match an official Xiaomi homepage; the SKILL.md links point to Xiaomi domains, but the skill's provenance is not verified.
Instruction Scope
Runtime instructions and the included scripts are narrowly scoped to reading audio/text and calling MiMo APIs (or offering local open-source model use). They do not attempt to read unrelated system files or environment variables beyond the service keys. Important: both ASR and TTS implementations upload audio (including reference audio for voice-clone encoded in base64) to external endpoints (api.xiaomimimo.com/platform.xiaomimimo.com). Users should expect audio and reference clips to be transmitted to that service.
Install Mechanism
This is an instruction-only skill with two small Python scripts included and no install spec or external downloads. No archive downloads, package installs, or post-install scripts are present — low filesystem/install risk.
Credentials
The skill requires API keys (MIMO_API_KEY and/or MIMO_ASR_KEY) to call the vendor's APIs, which is reasonable for a cloud TTS/ASR integration. The concern is that the registry metadata did not declare any required env vars or a primary credential, creating a mismatch between claims and actual needs. Because audio (and reference audio for cloning) will be uploaded, supplying keys grants the skill/network access tied to your account — confirm scopes, billing, and data-retention policies before providing keys.
Persistence & Privilege
The skill does not request always:true and does not attempt to modify other skills or system-wide settings. It relies on agent invocation as normal. No elevated persistence or privileged system access is requested.
What to consider before installing
This skill appears to implement the advertised MiMo TTS and ASR features, but there are a few things to check before installing:
- Provenance: The SKILL.md links to Xiaomi/MiMo domains, but the package author/owner in metadata is not clearly an official Xiaomi account. Verify the skill's source (official repo or vendor) before trusting it.
- API keys: The scripts require MIMO_API_KEY and/or MIMO_ASR_KEY, but the registry metadata lists no required env vars. Expect to provide an API key if you want cloud calls to work. Do not supply a high-privilege or unrelated credential; create a dedicated key with minimal scopes if possible.
- Privacy: Using the cloud API uploads audio (and any reference audio used for voice-clone) to api.xiaomimimo.com. If audio contains sensitive or personal data, consider running the open-source local ASR model instead or avoid uploading sensitive recordings.
- Voice cloning legality/consent: Voice-clone will upload reference audio (encoded in the request). Ensure you have consent to clone any person's voice.
- Practical step: Inspect the included scripts (they are small and readable) and test the skill in a sandboxed environment with a throwaway API key or using local open-source models before enabling it for real data.

If you need higher assurance, ask the publisher for a homepage or official vendor verification and request that required env vars be declared in the registry metadata.

Like a lobster shell, security has layers — review code before you run it.

latest vk9753tmaja6xhp0gtgxm48qm1d85d97w
64 downloads
0 stars
2 versions
Updated 4d ago
v2.5.4
MIT-0

Xiaomi MiMo-V2.5-TTS-Series + ASR: your voice, at your command

v2.5.4 · A full-pipeline voice model series for the agent era

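Official Links

| Resource | Link |
| --- | --- |
| 📖 Release announcement | MiMo-V2.5-TTS-Series + ASR official release |
| 📚 TTS API docs | Speech synthesis (MiMo-V2.5-TTS series) |
| 📚 ASR API docs | Audio understanding |
| 🎮 MiMo Studio demo | aistudio.xiaomimimo.com/#/c |
| 🔧 Official Skill repository | github.com/XiaomiMiMo/MiMo-Skills |
| 🤗 ASR open-source code | github.com/XiaomiMiMo/MiMo-V2.5-ASR |
| 🤗 ASR model weights | huggingface.co/XiaomiMiMo/MiMo-V2.5-ASR |
| 🤗 ASR demo | huggingface.co/spaces/XiaomiMiMo/MiMo-V2.5-ASR |
| 📋 Pricing and rate limits | Pricing details |
| 🌐 MiMo open platform | platform.xiaomimimo.com |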

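Overview

TTS: Three Models

| Model | Capability | Use cases |
| --- | --- | --- |
| 🎙️ MiMo-V2.5-TTS | Built-in premium voices with fine-grained control of speaking rate, emotion, and tone | General-purpose speech synthesis |
| 🎨 MiMo-V2.5-TTS-VoiceDesign | Generates a new voice from scratch from a natural-language description (no reference audio needed) | Game NPCs / virtual streamers / brand IP |
| 🔁 MiMo-V2.5-TTS-VoiceClone | High-fidelity voice cloning from a short clip (a few seconds is enough) | Podcast cloning / dubbing reproduction |

All three models are currently free for a limited time.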

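ASR: Speech Recognition

| Capability | Description |
| --- | --- |
| 🌍 Chinese-English bilingual | Switches freely between languages; no need to preset the language |
| 🗣️ Chinese dialects | Wu / Cantonese / Hokkien (Minnan) / Sichuanese |
| 🔀 Code-Switch | Natural transcription of mixed Chinese-English speech |
| 🎵 Song recognition | Chinese and English lyrics, with high accuracy over accompaniment |
| 🔊 Heavy noise | Robust recognition in high-noise and far-field conditions |
| 👥 Multi-speaker | Overlapping multi-party conversations such as meetings |
| 📝 Native punctuation | Automatic punctuation that combines prosody and semantics |

🆓 ASR is open source: GitHub / HuggingFace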


⚙️ Setup

Environment variables

# TTS API key (separate from the model-inference key)
export MIMO_API_KEY="your-tts-api-key"

# ASR API key (the TTS key can be reused if it is the same key)
export MIMO_ASR_KEY="your-asr-api-key"

Or configure it through OpenClaw:

openclaw config set skills.entries.mimo-tts-asr.apiKey "your-key"

⚠️ The TTS/ASR API key is separate from the model-inference key; request one at platform.xiaomimimo.com.
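
The key lookup described above (environment variables, with `--api-key` as an override) could be sketched as follows. This is a minimal illustration, not the logic of the bundled scripts: `resolve_api_key` is a hypothetical helper, and letting ASR fall back to `MIMO_API_KEY` when `MIMO_ASR_KEY` is unset is an assumption based on the note that the same key can be reused.

```python
import os
from typing import Mapping, Optional, Sequence


def resolve_api_key(
    cli_value: Optional[str],
    env_names: Sequence[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the first usable key: CLI override first, then each env var in order."""
    if cli_value and cli_value.strip():
        return cli_value.strip()
    for name in env_names:
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError("No API key found; set one of: " + ", ".join(env_names))


# TTS reads MIMO_API_KEY. ASR tries MIMO_ASR_KEY first and then falls back to
# MIMO_API_KEY (the fallback is an assumption of this sketch).
TTS_ENV = ("MIMO_API_KEY",)
ASR_ENV = ("MIMO_ASR_KEY", "MIMO_API_KEY")
```

Failing early with a clear message is a deliberate choice here: a missing key would otherwise only surface later as a 401 response from the service.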


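🎙️ TTS: Speech Synthesis

Basic usage

python3 "{baseDir}/scripts/tts.py" "text to synthesize" -o output.wav

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| text | (required) | Text to synthesize |
| -o | output.wav | Output file path |
| -m | tts | Model: tts / voice-design / voice-clone |
| -v | mimo_default | Voice (see the voice list) |
| -s | | Style tags |
| -f | wav | Audio format: wav / mp3 / ogg |
| --voice-desc | | VoiceDesign: voice description text |
| --ref-audio | | VoiceClone: reference audio path |
| --user-msg | | User-role context (adjusts the tone) |
| --api-key | environment variable | API key override |
| --max-retries | 3 | Maximum number of retries |
| --list-voices | | List available voices |
| --list-formats | | List available formats |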
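
When the script is driven from other Python code, the flags in the table can be assembled into an argument list instead of a shell string, which sidesteps quoting problems with arbitrary text. The sketch below uses only flags listed above; `build_tts_command` is a hypothetical helper, and the checks that `voice-design` needs `--voice-desc` and `voice-clone` needs `--ref-audio` are assumptions inferred from the table, not confirmed behavior of `tts.py`.

```python
from typing import List, Optional


def build_tts_command(
    script_path: str,
    text: str,
    output: str = "output.wav",
    model: str = "tts",
    voice: Optional[str] = None,
    style: Optional[str] = None,
    audio_format: Optional[str] = None,
    voice_desc: Optional[str] = None,
    ref_audio: Optional[str] = None,
) -> List[str]:
    """Build an argv list for tts.py using only the documented flags."""
    if model not in ("tts", "voice-design", "voice-clone"):
        raise ValueError(f"unknown model: {model}")
    # Assumption of this sketch: each special model needs its own input.
    if model == "voice-design" and not voice_desc:
        raise ValueError("voice-design needs --voice-desc")
    if model == "voice-clone" and not ref_audio:
        raise ValueError("voice-clone needs --ref-audio")

    argv = ["python3", script_path, text, "-o", output, "-m", model]
    optional_flags = [
        ("-v", voice),
        ("-s", style),
        ("-f", audio_format),
        ("--voice-desc", voice_desc),
        ("--ref-audio", ref_audio),
    ]
    # Only append flags the caller actually set, so script defaults still apply.
    for flag, value in optional_flags:
        if value:
            argv += [flag, value]
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)`.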

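Voice list (MiMo-V2.5-TTS)

| Name | voice parameter | Description |
| --- | --- | --- |
| MiMo-默认 (Default) | mimo_default | General-purpose female voice |
| MiMo-中文 (Chinese) | default_zh | Chinese female voice |
| MiMo-英文 (English) | default_en | English female voice |
| MiMo-男声 (Male) | mimo_male | Male voice |
| MiMo-童声 (Child) | mimo_child | Child voice |
| MiMo-粤语 (Cantonese) | mimo_cantonese | Cantonese |
| MiMo-四川话 (Sichuanese) | mimo_sichuan | Sichuanese |

🎧 Preview the voices: MiMo Studio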

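Style tags

| Style | Scenario | Style | Scenario |
| --- | --- | --- | --- |
| 可爱 (cute) | Coquettish, soft and sweet | 悲伤 (sad) | Sadness, dejection |
| 开心 (happy) | Cheerful, excited | 愤怒 (angry) | Anger, agitation |
| 东北话 (Northeastern dialect) | Dialect, comedy | 平静 (calm) | Calm, soothing |
| 悄悄话 (whisper) | Mysterious, hushed | 惊讶 (surprised) | Surprise, the unexpected |
| 孙悟空 (Sun Wukong) | Role-play | 变快 / 变慢 (faster / slower) | Speaking-rate control |
| 唱歌 (singing) | Children's songs, melodies | | |

Tags can be combined: -s "开心 变快" / -s "可爱 悄悄话" / -s "悲伤 变慢"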
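
Since combined styles are passed as one space-separated string, a small helper can assemble the `-s` value and catch typos before a request is sent. This is a sketch only: `combine_styles` and `KNOWN_STYLES` are hypothetical names, the tag set is copied from the table above and may not be exhaustive, and rejecting unlisted tags is a choice made for this example rather than documented behavior.

```python
from typing import Iterable

# Style tags copied from the table above; the CLI examples pass the Chinese tag text.
KNOWN_STYLES = {
    "可爱", "悲伤", "开心", "愤怒", "东北话", "平静",
    "悄悄话", "惊讶", "孙悟空", "变快", "变慢", "唱歌",
}


def combine_styles(tags: Iterable[str]) -> str:
    """Join style tags with single spaces, rejecting tags not in the table."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    unknown = [tag for tag in cleaned if tag not in KNOWN_STYLES]
    if unknown:
        raise ValueError("unknown style tag(s): " + ", ".join(unknown))
    return " ".join(cleaned)
```

For example, `combine_styles(["开心", "变快"])` produces the same string used in the combined-tag examples.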

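Inline audio tags

Insert these tags in the text for fine-grained control: (停顿) pause, (叹气) sigh, (笑声) laughter, (清嗓子) throat clearing, (耳语) whisper, (紧张) nervous, (小声) quiet voice, (语速加快) speed up, (深呼吸) deep breath, (沉默片刻) brief silence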

Examples

# Basic synthesis
python3 "{baseDir}/scripts/tts.py" "你好,今天天气真好" -o hello.wav

# Dialect
python3 "{baseDir}/scripts/tts.py" "哎呀妈呀,这天儿也忒冷了吧" -s "东北话" -o dongbei.wav

# English
python3 "{baseDir}/scripts/tts.py" "Hello, how are you?" -v default_en -o hello_en.wav

# Emotion
python3 "{baseDir}/scripts/tts.py" "明天就是周五了,真开心!" -s "开心 变快" -o happy.wav

# Singing
python3 "{baseDir}/scripts/tts.py" "一闪一闪亮晶晶" -s "唱歌" -o sing.wav

# Male voice / child voice / dialect voices
python3 "{baseDir}/scripts/tts.py" "大家好" -v mimo_male -o male.wav
python3 "{baseDir}/scripts/tts.py" "妈妈我要吃糖" -v mimo_child -o child.wav
python3 "{baseDir}/scripts/tts.py" "你好,今日天气好好" -v mimo_cantonese -o cantonese.wav
python3 "{baseDir}/scripts/tts.py" "这个火锅巴适得很" -v mimo_sichuan -o sichuan.wav

# MP3 / OGG
python3 "{baseDir}/scripts/tts.py" "测试" -f mp3 -o output.mp3
python3 "{baseDir}/scripts/tts.py" "测试" -f ogg -o output.ogg
# 🎨 VoiceDesign: generate a new voice from a description
python3 "{baseDir}/scripts/tts.py" "你好,欢迎来到我的世界" \
  -m voice-design \
  --voice-desc "一位年迈的东欧裔学者,低沉、略带嘶哑,说话节奏缓慢" \
  -o scholar.wav

python3 "{baseDir}/scripts/tts.py" "元气满满的一天开始啦!" \
  -m voice-design \
  --voice-desc "元气满满的少女,声线清脆,语尾带一点上扬" \
  -o genki.wav

# 🔁 VoiceClone: clone a voice from reference audio
python3 "{baseDir}/scripts/tts.py" "这是克隆后的声音" \
  -m voice-clone \
  --ref-audio reference.wav \
  -o cloned.wav

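🎧 ASR: Speech Recognition

API usage

python3 "{baseDir}/scripts/asr.py" audio.wav
python3 "{baseDir}/scripts/asr.py" audio.mp3 -o transcript.txt
python3 "{baseDir}/scripts/asr.py" audio.wav --lang zh --format json

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| audio | (required) | Audio file path (wav/mp3/ogg/m4a/flac) |
| -o | stdout | Output file path (prints to the terminal by default) |
| --lang | auto | Language: auto / zh / en / ja / ko |
| --format | text | Output format: text / json / srt |
| --api-key | environment variable | API key override |
| --max-retries | 3 | Maximum number of retries |

Output formats

| Format | Description | Use case |
| --- | --- | --- |
| text | Plain text | Quick viewing |
| json | Includes timestamps and confidence scores | Programmatic processing |
| srt | SRT subtitle format | Video subtitles |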
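
To make the relationship between timestamped output and subtitles concrete, the sketch below renders timed segments as SRT cues (a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text). The `(start_seconds, end_seconds, text)` segment shape is hypothetical; the JSON schema that `asr.py` actually emits is not documented here, so its fields would need to be mapped onto this shape first.

```python
from typing import Iterable, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Render numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}"
        )
    return "\n\n".join(cues) + "\n"
```

When only subtitles are needed, `--format srt` already covers this; a conversion like the one above is mainly useful when the JSON output is post-processed (for example, after correcting the text) before subtitles are written.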

Local deployment (open-source model)

The ASR model is open source and can be deployed locally:

# Clone the repository
git clone https://github.com/XiaomiMiMo/MiMo-V2.5-ASR.git
cd MiMo-V2.5-ASR

# Install dependencies
pip install -r requirements.txt

# Run inference with the HuggingFace weights
python inference.py --audio audio.wav --output result.txt

📖 Full documentation: github.com/XiaomiMiMo/MiMo-V2.5-ASR
🤗 Online demo: huggingface.co/spaces/XiaomiMiMo/MiMo-V2.5-ASR

Examples

# Basic transcription
python3 "{baseDir}/scripts/asr.py" recording.wav

# Save to a file
python3 "{baseDir}/scripts/asr.py" meeting.mp3 -o meeting.txt

# Specify the language
python3 "{baseDir}/scripts/asr.py" english.mp3 --lang en

# JSON format (with timestamps)
python3 "{baseDir}/scripts/asr.py" audio.wav --format json

# SRT subtitles
python3 "{baseDir}/scripts/asr.py" video_audio.wav --format srt -o subtitles.srt

🔗 TTS + ASR Combined Workflow

# 1. Transcribe an audio clip first
python3 "{baseDir}/scripts/asr.py" input.wav -o transcript.txt

# 2. Edit the text, then re-synthesize it (with a different voice)
python3 "{baseDir}/scripts/tts.py" "$(cat transcript.txt)" -v mimo_male -o output.wav

# 3. Clone the voice and re-render the text
python3 "{baseDir}/scripts/tts.py" "$(cat transcript.txt)" \
  -m voice-clone --ref-audio original.wav -o cloned.wav
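
Steps 1 and 2 above can also be chained from Python, which avoids shell quoting issues with `$(cat transcript.txt)` when the transcript contains quotes or newlines. This is a sketch under stated assumptions: `transcribe_and_resynthesize` is a hypothetical helper, `base_dir` stands in for `{baseDir}`, and the `run` parameter exists only so the command runner can be swapped out during testing.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(argv: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(list(argv), check=True)


def transcribe_and_resynthesize(
    base_dir: str,
    input_audio: str,
    transcript_path: str,
    output_audio: str,
    voice: str = "mimo_male",
    run: Runner = _run,
) -> List[List[str]]:
    """Run asr.py, read the transcript, then run tts.py with a different voice."""
    scripts = Path(base_dir) / "scripts"
    asr_cmd = ["python3", str(scripts / "asr.py"), input_audio, "-o", transcript_path]
    run(asr_cmd)

    text = Path(transcript_path).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError("transcript is empty; nothing to synthesize")

    tts_cmd = ["python3", str(scripts / "tts.py"), text, "-v", voice, "-o", output_audio]
    run(tts_cmd)
    return [asr_cmd, tts_cmd]
```

Both steps upload audio or text to the cloud API, so the privacy considerations from the security scan apply to each call.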

📋 Delivery

TTS output

MEDIA:output.wav

ASR output

Reply with the transcript directly, or save it to a file and reply with the path.


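Troubleshooting

| Error | Cause | Fix |
| --- | --- | --- |
| 401 Invalid API Key | Key not configured or malformed | Confirm that the dedicated TTS/ASR key is configured |
| 429 Too Many Requests | Rate limit triggered | Wait a few seconds and retry (the script retries automatically) |
| 500 Server Error | Server-side failure | Retry later |
| File not found (文件不存在) | Wrong audio path | Check the file path |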
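
The retry behavior mentioned in the table could look roughly like the sketch below: retry on 429 and 500, give up immediately on other errors such as 401, and stop after `max_retries` retries (3 by default, matching `--max-retries`). This is not the scripts' actual implementation; `ApiError`, `call_with_retries`, and the exponential backoff schedule are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes the troubleshooting table suggests retrying.
RETRYABLE_STATUS = {429, 500}


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(f"{status} {message}".strip())
        self.status = status


def call_with_retries(
    request: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(); on 429/500 wait base_delay * 2**attempt, then try again."""
    attempt = 0
    while True:
        try:
            return request()
        except ApiError as err:
            # Non-retryable errors (e.g. 401) and exhausted budgets propagate.
            if err.status not in RETRYABLE_STATUS or attempt >= max_retries:
                raise
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With the defaults, a request that keeps failing is attempted four times in total (one initial call plus three retries), waiting 1, 2, and 4 seconds in between.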

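📋 Version History

v2.5.4 (2026-04-24)

  • ✨ Added support for the VoiceDesign (voice design) model
  • ✨ Added support for the VoiceClone (voice cloning) model
  • ✨ Added a consolidated list of official resource links
  • ✨ Added ASR local-deployment documentation (open-source model)
  • 📚 Documentation: aligned with the official release notes

v2.5.2 (2026-04-24)

  • ✨ Unified TTS + ASR in one skill
  • ✨ 7 TTS voices + dialects + emotion control
  • ✨ ASR supports auto/zh/en/ja/ko languages
  • ✨ ASR output formats: text / json / srt
  • ✨ Fine-grained control through inline audio tags
  • ✨ MP3/OGG/WAV multi-format support
  • ✨ Automatic retries + rate-limit handling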
