Voice For Openclaw Publish

v1.0.2

MiniMax TTS skill (enhanced). Multi-agent voice support (each agent can select a unique voice written in SOUL.md), native voice message for Telegram (MP3) an...

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for vshen009/voice-for-openclaw.

Install the skill "Voice For Openclaw Publish" (vshen009/voice-for-openclaw) from ClawHub.
Skill page: https://clawhub.ai/vshen009/voice-for-openclaw
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: MINIMAX_API_KEY
Required binaries: python3, ffmpeg
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install voice-for-openclaw

ClawHub CLI

npx clawhub@latest install voice-for-openclaw

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
Name/description, required binaries (python3, ffmpeg), and required env (MINIMAX_API_KEY) match an API-driven TTS skill that can optionally send to Telegram. Optional TELEGRAM_* vars are explained as optional for message delivery.
Instruction Scope
Instructions and scripts only call the MiniMax TTS endpoint and (optionally) Telegram/Feishu delivery paths, read a local .env and write audio files under ~/.openclaw/workspace. They do export .env keys at runtime and show examples invoking 'openclaw message send' for Feishu; these are within the stated TTS/delivery scope but users should be aware audio files and credentials are read/stored locally.
Install Mechanism
No install spec or external downloads. The skill is instruction/code-only and relies on standard binaries (python3, ffmpeg) and Python requests — no arbitrary network installs or remote archives.
Credentials
Only the MiniMax API key is required; optional Telegram vars are clearly optional. No unrelated cloud credentials or broad-scoped secrets are requested.
Persistence & Privilege
always is false and the skill does not request elevated platform privileges or modify other skills. It writes audio files under the user's workspace (~/.openclaw/workspace) which is expected for generated content.
Assessment
This skill appears to do only TTS via MiniMax and optional channel delivery. Before installing: (1) keep MINIMAX_API_KEY and any TELEGRAM_BOT_TOKEN in the local .env (do not commit it); (2) review the scripts yourself (they are included) to confirm behavior; (3) note generated audio is written to ~/.openclaw/workspace/generated — delete files if they contain sensitive content; (4) if you will use Telegram sending, provide a bot token with only the required access and verify target chat IDs; (5) confirm the API endpoint (api.minimaxi.com) is acceptable for your data/privacy needs.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎙️ Clawdis
Bins: python3, ffmpeg
Env: MINIMAX_API_KEY
Primary env: MINIMAX_API_KEY
latest: vk976qjy0f9tqc09gt2kedvg9nh83t05a
161 downloads
0 stars
3 versions
Updated 1mo ago
v1.0.2
MIT-0

MiniMax TTS Plus

Multi-agent + multi-channel native voice message TTS skill.

Core Script

All operations go through tts-xiaoye.sh (TTS generation + channel delivery).

Quick Start

bash tts-xiaoye.sh "Text to speak"

Multi-Channel Usage

| Channel | Command | Format | Notes |
| --- | --- | --- | --- |
| Telegram | tts-xiaoye.sh "Text" | MP3 | Direct send, no transcoding |
| Feishu | tts-xiaoye.sh --feishu "Text" | OGG/Opus | Auto-transcode to native voice bubble |
| Generate only | tts-xiaoye.sh --generate-only "Text" | MP3 | Generate file without sending |

Send Feishu Native Voice Message (Full Flow)

OPUS=$(bash tts-xiaoye.sh --feishu "Feishu voice content" 2>/dev/null | python3 -c "import sys,json; print(json.load(sys.stdin)['audio_file'])")
openclaw message send --channel feishu --account <YOUR_ACCOUNT_ID> --target <FeishuUserID> --media "$OPUS"
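
The two-step flow above can also be sketched in Python. This is a minimal illustration, not the skill's own code: it assumes only what the shell one-liner shows (the script prints a JSON object with an `audio_file` key). The function names, the sample output, and the account and target values are hypothetical.

```python
import json
import shlex
from typing import List


def extract_audio_path(script_stdout: str) -> str:
    """Return the 'audio_file' value from the JSON printed by the TTS script."""
    payload = json.loads(script_stdout)
    return payload["audio_file"]


def build_feishu_send_command(account: str, target: str, media: str) -> List[str]:
    """Assemble the `openclaw message send` argv used in the flow above."""
    return [
        "openclaw", "message", "send",
        "--channel", "feishu",
        "--account", account,
        "--target", target,
        "--media", media,
    ]


if __name__ == "__main__":
    # Made-up sample output; the real script may emit additional keys.
    sample_stdout = '{"audio_file": "/tmp/voice.opus"}'
    media = extract_audio_path(sample_stdout)
    # Placeholder account and target IDs for illustration only.
    print(shlex.join(build_feishu_send_command("my-account", "my-user-id", media)))
```

Building the command as an argument list rather than a single string avoids quoting problems when the media path contains spaces.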

Multi-Agent Voice Configuration

Each agent can choose a unique voice and write it into their SOUL.md Voice Identity section:

## Voice Identity
- TTS model: speech-2.8-hd
- TTS voice: Chinese (Mandarin)_Warm_Girl
- TTS script: scripts/tts-xiaoye.sh
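
How the skill itself reads this section is not documented here. As a rough sketch, an agent-side helper could extract the settings like this, assuming the section always uses `- key: value` bullets under a `## Voice Identity` heading; `parse_voice_identity` is a hypothetical name.

```python
import re
from typing import Dict


def parse_voice_identity(soul_md: str) -> Dict[str, str]:
    """Collect `- key: value` bullets found under the '## Voice Identity' heading."""
    settings: Dict[str, str] = {}
    in_section = False
    for line in soul_md.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Enter the section at its heading; any other level-2 heading ends it.
            in_section = stripped[3:].strip().lower() == "voice identity"
            continue
        if in_section:
            match = re.match(r"-\s*([^:]+):\s*(.+)", stripped)
            if match:
                settings[match.group(1).strip()] = match.group(2).strip()
    return settings


if __name__ == "__main__":
    sample = (
        "## Voice Identity\n"
        "- TTS model: speech-2.8-hd\n"
        "- TTS voice: Chinese (Mandarin)_Warm_Girl\n"
        "- TTS script: scripts/tts-xiaoye.sh\n"
    )
    print(parse_voice_identity(sample))
```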

Recommended voices (verified):

| Voice ID | Style | Use Case |
| --- | --- | --- |
| Chinese (Mandarin)_Warm_Girl | Warm Girl | Personal assistant |
| female-shaonv | Sweet Girl | Default / general |
| female-tianmei | Sweet Female | Gentle style |
| male-qn-qingse | Youthful Male | Male voice scenarios |
| Chinese (Mandarin)_Sweet_Lady | Sweet Lady | Formal occasions |

List Available Voices

bash tts-xiaoye.sh --list-voices
# or directly:
python3 scripts/tts.py --list-voices

This calls the MiniMax API and prints all available voices organized by category (System Voices, Cloned Voices, Generated Voices).

Available Models

| Model | Characteristic |
| --- | --- |
| speech-2.8-hd | Highest quality (recommended) |
| speech-2.8-turbo | Faster, slightly lower quality |

Full Parameters

tts-xiaoye.sh --text "Text" [--voice VoiceID] [--model Model] [--caption Caption]
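
When calling the script from another program, the optional flags in the synopsis above can be assembled conditionally. The sketch below uses only the flags listed in that synopsis; the helper name `build_tts_argv` and the default script path are assumptions.

```python
from typing import List, Optional


def build_tts_argv(
    text: str,
    voice: Optional[str] = None,
    model: Optional[str] = None,
    caption: Optional[str] = None,
    script: str = "tts-xiaoye.sh",  # assumed path; adjust to where the skill lives
) -> List[str]:
    """Build the argv for the TTS script, adding optional flags only when set."""
    argv = ["bash", script, "--text", text]
    for flag, value in (("--voice", voice), ("--model", model), ("--caption", caption)):
        if value:
            argv += [flag, value]
    return argv


if __name__ == "__main__":
    print(build_tts_argv("Text to speak", voice="female-shaonv", model="speech-2.8-turbo"))
```

The resulting list can be passed directly to `subprocess.run`, which keeps voice IDs that contain spaces or parentheses, such as `Chinese (Mandarin)_Warm_Girl`, intact without shell quoting.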

Technical Notes

  • TTS outputs MP3 natively. Telegram sends directly via Bot API sendVoice (MP3 supported natively).
  • Feishu native voice messages require OGG/Opus format. FFmpeg handles the conversion (about 25 ms per audio file, negligible).
  • FFmpeg installation: brew install ffmpeg (macOS or Linuxbrew) or apt install ffmpeg (Debian/Ubuntu).
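
The exact FFmpeg invocation used by the script is not shown here. As an illustration only, an MP3 to OGG/Opus conversion can be done with FFmpeg's `libopus` encoder as sketched below; the 32k bitrate, the `.ogg` output suffix, and the function names are assumptions rather than the skill's actual settings.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def opus_convert_command(mp3_path: str, bitrate: str = "32k") -> List[str]:
    """Build an ffmpeg argv that re-encodes an MP3 as Opus in an OGG container."""
    ogg_path = str(Path(mp3_path).with_suffix(".ogg"))
    # -y overwrites an existing output file; -c:a selects the audio encoder.
    return ["ffmpeg", "-y", "-i", mp3_path, "-c:a", "libopus", "-b:a", bitrate, ogg_path]


def convert_to_opus(mp3_path: str) -> str:
    """Run the conversion and return the output path; raises if ffmpeg is missing."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    command = opus_convert_command(mp3_path)
    subprocess.run(command, check=True, capture_output=True)
    return command[-1]


if __name__ == "__main__":
    print(" ".join(opus_convert_command("voice.mp3")))
```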

Setup

  1. Copy setup.txt to .env and fill in your credentials:
cp skills/voice-for-openclaw/setup.txt skills/voice-for-openclaw/.env
# Then edit .env with your real values
  2. The script loads credentials from .env at runtime; no tokens are hardcoded in the scripts.

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| MINIMAX_API_KEY | ✅ Yes | MiniMax API secret key (from platform.minimax.io) |
| TELEGRAM_BOT_TOKEN | ❌ No | Telegram bot token; only needed for sending |
| TELEGRAM_TARGET | ❌ No | Telegram chat ID; only needed together with the bot token |

⚠️ Security note: Credentials are loaded from .env only — no tokens are hardcoded in shell scripts. The .env file is gitignored and never published.
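
To make the runtime loading concrete, here is a minimal sketch of reading a .env file of `KEY=VALUE` lines and checking the variables from the table above. It is not the skill's loader: the parsing rules (comment lines, optional quotes) and the function names are assumptions.

```python
from typing import Dict


def load_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values: Dict[str, str] = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop surrounding quotes, which .env files often include.
            values[key.strip()] = value.strip().strip("'\"")
    return values


def check_credentials(values: Dict[str, str]) -> bool:
    """Require MINIMAX_API_KEY; return True when Telegram sending is configured."""
    if not values.get("MINIMAX_API_KEY"):
        raise ValueError("MINIMAX_API_KEY is required")
    return bool(values.get("TELEGRAM_BOT_TOKEN") and values.get("TELEGRAM_TARGET"))
```

Returning a flag for the Telegram pair, instead of raising, mirrors the table: those two variables are optional and only matter when both are present.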

💡 API endpoint: The TTS API uses https://api.minimaxi.com (MiniMax's official API server), which is separate from the developer portal at platform.minimax.io.


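MiniMax TTS Plus (Multilingual Enhanced Edition)

Enhanced TTS skill with multi-agent support and native voice messages across multiple channels.

Core Script

All operations go through tts-xiaoye.sh (TTS generation + channel delivery).

Quick Start

bash tts-xiaoye.sh "Text to convert to speech"

Multi-Channel Usage

| Channel | Command | Format | Notes |
| --- | --- | --- | --- |
| Telegram | tts-xiaoye.sh "Text" | MP3 | Sends the voice message directly, no transcoding |
| Feishu | tts-xiaoye.sh --feishu "Text" | OGG/Opus | Auto-transcodes and sends a native voice message |
| Generate only | tts-xiaoye.sh --generate-only "Text" | MP3 | Generates the file without sending |

Send Feishu Native Voice Message (Full Flow)

OPUS=$(bash tts-xiaoye.sh --feishu "Feishu voice content" 2>/dev/null | python3 -c "import sys,json; print(json.load(sys.stdin)['audio_file'])")
openclaw message send --channel feishu --account <YOUR_ACCOUNT_ID> --target <FeishuUserID> --media "$OPUS"

Multi-Agent Voice Configuration

Each agent can choose a different voice by writing it into the Voice Identity section of its own SOUL.md:

## Voice Identity
- TTS model: speech-2.8-hd
- TTS voice: Chinese (Mandarin)_Warm_Girl
- TTS script: scripts/tts-xiaoye.sh

Recommended voices (verified):

| Voice ID | Style | Use Case |
| --- | --- | --- |
| Chinese (Mandarin)_Warm_Girl | Warm Girl | Personal assistant |
| female-shaonv | Sweet Girl | Default / general |
| female-tianmei | Sweet Female | Gentle style |
| male-qn-qingse | Youthful Male | Male voice scenarios |
| Chinese (Mandarin)_Sweet_Lady | Sweet Lady | Formal occasions |

Available Models

| Model | Characteristic |
| --- | --- |
| speech-2.8-hd | Highest quality (recommended) |
| speech-2.8-turbo | Fast, slightly lower quality |

Full Parameters

tts-xiaoye.sh --text "Text" [--voice VoiceID] [--model Model] [--caption Caption]

Technical Notes

  • TTS outputs MP3 natively, and Telegram sends it directly (Bot API sendVoice supports MP3).
  • Feishu native voice messages require OGG/Opus format, converted with FFmpeg (about 25 ms per audio file, negligible).
  • FFmpeg installation:
    • macOS/Linuxbrew: brew install ffmpeg
    • Ubuntu/Debian: apt install ffmpeg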
