Coze Tts

v1.0.3

Text-to-Speech (TTS) using the Coze API. Converts text to natural-sounding speech audio files. Supports multiple voices and four output formats (mp3, ogg_opus, wav, pcm).

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for franklu0819-lang/coze-tts.

Install the skill "Coze Tts" (franklu0819-lang/coze-tts) from ClawHub.
Skill page: https://clawhub.ai/franklu0819-lang/coze-tts
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: COZE_API_KEY
Required binaries: jq
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install coze-tts

ClawHub CLI

npx clawhub@latest install coze-tts

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
Name/description align with the files and behavior: the script posts text to https://api.coze.cn/v1/audio/speech and saves audio. However there are minor inconsistencies: SKILL.md and references state default voice_id is 1, while the script sets VOICE_ID=6 and help text claims default 1. _meta.json version (1.0.2) differs from registry metadata (1.0.3). These look like packaging/documentation drift, not maliciousness.
Instruction Scope
SKILL.md instructs running the included shell script and only documents use of COZE_API_KEY and jq; the script's runtime actions are confined to building JSON, calling the documented Coze API endpoint, writing an audio file locally, and optionally using ffprobe. It does not attempt to read unrelated system files or other env vars.
Install Mechanism
This is an instruction-only skill with a shipped shell script and no install spec or remote downloads. Nothing is pulled from arbitrary URLs or executed during install.
Credentials
The only required env var is COZE_API_KEY, which is appropriate for calling the Coze service. One minor proportionality issue: the required-binaries list names only jq, but the script also uses common utilities (curl, md5sum, stat, bc, date, and optionally ffprobe). These are typical but should be documented explicitly.
Persistence & Privilege
The skill does not request elevated or persistent platform privileges (always:false). It does not modify other skills or system-wide settings.
Assessment
This skill is coherent with its TTS purpose, but review the following before installing:

1. Confirm the API endpoint (https://api.coze.cn) and that you trust Coze with your API key; the script sends the text you provide to that external service.
2. Note the mismatch between the documented default voice (1) and the script's VOICE_ID=6; test and adjust the default if needed.
3. The version in _meta.json differs from the registry metadata; this is likely packaging drift but worth noting.
4. The skill declares jq as required, but the script also expects curl, md5sum, stat, and bc (and optionally ffprobe); ensure those tools exist on your system.
5. Limit the scope of the COZE_API_KEY (use least privilege and an appropriate plan) and do not expose it publicly.

If any of these points worry you, or you need the script to behave differently, inspect or modify the script locally before use.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

Bins: jq
Env: COZE_API_KEY
Version: v1.0.3 (latest)
License: MIT-0

Coze Text-to-Speech (TTS)

Convert text to natural-sounding speech using the Coze API.

Setup

1. Get your API key from the Coze Platform.

2. Set it in your environment:

export COZE_API_KEY="your-key-here"
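
Before running the script, it can help to confirm that the variable is actually set in the current shell. The following is a minimal sketch; check_coze_key is a hypothetical helper name and is not part of the shipped skill.

```shell
# Hypothetical helper (not part of the skill): verify that COZE_API_KEY
# is non-empty in the current shell without printing the key itself.
check_coze_key() {
    if [ -z "${COZE_API_KEY:-}" ]; then
        echo "COZE_API_KEY is not set" >&2
        return 1
    fi
    # Report only the length so the secret never reaches the terminal or logs.
    echo "COZE_API_KEY is set (${#COZE_API_KEY} characters)"
}
```

Calling check_coze_key right after the export line gives a quick confirmation without echoing the secret.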

Supported Output Formats

  • MP3 - Default format, widely compatible
  • OGG_OPUS - Optimized for streaming and messaging
  • WAV - Uncompressed audio
  • PCM - Raw audio data

Usage

Basic TTS

Convert text to speech with default settings:

bash scripts/text_to_speech.sh "你好,这是测试语音"  # text: "Hello, this is a test voice message"

Save to Specific File

bash scripts/text_to_speech.sh "你好世界" -o output.mp3  # text: "Hello, world"

Use Different Voice

bash scripts/text_to_speech.sh "你好" -v 2  # text: "Hello"

Change Output Format

bash scripts/text_to_speech.sh "你好" -f ogg_opus  # text: "Hello"

Full Options

bash scripts/text_to_speech.sh "要转换的文本" -o output.mp3 -v 1 -f mp3  # text: "Text to convert"

Parameters:

  • text (required): Text to convert to speech
  • -o, --output (optional): Output file path (default: auto-generated)
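  • -v, --voice (optional): Voice ID (documented default: 1; the security review above reports that the shipped script sets VOICE_ID=6, so pass -v explicitly if the voice matters)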
  • -f, --format (optional): Output format - mp3/ogg_opus/wav/pcm (default: mp3)
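
To make the option handling concrete, here is a minimal sketch of how a wrapper might parse the same flags. It is an illustration only: parse_tts_args and the TTS_* variable names are hypothetical, and the shipped script's own parsing may differ.

```shell
# Hypothetical parser mirroring the documented flags; not the shipped script.
# Results are stored in TTS_TEXT, TTS_OUTPUT, TTS_VOICE and TTS_FORMAT.
parse_tts_args() {
    TTS_TEXT="" TTS_OUTPUT="" TTS_VOICE="1" TTS_FORMAT="mp3"
    while [ $# -gt 0 ]; do
        case "$1" in
            -o|--output|-v|--voice|-f|--format)
                # Every flag takes a value; fail instead of looping forever.
                [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
                case "$1" in
                    -o|--output) TTS_OUTPUT="$2" ;;
                    -v|--voice)  TTS_VOICE="$2" ;;
                    -f|--format) TTS_FORMAT="$2" ;;
                esac
                shift 2 ;;
            *)  TTS_TEXT="$1"; shift ;;
        esac
    done
    # Accept only the four documented formats.
    case "$TTS_FORMAT" in
        mp3|ogg_opus|wav|pcm) ;;
        *) echo "unsupported format: $TTS_FORMAT" >&2; return 1 ;;
    esac
    [ -n "$TTS_TEXT" ] || { echo "text is required" >&2; return 1; }
}
```

Rejecting unknown formats before any request is made keeps a typo such as -f mp4 from turning into a remote API error.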

Output

The script saves the audio file and outputs:

  • File path
  • File size
  • Audio duration (if ffprobe is available)

Example output:

✓ Audio saved: coze_tts_20260324_235030_a1b2c3d4.mp3
  Size: 25.3 KB
  Duration: ~3 seconds
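
The size and duration lines can be reproduced for any audio file. The sketch below is an assumption about how such a report could be produced, not the script's actual code; describe_audio is a hypothetical name, and the duration line is skipped when ffprobe is missing or cannot read the file (for example, raw PCM).

```shell
# Hypothetical helper: print the size and, when possible, the duration of a file.
describe_audio() {
    local file="$1" bytes secs
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    bytes=$(wc -c < "$file" | tr -d ' ')
    echo "Size: $(awk -v b="$bytes" 'BEGIN { printf "%.1f KB", b / 1024 }')"
    if command -v ffprobe >/dev/null 2>&1; then
        # Ask ffprobe for the container duration in seconds, with no decoration.
        secs=$(ffprobe -v error -show_entries format=duration \
            -of default=noprint_wrappers=1:nokey=1 "$file" 2>/dev/null)
        if [ -n "$secs" ]; then
            echo "Duration: ~${secs} seconds"
        fi
    fi
    return 0
}
```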

Workflow Examples

Generate Notification Audio

bash scripts/text_to_speech.sh "您有一条新消息" -o notification.mp3  # text: "You have a new message"

Create Voice Greeting

bash scripts/text_to_speech.sh "欢迎使用 Coze 语音服务" -v 2 -o greeting.mp3  # text: "Welcome to the Coze voice service"

Generate OGG for Messaging

bash scripts/text_to_speech.sh "你好" -f ogg_opus -o message.ogg  # text: "Hello"

Batch Generate

# Phrases: "Hello", "Thank you", "Goodbye"; each is saved under its own name.
for text in "你好" "谢谢" "再见"; do
    bash scripts/text_to_speech.sh "$text" -o "${text}.mp3"
done
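
For longer lists, reading phrases from a file and numbering the outputs avoids using the phrase itself as a file name. This is a sketch under assumptions: batch_tts and the TTS_SCRIPT override are hypothetical, and the script is assumed to exit non-zero on failure, which the documentation does not confirm.

```shell
# Hypothetical batch helper: one phrase per line in LIST, outputs in OUTDIR.
# TTS_SCRIPT is an illustrative override that allows dry runs with a stub.
batch_tts() {
    local list="$1" outdir="$2" n=0 line
    mkdir -p "$outdir" || return 1
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue          # skip blank lines
        n=$((n + 1))
        # </dev/null keeps the child process from consuming the phrase list.
        bash "${TTS_SCRIPT:-scripts/text_to_speech.sh}" "$line" \
            -o "$outdir/$(printf '%03d' "$n").mp3" </dev/null || return 1
    done < "$list"
    echo "Generated $n file(s)"
}
```

A call such as batch_tts phrases.txt out would write out/001.mp3, out/002.mp3, and so on, stopping at the first failure.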

Integration with Other Skills

Combine with coze-asr for voice conversation:

# 1. User speaks -> ASR converts to text
bash coze-asr/scripts/speech_to_text.sh input.ogg

# 2. Process text with AI...

# 3. AI response -> TTS converts to speech
bash coze-tts/scripts/text_to_speech.sh "AI的回复" -o response.mp3  # text: "The AI's reply"

Troubleshooting

Authentication Error:

  • Check COZE_API_KEY is set correctly
  • Verify API key has TTS permissions

Invalid Voice ID:

  • Voice ID should be a number (int64 format)
  • Try voice ID 1 (-v 1), the documented default

File Not Created:

  • Check write permissions in output directory
  • Ensure sufficient disk space
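
Two of these checks (a numeric voice ID and a writable output directory) are easy to automate before calling the API. The following is a minimal sketch; tts_preflight is a hypothetical helper and not part of the skill.

```shell
# Hypothetical preflight: validate the voice ID and the output location.
tts_preflight() {
    local voice="$1" outfile="$2" outdir
    # The voice ID must consist of digits only.
    case "$voice" in
        ''|*[!0-9]*) echo "voice ID must be numeric: '$voice'" >&2; return 1 ;;
    esac
    outdir=$(dirname -- "$outfile")
    if [ ! -d "$outdir" ] || [ ! -w "$outdir" ]; then
        echo "cannot write to directory: $outdir" >&2
        return 1
    fi
    echo "ok"
}
```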

Limitations

  • Text length limits apply (check Coze documentation)
  • Rate limits may apply based on your plan
  • Some voices may not support all output formats
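
When rate limits are the problem, retrying with an increasing delay is a common workaround. The sketch below is generic and rests on an assumption the documentation does not confirm, namely that the script exits non-zero when a request fails; retry and RETRY_DELAY are hypothetical names.

```shell
# Hypothetical retry wrapper: run a command up to ATTEMPTS times,
# doubling the delay (RETRY_DELAY seconds, default 2) after each failure.
retry() {
    local attempts="$1" delay="${RETRY_DELAY:-2}" i=1
    shift
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        i=$((i + 1))
    done
}
```

For example, retry 3 bash scripts/text_to_speech.sh "你好" -o hello.mp3 would try at most three times, waiting 2 and then 4 seconds between attempts.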

API Reference

  • Endpoint: POST https://api.coze.cn/v1/audio/speech
  • Authentication: Bearer token (COZE_API_KEY)
  • Content-Type: application/json
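
Put together, a direct request could look like the sketch below. The endpoint, bearer authentication, and content type come from the reference above; the JSON field names input and response_format, and sending voice_id as a string, are assumptions that should be checked against the Coze API documentation. Both function names are hypothetical.

```shell
# Hypothetical helpers; field names other than voice_id are assumptions.
build_speech_payload() {
    # jq handles JSON escaping of quotes and non-ASCII text.
    jq -n --arg input "$1" --arg voice "$2" --arg fmt "$3" \
        '{input: $input, voice_id: $voice, response_format: $fmt}'
}

request_speech() {
    local text="$1" voice="$2" fmt="$3" out="$4"
    build_speech_payload "$text" "$voice" "$fmt" |
        curl --fail --silent --show-error \
            -X POST "https://api.coze.cn/v1/audio/speech" \
            -H "Authorization: Bearer ${COZE_API_KEY}" \
            -H "Content-Type: application/json" \
            --data-binary @- \
            --output "$out"
}
```

With --fail, curl returns a non-zero status on HTTP errors, so a call such as request_speech "你好" 1 mp3 hello.mp3 can be checked with an ordinary if statement.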

Required Environment Variables

Variable        Description                    Required
COZE_API_KEY    Coze API authentication key    Yes

Required Tools

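Tool       Purpose                     Required
jq         JSON processing             Yes
ffprobe    Audio duration detection    Optional

The security review above reports that the script also calls curl, md5sum, stat, bc, and date, which are not declared here.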
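
A quick way to see which tools are missing is to probe for each one with command -v. This is a minimal sketch; missing_tools is a hypothetical helper, and the tool list in the usage note is taken from the security review earlier on this page rather than from the skill's own metadata.

```shell
# Hypothetical helper: print the names of any tools not found on PATH.
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Strip the leading space; the output is empty when nothing is missing.
    echo "${missing# }"
}
```

Running missing_tools jq curl md5sum stat bc date prints nothing when everything is installed, and a space-separated list of names otherwise.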

License

MIT
