WaveSpeedAI MiniMax Speech 2.6 TTS

v1.0.0

Convert text to speech using MiniMax Speech 2.6 Turbo via WaveSpeed AI. Features ultra-human voice cloning, sub-250ms latency, 40+ languages, emotion control...

Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name/description (text-to-speech via MiniMax Speech 2.6) matches the SKILL.md instructions: example code calls wavespeed.run, parameters are TTS-related, and there is no unrelated functionality. However the SKILL.md references an external 'wavespeed' client library (npm) which is not declared in the registry metadata; otherwise capability is coherent.
Instruction Scope
Runtime instructions stay within TTS scope: they show how to set an API key, call the WaveSpeed model, configure voice/emotion/format, and handle errors. The instructions do not ask the agent to read unrelated local files or system secrets beyond the API key, nor do they instruct exfiltration to other endpoints. They do call external network services (the WaveSpeed API), which is expected for a TTS integration.
Install Mechanism
This is an instruction-only skill with no install spec and no code files (low install surface). However, the examples import a 'wavespeed' npm package but the skill does not declare that dependency or provide installation instructions — the author may have assumed the package exists. The lack of a declared install/source for that package and unknown source/homepage increases risk: verify the package origin before running code that imports it.
Credentials
SKILL.md explicitly instructs users to set WAVESPEED_API_KEY, but the registry metadata lists no required environment variables and no primary credential. This mismatch means the skill silently requires a secret to function even though metadata doesn't declare it. The requested secret (an API key for the external TTS service) is plausible for the skill's purpose, but the missing declaration and unknown publisher/homepage are concerning and should be corrected or explained.
Persistence & Privilege
The skill does not request always:true, does not modify system or other skills, and is instruction-only so it does not install persistent components. Default autonomous invocation is allowed (normal for skills) but there are no indications of privileged persistence.
What to consider before installing
Before installing or using this skill:
(1) Confirm the publisher and the legitimacy of wavespeed.ai (or the package author); the registry metadata has no homepage/source.
(2) Ask the publisher to update the metadata to declare the required WAVESPEED_API_KEY and any package dependencies, so you know which secrets and dependencies are needed.
(3) Inspect or verify the 'wavespeed' npm package (or other client) on the official registry, and prefer installing from the official package name and a trusted source.
(4) If you must provide an API key, consider creating a limited-scope/test key rather than a long-lived production key, and avoid sending sensitive PII through the service.
(5) Run in an isolated environment or sandbox first and monitor network calls.
(6) If the publisher cannot explain the metadata mismatch or provide a trustworthy source, treat the skill as untrusted.


latest vk97cg7br4rqcerw8vxssnq7gj58261t9
292 downloads
0 stars
1 version
Updated 1mo ago
v1.0.0
MIT-0

WaveSpeedAI MiniMax Speech 2.6 Turbo

Convert text to speech using MiniMax Speech 2.6 Turbo via the WaveSpeed AI platform. Features ultra-human voice cloning, sub-250ms latency, 40+ language support, and emotion control.

Authentication

export WAVESPEED_API_KEY="your-api-key"

Get your API key at wavespeed.ai/accesskey.
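Before making any request, it can help to fail fast when the key is missing. The following is a minimal sketch; `requireApiKey` is a hypothetical helper and not part of the wavespeed package. It assumes only that the key lives in the WAVESPEED_API_KEY environment variable, as described above.

```javascript
// Hypothetical helper (not part of the wavespeed package).
// Reads WAVESPEED_API_KEY from the given environment object and
// throws a descriptive error if it is missing or blank.
function requireApiKey(env = process.env) {
  const key = env.WAVESPEED_API_KEY;
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error(
      "WAVESPEED_API_KEY is not set; export it before calling the API."
    );
  }
  return key.trim();
}

// Typical use at startup:
//   const apiKey = requireApiKey();
```

Passing the environment as an argument keeps the check easy to test without touching the real process environment.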

Quick Start

import wavespeed from 'wavespeed';

const output_url = (await wavespeed.run(
  "minimax/speech-2.6-turbo",
  {
    text: "Hello, welcome to WaveSpeed AI!",
    voice_id: "English_CalmWoman"
  }
))["outputs"][0];

API Endpoint

Model ID: minimax/speech-2.6-turbo

Convert text to speech with configurable voice, emotion, speed, pitch, and audio format.

Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | Yes | -- | Text to convert to speech. Max 10,000 characters. Use <#x#> between words to insert pauses (0.01-99.99 seconds). |
| voice_id | string | Yes | -- | Voice preset ID. See Voice IDs below. |
| speed | number | No | 1 | Speech speed. Range: 0.50-2.00 |
| volume | number | No | 1 | Speech volume. Range: 0.10-10.00 |
| pitch | number | No | 0 | Speech pitch. Range: -12 to 12 |
| emotion | string | No | happy | Emotional tone. One of: happy, sad, angry, fearful, disgusted, surprised, neutral |
| english_normalization | boolean | No | false | Improve English number reading normalization |
| sample_rate | integer | No | -- | Sample rate in Hz. One of: 8000, 16000, 22050, 24000, 32000, 44100 |
| bitrate | integer | No | -- | Bitrate in bps. One of: 32000, 64000, 128000, 256000 |
| channel | string | No | -- | Audio channels. 1 (mono) or 2 (stereo) |
| format | string | No | -- | Output format. One of: mp3, wav, pcm, flac |
| language_boost | string | No | -- | Enhance recognition for a specific language. See Supported Languages. |
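Because text is capped at 10,000 characters, longer input has to be split across several requests. The sketch below shows one possible approach; `splitText` is a hypothetical helper, not something the wavespeed package provides. It assumes the limit can be approximated by JavaScript string length (the service may count characters differently), and a hard cut could in principle land inside a <#x#> pause marker.

```javascript
const MAX_CHARS = 10000; // limit stated in the parameters table

// Hypothetical helper: splits text into chunks of at most maxChars,
// preferring to break after sentence-ending punctuation, then at a
// space, and only as a last resort in the middle of a word.
function splitText(text, maxChars = MAX_CHARS) {
  if (!Number.isInteger(maxChars) || maxChars < 1) {
    throw new RangeError("maxChars must be a positive integer");
  }
  const chunks = [];
  let rest = text.trim();
  while (rest.length > maxChars) {
    const head = rest.slice(0, maxChars);
    const sentenceEnd = Math.max(
      head.lastIndexOf(". "),
      head.lastIndexOf("! "),
      head.lastIndexOf("? ")
    );
    const lastSpace = head.lastIndexOf(" ");
    let cut;
    if (sentenceEnd > 0) {
      cut = sentenceEnd + 1; // keep the punctuation in this chunk
    } else if (lastSpace > 0) {
      cut = lastSpace;
    } else {
      cut = maxChars; // no natural break point: hard cut
    }
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Each chunk can then be sent as its own request with the same voice_id, and the resulting audio files concatenated afterwards.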

Example

import wavespeed from 'wavespeed';

const output_url = (await wavespeed.run(
  "minimax/speech-2.6-turbo",
  {
    text: "The quick brown fox jumps over the lazy dog.",
    voice_id: "English_expressive_narrator",
    speed: 1.0,
    pitch: 0,
    emotion: "neutral",
    format: "mp3",
    sample_rate: 24000,
    bitrate: 128000
  }
))["outputs"][0];

Pause Control

Insert pauses in speech using <#x#> syntax where x is seconds (0.01-99.99):

const output_url = (await wavespeed.run(
  "minimax/speech-2.6-turbo",
  {
    text: "And the winner is <#2.0#> WaveSpeed AI!",
    voice_id: "English_CaptivatingStoryteller"
  }
))["outputs"][0];
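When pauses are generated programmatically, a small helper can keep the markers within the documented range. This is a sketch only; `pause` and `joinWithPauses` are hypothetical names, and formatting the value with two decimals is an assumption based on the stated 0.01-99.99 range.

```javascript
// Hypothetical helper: builds a pause marker such as <#2.00#>.
// Rejects values outside the documented 0.01-99.99 second range.
function pause(seconds) {
  if (!Number.isFinite(seconds) || seconds < 0.01 || seconds > 99.99) {
    throw new RangeError("Pause must be between 0.01 and 99.99 seconds");
  }
  return `<#${seconds.toFixed(2)}#>`;
}

// Hypothetical helper: joins text segments with the same pause between
// each pair, keeping the marker between words as the docs describe.
function joinWithPauses(segments, seconds) {
  return segments.join(` ${pause(seconds)} `);
}

// joinWithPauses(["And the winner is", "WaveSpeed AI!"], 2)
// builds "And the winner is <#2.00#> WaveSpeed AI!"
```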

Advanced Usage

Sync Mode

const output_url = (await wavespeed.run(
  "minimax/speech-2.6-turbo",
  {
    text: "Hello world!",
    voice_id: "English_CalmWoman"
  },
  { enableSyncMode: true }
))["outputs"][0];

Custom Client with Retry Configuration

import { Client } from 'wavespeed';

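// Read the key from the environment rather than hardcoding it (see Security Constraints).
const client = new Client(process.env.WAVESPEED_API_KEY, {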
  maxRetries: 2,
  maxConnectionRetries: 5,
  retryInterval: 1.0,
});

const output_url = (await client.run(
  "minimax/speech-2.6-turbo",
  {
    text: "Welcome to our platform.",
    voice_id: "English_Trustworth_Man"
  }
))["outputs"][0];

Error Handling with runNoThrow

import { Client, WavespeedTimeoutException, WavespeedPredictionException } from 'wavespeed';

const client = new Client();
const result = await client.runNoThrow(
  "minimax/speech-2.6-turbo",
  {
    text: "Testing speech generation.",
    voice_id: "English_CalmWoman"
  }
);

if (result.outputs) {
  console.log("Audio URL:", result.outputs[0]);
  console.log("Task ID:", result.detail.taskId);
} else {
  console.log("Failed:", result.detail.error.message);
  if (result.detail.error instanceof WavespeedTimeoutException) {
    console.log("Request timed out - try increasing timeout");
  } else if (result.detail.error instanceof WavespeedPredictionException) {
    console.log("Prediction failed");
  }
}

Voice IDs

English Voices (Popular)

| Voice ID | Description |
|---|---|
| English_CalmWoman | Calm female voice |
| English_Trustworth_Man | Trustworthy male voice |
| English_expressive_narrator | Expressive narrator |
| English_radiant_girl | Radiant girl voice |
| English_magnetic_voiced_man | Magnetic male voice |
| English_CaptivatingStoryteller | Storyteller voice |
| English_Upbeat_Woman | Upbeat female voice |
| English_GentleTeacher | Gentle teacher voice |
| English_PlayfulGirl | Playful girl voice |
| English_ManWithDeepVoice | Deep male voice |
| English_ConfidentWoman | Confident female voice |
| English_Comedian | Comedic voice |
| English_SereneWoman | Serene female voice |
| English_WiseScholar | Scholarly voice |
| English_Cute_Girl | Cute girl voice |
| English_Sharp_Commentator | Sharp commentator |
| English_Lucky_Robot | Robot voice |

General Voices

Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl

Special Voices

whisper_man, whisper_woman_1, angry_pirate_1, massive_kind_troll, movie_trailer_deep, peace_and_ease

Other Languages

Voices are available for: Chinese (Mandarin), Cantonese, Arabic, Russian, Spanish, French, Portuguese, German, Turkish, Dutch, Ukrainian, Vietnamese, Indonesian, Japanese, Italian, Korean, Thai, Polish, Romanian, Greek, Czech, Finnish, Hindi, Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, Swedish, Croatian, Filipino, Hungarian, Norwegian, Slovenian, Catalan, Nynorsk, Tamil, Afrikaans.

Voice IDs follow the pattern {Language}_{VoiceName} (e.g., Japanese_KindLady, Korean_SweetGirl, French_CasualMan).

Supported Languages

Values accepted by language_boost (note that Chinese,Yue is a single value that itself contains a comma): Chinese, Chinese,Yue, English, Arabic, Russian, Spanish, French, Portuguese, German, Turkish, Dutch, Ukrainian, Vietnamese, Indonesian, Japanese, Italian, Korean, Thai, Polish, Romanian, Greek, Czech, Finnish, Hindi, Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, Swedish, Croatian, Filipino, Hungarian, Norwegian, Slovenian, Catalan, Nynorsk, Tamil, Afrikaans

Pricing

$0.06 per 1,000 characters.
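At this rate, a 2,500-character script would cost roughly 2.5 × $0.06 = $0.15. A rough estimator might look like the sketch below; `estimateCostUsd` is a hypothetical helper, and it assumes billing is proportional to JavaScript string length. How the service counts pause markers or rounds partial thousands is not stated here, so treat the result as an approximation.

```javascript
const USD_PER_1000_CHARS = 0.06; // rate from the Pricing section

// Hypothetical helper: approximate cost in USD for a piece of text,
// assuming cost scales linearly with the number of characters.
function estimateCostUsd(text) {
  return (text.length / 1000) * USD_PER_1000_CHARS;
}

// Example: format an estimate for display.
function formatEstimate(text) {
  return `$${estimateCostUsd(text).toFixed(4)}`;
}
```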

Security Constraints

  • API key security: Store your WAVESPEED_API_KEY securely. Do not hardcode it in source files or commit it to version control. Use environment variables or secret management systems.
  • Input validation: Only pass parameters documented above. Validate text content before sending requests.
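The input-validation advice above could be implemented with a small pre-flight check. The sketch below is one possible shape; `validateParams` is a hypothetical helper, and its rules are copied from the Parameters table rather than from the service itself, so the API remains the final authority on what it accepts.

```javascript
// Allowed values and ranges, taken from the Parameters table.
const KNOWN_PARAMS = new Set([
  "text", "voice_id", "speed", "volume", "pitch", "emotion",
  "english_normalization", "sample_rate", "bitrate", "channel",
  "format", "language_boost",
]);
const RANGES = { speed: [0.5, 2], volume: [0.1, 10], pitch: [-12, 12] };
const ENUMS = {
  emotion: ["happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"],
  sample_rate: [8000, 16000, 22050, 24000, 32000, 44100],
  bitrate: [32000, 64000, 128000, 256000],
  format: ["mp3", "wav", "pcm", "flac"],
};

// Hypothetical helper: returns a list of problems (empty when valid).
function validateParams(params) {
  const errors = [];
  for (const key of Object.keys(params)) {
    if (!KNOWN_PARAMS.has(key)) errors.push(`unknown parameter: ${key}`);
  }
  if (typeof params.text !== "string" || params.text.length === 0) {
    errors.push("text is required");
  } else if (params.text.length > 10000) {
    errors.push("text exceeds 10,000 characters");
  }
  if (typeof params.voice_id !== "string" || params.voice_id === "") {
    errors.push("voice_id is required");
  }
  for (const [key, [min, max]] of Object.entries(RANGES)) {
    const value = params[key];
    if (value !== undefined && (!Number.isFinite(value) || value < min || value > max)) {
      errors.push(`${key} must be between ${min} and ${max}`);
    }
  }
  for (const [key, allowed] of Object.entries(ENUMS)) {
    const value = params[key];
    if (value !== undefined && !allowed.includes(value)) {
      errors.push(`${key} must be one of: ${allowed.join(", ")}`);
    }
  }
  return errors;
}
```

Returning a list of messages instead of throwing on the first problem lets a caller report every issue at once before any request is sent.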
