Fish Audio Speech

Fish Audio speech provider for OpenClaw with high-quality TTS, voice cloning, configurable voices, and voice-note friendly output for Telegram and WhatsApp.

Install

openclaw plugins install clawhub:@conan-scott/openclaw-fish-audio

Fish Audio Speech — OpenClaw Plugin

Fish Audio TTS plugin for OpenClaw, with high-quality voice cloning, Telegram/WhatsApp voice replies, and access to 1M+ voices via Fish Audio's voice library. Supports S2-Pro and S1 models.

Features

  • Voice cloning — use any Fish Audio voice (your own clones or community voices)
  • S2-Pro & S1 models — latest Fish Audio TTS models
  • Format-aware output — opus for voice notes (Telegram, WhatsApp), mp3 otherwise
  • Inline directives — control voice, speed, model, latency, and sampling per-message
  • Bundled agent skill — teaches agents to write Fish-friendly voice text and expressive markers
  • Voice listing — browse your cloned voices and popular community voices via /voice list
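The format-aware output rule can be pictured as a small lookup. This is a minimal Python sketch, not the plugin's actual code: the function name `pick_audio_format` and the channel identifiers are assumptions made for illustration.

```python
# Hypothetical sketch: the channel names and function name are illustrative,
# not taken from the plugin source.
VOICE_NOTE_CHANNELS = {"telegram", "whatsapp"}


def pick_audio_format(channel: str) -> str:
    """Return "opus" for voice-note channels and "mp3" for everything else."""
    normalized = channel.strip().lower()
    return "opus" if normalized in VOICE_NOTE_CHANNELS else "mp3"
```

Keeping the voice-note channels in one set means a new channel only needs one line added if it also expects opus.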

Installation

openclaw plugins install @conan-scott/openclaw-fish-audio

Then restart OpenClaw.

Getting an API Key

  1. Sign up at fish.audio
  2. Go to Account → API Keys → Create API Key
  3. Create a revocable key with the minimum access you need
  4. Copy the key for configuration below

Configuration

Prefer setting the API key as an environment variable or secret:

FISH_AUDIO_API_KEY=your-fish-audio-api-key

Then add the provider configuration to your openclaw.json:

{
  messages: {
    tts: {
      provider: "fish-audio",
      providers: {
        "fish-audio": {
          voiceId: "reference-id-of-your-voice",
          model: "s2-pro",       // s2-pro (default) | s1
          latency: "normal",     // normal (default) | balanced | low
          // speed: 1.0,         // 0.5–2.0 (optional)
          // temperature: 0.7,   // 0–1 (optional)
          // topP: 0.8,          // 0–1 (optional)
        },
      },
    },
  },
}
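Before restarting OpenClaw, it can help to sanity-check the values against the ranges noted in the comments above. The following Python sketch is an assumption about how such a check might look; `validate_fish_config` is a hypothetical helper, and the plugin may validate differently or not at all.

```python
# Hypothetical validator. Allowed values and ranges are copied from the
# comments in the example config; the function itself is illustrative.
ALLOWED_MODELS = {"s2-pro", "s1"}
ALLOWED_LATENCY = {"normal", "balanced", "low"}
RANGES = {"speed": (0.5, 2.0), "temperature": (0.0, 1.0), "topP": (0.0, 1.0)}


def validate_fish_config(cfg: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if not cfg.get("voiceId"):
        errors.append("voiceId is required (there is no universal default voice)")
    if cfg.get("model", "s2-pro") not in ALLOWED_MODELS:
        errors.append("model must be one of: s2-pro, s1")
    if cfg.get("latency", "normal") not in ALLOWED_LATENCY:
        errors.append("latency must be one of: normal, balanced, low")
    for key, (low, high) in RANGES.items():
        if key not in cfg:
            continue  # optional settings may be omitted
        value = cfg[key]
        if not isinstance(value, (int, float)) or not low <= value <= high:
            errors.append(f"{key} must be a number between {low} and {high}")
    return errors
```

For example, `validate_fish_config({"voiceId": "abc", "speed": 3.0})` reports only the out-of-range speed.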

You can also set apiKey directly under messages.tts.providers.fish-audio, but secret-backed configuration is safer for shared systems and published examples.

Only set baseUrl for a Fish Audio-compatible endpoint you trust. The plugin sends the Fish Audio API key to that endpoint; custom URLs must use HTTPS except for localhost development.
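The HTTPS-except-localhost rule can be expressed as a short check. This Python sketch only illustrates the rule as stated above; the function name and the exact set of hosts treated as local are assumptions, and the plugin's own check may differ.

```python
from urllib.parse import urlparse

# Assumption: these hosts count as "localhost development".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_acceptable_base_url(url: str) -> bool:
    """Accept HTTPS URLs anywhere, and plain HTTP only for local hosts."""
    parsed = urlparse(url)
    if parsed.scheme == "https":
        return bool(parsed.hostname)
    if parsed.scheme == "http":
        return parsed.hostname in LOCAL_HOSTS
    return False
```

Rejecting plain HTTP for remote hosts matters because the API key travels with every request to that endpoint.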

Finding a Voice

Use the /voice list command in OpenClaw to browse available voices. The plugin shows:

  1. Your cloned/trained voices (all pages, via self=true)
  2. Popular community voices (top-ranked by score) as a fallback for new users
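The two-step listing above (own voices across all pages, then a popular fallback) might be sketched as follows in Python. Only `self=true` comes from this README; the `page_number`, `page_size`, and `sort_by` parameters and the `items`/`total` response fields are assumptions, and `fetch_page` is a hypothetical callable standing in for the HTTP request.

```python
from typing import Callable


def list_voices(fetch_page: Callable[[dict], dict], page_size: int = 10) -> list[dict]:
    """Collect the user's own voices page by page; fall back to popular ones."""
    own: list[dict] = []
    page = 1
    while True:
        # Assumed query parameters, apart from self=true.
        result = fetch_page({"self": "true", "page_number": page, "page_size": page_size})
        items = result.get("items", [])
        own.extend(items)
        # Stop once everything is collected or the server returns an empty page.
        if not items or len(own) >= result.get("total", 0):
            break
        page += 1
    if own:
        return own
    # New users with no voices: show one page of top-ranked community voices.
    popular = fetch_page({"sort_by": "score", "page_number": 1, "page_size": page_size})
    return popular.get("items", [])
```

Passing the fetcher in as an argument keeps the pagination logic testable without network access.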

You can also browse voices at fish.audio and copy the voice ID from the URL.

Use cloned, trained, or community voices only when you have the rights, consent, and authorization to use that voice.

Inline Directives

All directive keys are provider-prefixed to avoid collisions with other speech providers. Both fishaudio_* and shorter fish_* aliases work.

[[tts:fishaudio_voice=<ref_id>]]         Switch voice
[[tts:fishaudio_speed=1.2]]              Prosody speed (0.5–2.0)
[[tts:fishaudio_model=s1]]               Model override
[[tts:fishaudio_latency=low]]            Latency mode
[[tts:fishaudio_temperature=0.7]]        Sampling temperature (0–1)
[[tts:fishaudio_top_p=0.8]]              Top-p sampling (0–1)

Short aliases: fish_voice, fish_speed, fish_model, fish_latency, fish_temperature, fish_top_p.
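To make the directive syntax concrete, here is a minimal Python sketch that pulls directives out of a message and normalizes both prefixes to the same key. It illustrates the syntax only and is not how OpenClaw parses directives; `extract_directives` is a hypothetical name.

```python
import re

# Matches [[tts:fishaudio_<key>=<value>]] and the shorter fish_<key> alias.
DIRECTIVE_RE = re.compile(
    r"\[\[tts:fish(?:audio)?_"
    r"(voice|speed|model|latency|temperature|top_p)"
    r"=([^\]\s]+)\]\]"
)


def extract_directives(text: str) -> tuple[str, dict[str, str]]:
    """Return (text without directives, {key: value}); later values win."""
    found: dict[str, str] = {}

    def _collect(match: re.Match) -> str:
        found[match.group(1)] = match.group(2)
        return ""  # drop the directive from the spoken text

    cleaned = DIRECTIVE_RE.sub(_collect, text)
    # Collapse the whitespace left behind by removed directives.
    return " ".join(cleaned.split()), found
```

Calling `extract_directives("[[tts:fish_speed=1.2]] Hello there")` yields the plain text `"Hello there"` and the mapping `{"speed": "1.2"}`. Values stay as strings here; range checks would be a separate step.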

Expressive Markers

Fish Audio understands natural expressive markers in the text itself, such as (laughs) or (sighs). OpenClaw does not parse or transform these markers; the plugin passes text verbatim to Fish Audio's /v1/tts API. Round-bracket markers are confirmed working. Square-bracket marker syntax is unverified.

For agent-authored voice messages, avoid Markdown stage directions such as *laughs*; some TTS paths may read the asterisks literally. This package includes a fish-audio-tts AgentSkill so OpenClaw agents can learn the preferred plain-text style automatically.
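If agent output still contains Markdown-style stage directions, one option is to rewrite them into the round-bracket form before the text reaches the provider. The Python sketch below is a heuristic and an assumption, not part of the plugin: it also converts single-asterisk emphasis such as `*really*`, so it only suits text where asterisks are used for stage directions.

```python
import re

# Matches *words* delimited by single asterisks, but not **bold** or 2*3*4.
STAGE_DIRECTION_RE = re.compile(r"(?<![\w*])\*([A-Za-z][A-Za-z ]*?)\*(?![\w*])")


def to_round_bracket_markers(text: str) -> str:
    """Rewrite *laughs* as (laughs); leave everything else untouched."""
    return STAGE_DIRECTION_RE.sub(r"(\1)", text)
```

For instance, `to_round_bracket_markers("Well *laughs* okay")` returns `"Well (laughs) okay"`.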

Models

Model     Description
s2-pro    Latest high-quality model (default)
s1        Previous generation, lighter weight

Latency Modes

Mode        Description
normal      Best quality, higher latency (default)
balanced    Balance between quality and speed
low         Fastest response, may reduce quality

Troubleshooting

  • No voice configured: Set voiceId in config. Fish Audio has no universal default voice.
  • Empty voice list: New users with no cloned voices will see popular community voices as a starting point.
  • API key missing: Set either apiKey in config or the FISH_AUDIO_API_KEY environment variable.
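The API key lookup behind the last item can be sketched in Python as below. The precedence (config value first, then environment) is an assumption, since this README does not state which source wins; `resolve_api_key` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(provider_cfg: Mapping, env: Optional[Mapping] = None) -> str:
    """Return the API key from config or environment, or raise if neither is set."""
    env = os.environ if env is None else env
    # Assumed precedence: explicit apiKey in config, then FISH_AUDIO_API_KEY.
    key = provider_cfg.get("apiKey") or env.get("FISH_AUDIO_API_KEY")
    if not key:
        raise ValueError(
            "Fish Audio API key missing: set apiKey in config or FISH_AUDIO_API_KEY"
        )
    return key
```

Accepting the environment as a parameter keeps the helper easy to test without touching the real process environment.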

License

MIT