macOS Voice Messages

OpenClaw provider for macOS voice message support via voicecli

Install

openclaw plugins install clawhub:macvoice

openclaw-macvoice

OpenClaw plugin for voice message support using native macOS speech APIs via voicecli.

⚠️ macOS only — This plugin requires macOS 13.0+ and uses native Apple frameworks (SFSpeechRecognizer, AVSpeechSynthesizer).

Features

  • 🎙️ Transcribe voice messages to text
  • 🔊 Respond with voice — convert text responses to audio
  • 🏠 Native macOS — uses SFSpeechRecognizer and AVSpeechSynthesizer
  • Fast — no cloud API calls, all on-device

Prerequisites

  • macOS 13.0+ (required)
  • voicecli installed:
    brew tap acwilan/voicecli
    brew install voicecli
    
  • ffmpeg (for Telegram voice compatibility):
    brew install ffmpeg
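
The prerequisites above can be checked in one pass. The following is a minimal sketch, not part of the plugin: `preflight`, `version_ge`, and `minor_of` are hypothetical helper names, and it assumes the standard macOS `sw_vers -productVersion` command is available to report the OS version.

```shell
#!/bin/sh
# Preflight sketch for the prerequisites listed above.
# All function names here are illustrative, not part of voicecli or OpenClaw.

# minor_of VERSION: print the minor component of a dotted version (0 if absent).
minor_of() {
  case $1 in
    *.*) rest=${1#*.}; echo "${rest%%.*}" ;;
    *)   echo 0 ;;
  esac
}

# version_ge HAVE WANT: succeed when HAVE >= WANT, comparing major then minor.
version_ge() {
  have_major=${1%%.*}
  want_major=${2%%.*}
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
    return
  fi
  [ "$(minor_of "$1")" -ge "$(minor_of "$2")" ]
}

# preflight: report each prerequisite and return non-zero if any is missing.
preflight() {
  status=0
  if command -v sw_vers >/dev/null 2>&1; then
    os_version=$(sw_vers -productVersion)
    if version_ge "$os_version" 13.0; then
      echo "ok: macOS $os_version"
    else
      echo "missing: macOS 13.0+ (found $os_version)"
      status=1
    fi
  else
    echo "missing: macOS (sw_vers not found)"
    status=1
  fi
  for tool in voicecli ffmpeg; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      status=1
    fi
  done
  return $status
}
```

Source the file and run `preflight`; it prints one line per prerequisite and returns a non-zero status if anything is missing.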
    

First-Time Setup

Before using the plugin, you need to grant macOS permissions to voicecli:

# Generate a test audio file
voicecli speak "Hello world" --voice Samantha --output /tmp/test.aiff

# Transcribe it back (this triggers the speech recognition permission prompt)
voicecli transcribe /tmp/test.aiff

# Clean up
rm /tmp/test.aiff

You should see system permission dialogs for Microphone and Speech Recognition — click Allow for both.

Installation

Install from ClawHub:

openclaw plugins install macvoice

Or install from source:

openclaw plugins install /path/to/openclaw-macvoice

Then restart the OpenClaw gateway:

openclaw gateway restart

Configuration

Add to your ~/.openclaw/openclaw.json under messages.tts:

{
  "messages": {
    "tts": {
      "auto": "inbound",
      "provider": "macvoice",
      "providers": {
        "macvoice": {
          "voice": "Samantha",
          "rate": 0.5
        }
      }
    }
  }
}

Configuration Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| auto | string | "off" | When to use TTS: "off", "always", "inbound" (voice replies to voice messages), or "tagged" (only with [[tts]] tags) |
| provider | string | | Set to "macvoice" |
| providers.macvoice.voice | string | "Samantha" | Voice to use. Run voicecli voices to see available voices |
| providers.macvoice.rate | number | 0.5 | Speech rate (0.0-1.0). Lower is slower |
| providers.macvoice.tempDir | string | ~/tmp/openclaw-macvoice | Directory for temporary audio files |
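
The rate option is documented as a number between 0.0 and 1.0. As a small sketch, a value can be checked against that range before it is written into the config. `valid_rate` is a hypothetical helper, not something voicecli or OpenClaw provides, and how the plugin itself treats out-of-range values is not covered here.

```shell
#!/bin/sh
# valid_rate VALUE: succeed when VALUE is a plain decimal number in [0.0, 1.0].
# Hypothetical helper for illustration only.
valid_rate() {
  awk -v r="$1" 'BEGIN {
    # Accept digits with an optional decimal point, then range-check.
    ok = (r ~ /^[0-9]*\.?[0-9]+$/) && (r + 0 >= 0) && (r + 0 <= 1)
    exit !ok
  }'
}
```

For example, `valid_rate 0.35 && echo "rate ok"` prints `rate ok`, while `valid_rate 1.5` fails silently with a non-zero status.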

Available Voices

To list available voices:

voicecli voices

Common voices include:

  • Samantha (default, US English)
  • Alex (US English)
  • Karen (Australian English)
  • Daniel (British English)
  • Moira (Irish English)
  • Tessa (South African English)
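
To compare voices by ear before choosing one, a short sample can be rendered for each. This is a sketch that relies only on the `voicecli speak` flags shown in First-Time Setup (`--voice` and `--output`); `sample_voices` is a hypothetical helper name, and which voices are actually installed depends on your system.

```shell
#!/bin/sh
# sample_voices OUT_DIR VOICE...: write one short .aiff sample per voice.
# Hypothetical helper; uses only `voicecli speak --voice ... --output ...`.
sample_voices() {
  out_dir=$1
  shift
  mkdir -p "$out_dir" || return 1
  for voice in "$@"; do
    # A voice that is not installed is reported and skipped, not fatal.
    voicecli speak "Hello, this is $voice." \
      --voice "$voice" --output "$out_dir/$voice.aiff" \
      || echo "skipped $voice" >&2
  done
}
```

Usage: `sample_voices /tmp/voice-samples Samantha Alex Karen Daniel`, then play a file with the built-in macOS player, for example `afplay /tmp/voice-samples/Karen.aiff`.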

Transcription (STT) Configuration

To enable automatic transcription of voice messages, add to your ~/.openclaw/openclaw.json under tools.media.audio:

{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "provider": "macvoice",
            "model": "default"
          }
        ]
      }
    }
  }
}

Note: The model value is only a label and can be anything (e.g., "default", "macvoice-transcribe", "local"). The "provider": "macvoice" entry is what routes the request to this plugin.

With this configuration, voice messages sent to OpenClaw will be automatically transcribed using macOS native speech recognition.

Complete Configuration (TTS + STT)

For both text-to-speech and speech-to-text:

{
  "messages": {
    "tts": {
      "auto": "inbound",
      "provider": "macvoice",
      "providers": {
        "macvoice": {
          "voice": "Samantha",
          "rate": 0.5
        }
      }
    }
  },
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "provider": "macvoice",
            "model": "default"
          }
        ]
      }
    }
  }
}

Note: The model value under tools.media.audio.models is just a label — use any value you prefer.
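
After editing, the file can be sanity-checked for the keys described above. This is a sketch under two assumptions: that `jq` is installed (for example via `brew install jq`) and that your openclaw.json is strict JSON without comments or trailing commas. `check_macvoice_config` is a hypothetical helper name.

```shell
#!/bin/sh
# check_macvoice_config [FILE]: verify that macvoice is selected for TTS and STT.
# Hypothetical helper; assumes jq and a strict-JSON config file.
check_macvoice_config() {
  config=${1:-"$HOME/.openclaw/openclaw.json"}

  # jq -e exits non-zero when the result is false or null,
  # and also when the file cannot be parsed.
  jq -e '.messages.tts.provider == "macvoice"' "$config" >/dev/null || {
    echo "TTS: messages.tts.provider is not \"macvoice\"" >&2
    return 1
  }

  jq -e '.tools.media.audio.enabled == true
         and any(.tools.media.audio.models[]?; .provider == "macvoice")' \
    "$config" >/dev/null || {
    echo "STT: tools.media.audio is disabled or has no macvoice model" >&2
    return 1
  }

  echo "macvoice is configured for TTS and STT"
}
```

Run `check_macvoice_config` with no arguments to check the default path, or pass another file to check a copy.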

Then restart the gateway:

openclaw gateway restart

Usage

Once configured, the plugin works automatically:

  • Send a voice message → OpenClaw transcribes it and can reply with voice (if auto: "inbound")
  • Send a text message → Normal text reply (unless auto: "always")

Use [[tts:text]]...[[/tts:text]] tags in your OpenClaw responses to force voice output for specific messages.

Limitations

  • Per-agent voice configuration: OpenClaw does not currently support agent-level TTS voice overrides. The voice is configured globally under messages.tts.providers.macvoice. To use a different voice for a single response, add a [[tts:voice=...]] directive to that response (e.g., [[tts:voice=Karen]][[tts:text]]Hello[[/tts:text]]).

Platform Support

| Platform | Status |
| --- | --- |
| macOS 13.0+ | ✅ Supported |
| Linux | ❌ Not supported |
| Windows | ❌ Not supported |

License

MIT