Gemini TTS

Pass

Audited by VirusTotal on May 11, 2026.

Overview

Type: OpenClaw Skill
Name: gemini-tts
Version: 1.0.0

The skill is a legitimate implementation of a Text-to-Speech (TTS) generator using the Google Gemini API. The script `generate_voice.py` correctly handles environment variables for authentication, makes standard API calls to the official Google endpoint, and processes the resulting audio data without any signs of malicious intent, data exfiltration, or obfuscation.

Findings (0)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Note

ASI03: Identity and Privilege Abuse

What this means

You must provide a Gemini API key, and the skill can spend or use quota on that Gemini account when you run it.

Why it was flagged

The script reads a Gemini API key from the environment to call the provider. This is expected for the stated Gemini TTS purpose, but it is not declared in the registry requirements.

Skill content

api_key = os.environ.get("GEMINI_API_KEY")

Recommendation

Use a dedicated, limited Gemini API key if possible and monitor provider usage; the skill metadata should declare GEMINI_API_KEY as a required credential.
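One way to make the credential requirement explicit at runtime is to fail fast when the key is absent. The following is a minimal sketch, not the skill's actual code: the helper name `load_gemini_api_key` and the exception class `MissingCredentialError` are hypothetical, and only the `GEMINI_API_KEY` variable name comes from the reviewed script.

```python
import os


class MissingCredentialError(RuntimeError):
    """Hypothetical error type: raised when the required key is not configured."""


def load_gemini_api_key(env=None):
    """Return the Gemini API key, or raise if it is missing or blank."""
    # `env` defaults to the real process environment; passing a plain dict
    # makes the function testable without touching os.environ.
    env = os.environ if env is None else env
    api_key = (env.get("GEMINI_API_KEY") or "").strip()
    if not api_key:
        raise MissingCredentialError(
            "GEMINI_API_KEY is not set; export a dedicated, limited key "
            "before running the skill."
        )
    return api_key
```

A check like this does not replace declaring the credential in the skill metadata, but it turns a missing key into a clear error instead of a failed provider call.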

Note

ASI07: Insecure Inter-Agent Communication

What this means

Any text you ask it to speak is sent to Google Gemini for processing.

Why it was flagged

The script sends the provided text to Google's Gemini API to generate audio. This external provider call is central to the skill, but it means the text leaves the local environment.

Skill content

url = f"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview-tts:generateContent?key={api_key}"

Recommendation

Do not use this skill with confidential text unless you are comfortable with Gemini API processing and applicable data-retention terms.
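A caller who wants to see where text would go before it leaves the machine could add a preview step and an explicit opt-in. This is a rough sketch under stated assumptions: `preview_outbound_request` and `require_opt_in` are hypothetical helpers that are not part of the skill, and they make no network calls.

```python
from urllib.parse import urlsplit


def preview_outbound_request(url, text):
    """Summarize the destination of a request without exposing the API key."""
    parts = urlsplit(url)
    # Only host and path are reported; the query string (which carries the
    # key in the reviewed script) is deliberately left out of the summary.
    return {
        "host": parts.hostname,
        "path": parts.path,
        "characters": len(text),
    }


def require_opt_in(allow_external):
    """Hypothetical gate: refuse to proceed unless the caller opted in."""
    if not allow_external:
        raise PermissionError(
            "Text would be sent to an external provider; "
            "pass allow_external=True to proceed."
        )
```

Calling `require_opt_in(False)` raises `PermissionError`, so confidential text is only transmitted after a deliberate choice by the user.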

Note

ASI09: Human-Agent Trust Exploitation

What this means

You may think you can select a custom persona voice, but the script appears to use the same hardcoded Gemini voice each time.

Why it was flagged

The CLI accepts a voice/persona argument, but the Gemini request hardcodes the prebuilt voice to `Puck`, so the advertised custom voice/persona behavior is not actually reflected in the implementation.

Skill content

parser.add_argument("--voice", default="little-claw-persona") ... "voice_name": "Puck"

Recommendation

Assume every request uses the `Puck` voice, whatever value is passed to `--voice`, until the code is updated to forward the requested voice to the Gemini request.
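The fix could look roughly like the sketch below, which forwards the parsed `--voice` value into the request body instead of a hardcoded name. Several things here are assumptions rather than the skill's code: the helper names `build_tts_payload` and `parse_args`, the `KNOWN_VOICES` tuple (an illustrative subset of prebuilt voice names, not an authoritative list), and the camelCase field spelling, which follows Google's REST documentation as understood here while the reviewed script spells the innermost key `voice_name`.

```python
import argparse

# Illustrative subset of prebuilt voice names (an assumption; consult the
# provider documentation for the current list).
KNOWN_VOICES = ("Puck", "Kore", "Charon", "Fenrir", "Aoede", "Zephyr")


def build_tts_payload(text, voice_name):
    """Build a generateContent request body that uses the requested voice."""
    if voice_name not in KNOWN_VOICES:
        raise ValueError(
            f"Unknown voice {voice_name!r}; expected one of {KNOWN_VOICES}"
        )
    return {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {
                    # The requested voice is passed through, not hardcoded.
                    "prebuiltVoiceConfig": {"voiceName": voice_name}
                }
            },
        },
    }


def parse_args(argv=None):
    """Parse CLI arguments; --voice is restricted to the known names."""
    parser = argparse.ArgumentParser(description="Gemini TTS sketch")
    parser.add_argument("text")
    parser.add_argument("--voice", default="Puck", choices=KNOWN_VOICES)
    return parser.parse_args(argv)
```

Restricting `--voice` with `choices` means an unsupported persona name is rejected at the command line rather than silently replaced by a default voice.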