Gemini TTS

Pass

Audited by ClawScan on May 1, 2026.

Overview

This appears to be a straightforward Gemini text-to-speech skill, but users should know it sends their text to Google Gemini and requires a Gemini API key.

This skill looks benign and purpose-aligned. Before installing, make sure you are comfortable providing a Gemini API key and sending the text you want spoken to Google Gemini. Also note that the current code appears to ignore the requested voice/persona and uses a hardcoded `Puck` voice.

Findings (3)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Low

#ASI03: Identity and Privilege Abuse

What this means

You must provide a Gemini API key, and each run consumes quota (and may incur charges) on the account that owns the key.

Why it was flagged

The script reads a Gemini API key from the environment to call the provider. This is expected for the stated Gemini TTS purpose, but it is not declared in the registry requirements.

Skill content

api_key = os.environ.get("GEMINI_API_KEY")

Recommendation

Use a dedicated, limited Gemini API key if possible and monitor provider usage; the skill metadata should declare GEMINI_API_KEY as a required credential.
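A minimal sketch of how the script could fail fast when the key is missing, assuming the environment variable name shown in the finding. The helper name `load_gemini_api_key` and its `env` parameter are hypothetical and do not come from the skill.

```python
import os


def load_gemini_api_key(env=None):
    """Return the Gemini API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    api_key = (env.get("GEMINI_API_KEY") or "").strip()
    if not api_key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export a dedicated, limited key first."
        )
    return api_key
```

Raising early keeps a missing credential from surfacing later as an opaque HTTP error. It does not replace declaring the credential in the skill metadata.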

Low

#ASI07: Insecure Inter-Agent Communication

What this means

Any text you ask it to speak is sent to Google Gemini for processing.

Why it was flagged

The script sends the provided text to Google's Gemini API to generate audio. This external provider call is central to the skill, but it means the text leaves the local environment.

Skill content

url = f"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview-tts:generateContent?key={api_key}"

Recommendation

Do not use this skill with confidential text unless you are comfortable with Gemini API processing and applicable data-retention terms.

Info

#ASI09: Human-Agent Trust Exploitation

What this means

You may think you can select a custom persona voice, but the script appears to use the same hardcoded Gemini voice each time.

Why it was flagged

The CLI accepts a voice/persona argument, but the Gemini request hardcodes the prebuilt voice to `Puck`, so the advertised custom voice/persona behavior is not actually reflected in the implementation.

Skill content

parser.add_argument("--voice", default="little-claw-persona") ... "voice_name": "Puck"

Recommendation

Assume the `--voice` argument has no effect on the output until the code is updated to pass the requested voice to the Gemini request.
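One way the fix could look, as a sketch only: the requested voice is validated against an allow-list and then written into the request body instead of a constant. The helper names (`resolve_voice`, `build_payload`, `parse_args`), the allow-list contents, and the nesting of the payload around `voice_name` are assumptions. Only the `--voice` argument, its default, and the `"voice_name": "Puck"` field come from the reviewed script.

```python
import argparse

# Assumption: a small allow-list of prebuilt voice names. Check the provider's
# current documentation for the full set before relying on it.
KNOWN_VOICES = {"Puck", "Kore", "Charon", "Fenrir", "Aoede"}
DEFAULT_VOICE = "Puck"


def resolve_voice(requested):
    """Return the requested voice if it is known, else fall back to the default."""
    return requested if requested in KNOWN_VOICES else DEFAULT_VOICE


def build_payload(text, requested_voice):
    """Build a request body whose voice_name follows the CLI argument.

    The structure around voice_name is an assumed shape, not the skill's code.
    """
    return {
        "contents": [{"parts": [{"text": text}]}],
        "generation_config": {
            "response_modalities": ["AUDIO"],
            "speech_config": {
                "voice_config": {
                    "prebuilt_voice_config": {
                        "voice_name": resolve_voice(requested_voice)
                    }
                }
            },
        },
    }


def parse_args(argv):
    """Parse CLI arguments; the --voice default mirrors the reviewed script."""
    parser = argparse.ArgumentParser()
    parser.add_argument("text")
    parser.add_argument("--voice", default="little-claw-persona")
    return parser.parse_args(argv)
```

With this shape, `parse_args(["hello", "--voice", "Kore"])` followed by `build_payload(args.text, args.voice)` would carry `Kore` into the request, while an unrecognized persona name such as the default falls back to `Puck`. Whether to fall back silently or reject unknown voices is a design choice left to the skill author.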