Gemini STT
Advisory
Audited by Static analysis on Apr 30, 2026.
Overview
No suspicious patterns detected.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If the agent is tricked into using an unsafe Vertex region value, a Google Cloud access token and the audio being transcribed could be sent to a non-Google host.
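The user-controlled region value is inserted into the URL host while a Google bearer token is attached. A malicious or manipulated region value containing a slash could redirect the request to an attacker-chosen host, which would then receive the token and audio payload. For example, a region of "evil.example/" produces a URL whose host is evil.example, with the rest of the string treated as the path.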
url = f"https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/publishers/google/models/{model}:generateContent"
...
"Authorization": f"Bearer {access_token}",
Validate region against an allowlist or strict region-name regex before constructing the URL, and refuse values containing URL metacharacters such as '/', ':', '@', '?', or '#'. Require confirmation for non-default endpoints.
The skill may use your Google API key or active Google Cloud account/project, which can affect billing and access permissions.
The skill uses either a Gemini API key from the environment or a local gcloud access token. This is expected for the stated Google transcription integration, but it is sensitive credential use.
api_key = os.environ.get("GEMINI_API_KEY")
...
["gcloud", "auth", "print-access-token"]
Use a least-privileged API key or Google Cloud project, verify the active gcloud account and project before running, and declare these credential requirements in the skill metadata.
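One way to verify the active gcloud account and project before a token is requested is sketched below. It relies on the gcloud config get-value command. The helper names (active_gcloud_identity, matches_expected) and the injectable run parameter are assumptions made for illustration and testability; they are not taken from the skill.

```python
import subprocess
from typing import Callable, Dict, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its trimmed stdout (raises on failure)."""
    result = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    return result.stdout.strip()


def active_gcloud_identity(run: Callable[[Sequence[str]], str] = _run) -> Dict[str, str]:
    """Return the account and project that gcloud is currently configured to use.

    Hypothetical helper; `run` can be swapped out so the logic is testable
    without gcloud installed.
    """
    return {
        "account": run(["gcloud", "config", "get-value", "account"]),
        "project": run(["gcloud", "config", "get-value", "project"]),
    }


def matches_expected(identity: Dict[str, str], account: str, project: str) -> bool:
    """True only if both the active account and project are the expected ones."""
    return identity["account"] == account and identity["project"] == project
```

A caller could refuse to run gcloud auth print-access-token unless matches_expected returns True for the account and project the user intended to bill.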
Audio contents you transcribe are sent to Google for processing.
The selected local audio file is read, base64-encoded, and included in a request to Google's Gemini or Vertex AI API. This is purpose-aligned but crosses a privacy boundary.
with open(file_path, "rb") as f:
    audio_data = f.read()
...
{"inline_data": {"mime_type": mime_type, "data": b64_data}}
Only use this skill for audio you are comfortable sending to Google, and review Google's Gemini or Vertex AI data handling terms for your account type.
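To make concrete what leaves the machine, here is a minimal sketch of how audio bytes become the inline_data part shown in the excerpt. The function name build_inline_audio_part is hypothetical, and the elided steps in the skill may differ. The point is that base64 is an encoding, not encryption: the full audio is recoverable from the request body.

```python
import base64


def build_inline_audio_part(audio_data: bytes, mime_type: str) -> dict:
    """Wrap raw audio bytes in an inline_data part (hypothetical helper)."""
    # base64 makes the bytes JSON-safe; it does not hide or protect them.
    b64_data = base64.b64encode(audio_data).decode("ascii")
    return {"inline_data": {"mime_type": mime_type, "data": b64_data}}


# Anyone who receives the payload can decode it back to the original audio.
part = build_inline_audio_part(b"abc", "audio/wav")
recovered = base64.b64decode(part["inline_data"]["data"])
```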
