gemini-image-generation

v1.0.10

Generate or edit images with Gemini using the Google GenAI SDK. Use when the user asks to create, transform, render, or save one or more images in an OpenCla...

by Joe @ztj7728

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for ztj7728/gemini-image-generation.

Install the skill "gemini-image-generation" (ztj7728/gemini-image-generation) from ClawHub.
Skill page: https://clawhub.ai/ztj7728/gemini-image-generation
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: GEMINI_API_KEY, GEMINI_MODEL_ID
Required binaries: node, npm
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install gemini-image-generation

ClawHub CLI

npx clawhub@latest install gemini-image-generation

Security Scan

VirusTotal

Suspicious

OpenClaw

Benign

high confidence

✓

Purpose & Capability

Name/description, required binaries (node, npm), and required env vars (GEMINI_API_KEY, GEMINI_MODEL_ID) align with the declared purpose of calling Google GenAI (Gemini) to generate/edit images. The package.json depends on @google/genai which is appropriate for this functionality.

✓

Instruction Scope

SKILL.md and the scripts only instruct reading workspace image files, reading GEMINI_* environment variables, invoking the GoogleGenAI client, and saving returned images to workspace. There are no instructions to read unrelated system files, other credentials, or to send data to unexpected endpoints. The skill will of course transmit prompts and any provided source images to the Gemini API (expected for image editing).

ℹ

Install Mechanism

No formal install spec is included (instruction-only install), but package.json and SKILL.md instruct the user to run 'npm install' in the skill root. This is expected for a Node-based skill; there is no third-party binary download or untrusted URL referenced.

✓

Credentials

Requested env vars are limited and appropriate: GEMINI_API_KEY (primary) and GEMINI_MODEL_ID are required; GEMINI_BASE_URL is optional for custom endpoints. No unrelated credentials or broad system config paths are requested.

✓

Persistence & Privilege

The skill does not request always:true, does not modify other skills, and requires explicit enabling in ~/.openclaw/openclaw.json. Autonomous invocation is allowed (platform default) but not combined with elevated persistence or unrelated credential access.

Assessment

This skill appears coherent and implements image generation/editing via Google GenAI. Before installing:
1) Only enable it if you trust the skill source and are comfortable sending prompts and any source images to Gemini (the skill base64-encodes and uploads input images to the API).
2) Keep GEMINI_API_KEY secret (store it in your OpenClaw skill config as instructed).
3) If you use GEMINI_BASE_URL, ensure it points to a trusted endpoint (a custom base URL could redirect requests to a non-Google host).
4) Run 'npm install' in the skill directory to install @google/genai, and review that dependency if you have concerns.
5) Be mindful of privacy: do not send PII or sensitive images unless you accept they will be processed by the configured GenAI endpoint.
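The base-URL caution above can be turned into a quick pre-flight check. This is a minimal sketch, not part of the skill: the helper name isGoogleApiHost is hypothetical, and it assumes that "trusted" means an HTTPS host under googleapis.com (the default Gemini API host is generativelanguage.googleapis.com). A deliberately chosen proxy would fail this check and would need its own allow-list entry.

```javascript
// Hypothetical helper (not part of the skill): returns true only when the
// base URL is HTTPS and its host is googleapis.com or a subdomain of it.
function isGoogleApiHost(baseUrl) {
  let url;
  try {
    url = new URL(baseUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  const host = url.hostname;
  return host === "googleapis.com" || host.endsWith(".googleapis.com");
}

console.log(isGoogleApiHost("https://generativelanguage.googleapis.com")); // prints true
console.log(isGoogleApiHost("https://custom-endpoint.com")); // prints false
```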

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎨 Clawdis

Bins: node, npm

Env: GEMINI_API_KEY, GEMINI_MODEL_ID

Primary env: GEMINI_API_KEY

latest: vk970j3bzzpqzk450t7ehk1hcyh836gkd

611 downloads

1 star

11 versions

Updated 14h ago

MIT-0

Image Generation

Use this skill when you need to create one or more image files from a text prompt, or edit one or more existing images with Gemini.

Requirements

~/.openclaw/openclaw.json must include $.skills.entries["gemini-image-generation"].enabled set to true.
~/.openclaw/openclaw.json must include $.skills.entries["gemini-image-generation"].env with the following keys and values:
GEMINI_API_KEY (required)
GEMINI_MODEL_ID (required)
GEMINI_BASE_URL (optional, for custom endpoints)
Example ~/.openclaw/openclaw.json:

{
  ......,
  "skills": {
    "entries": {
      "gemini-image-generation": {
        "enabled": true,
        "env": {
          "GEMINI_API_KEY": "sk-xxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
          "GEMINI_MODEL_ID": "gemini-3.1-flash-image-preview",
          "GEMINI_BASE_URL": "https://custom-endpoint.com"
        }
      }
    }
  },
  ......
}

Node.js must be installed in the workspace environment.
Install dependencies once with npm install from the skill root.
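
The configuration requirements above can be checked mechanically before the first run. The following is a minimal sketch under stated assumptions: checkSkillConfig is a hypothetical helper, not something the skill ships, and it expects the already-parsed contents of ~/.openclaw/openclaw.json as a plain object.

```javascript
// Hypothetical helper (not part of the skill): lists what is missing from a
// parsed openclaw.json object for this skill. An empty array means the entry
// satisfies the requirements listed above.
function checkSkillConfig(config) {
  const entry = config?.skills?.entries?.["gemini-image-generation"];
  if (!entry) {
    return ['skills.entries["gemini-image-generation"] is missing'];
  }
  const problems = [];
  if (entry.enabled !== true) {
    problems.push("enabled must be true");
  }
  const env = entry.env ?? {};
  // GEMINI_BASE_URL is optional, so only the two required keys are checked.
  for (const key of ["GEMINI_API_KEY", "GEMINI_MODEL_ID"]) {
    if (typeof env[key] !== "string" || env[key].trim() === "") {
      problems.push(`env.${key} is required`);
    }
  }
  return problems;
}

const example = {
  skills: {
    entries: {
      "gemini-image-generation": {
        enabled: true,
        env: {
          GEMINI_API_KEY: "sk-xxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
          GEMINI_MODEL_ID: "gemini-3.1-flash-image-preview",
        },
      },
    },
  },
};
console.log(checkSkillConfig(example)); // prints []
```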

When To Use

The user asks to generate a new image from a text prompt.
The user asks to modify, restyle, extend, or otherwise edit one or more existing images.
The user wants the generated image saved to a workspace file.
The task should be handled through a reusable OpenClaw skill instead of ad hoc SDK code.

Procedure

Convert the user request into a single clear image prompt.
If the user supplied source images, choose or confirm the input file path or paths inside the workspace.
If the user specified a target aspect ratio or size, pass them through as --aspectRatio and --imageSize.
Choose an output path inside the workspace unless the user already provided one.
For text-to-image, run generate-image.mjs with --prompt, --output, and optional image config arguments.
For image editing, run edit-image.mjs with --prompt, one or more --input values, --output, and optional image config arguments.
Read the API key from GEMINI_API_KEY and the model ID from GEMINI_MODEL_ID in the environment.
Optionally, read the base URL from GEMINI_BASE_URL in the environment for custom endpoints.
Return the saved image path or paths to the user.
After returning each image path, also output MEDIA:<image_path> (e.g. MEDIA:outputs/gemini-native-image.png) so the image is displayed inline in the conversation.
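
The procedure above can be sketched as a small argument builder. This is an illustration only: buildCommand and mediaLine are hypothetical names, and the sketch assumes the script location and flags shown in this document (--prompt, --input, --output, --aspectRatio, --imageSize).

```javascript
// Assumed script location, taken from the example commands in this document.
const SCRIPT_DIR = "./skills/gemini-image-generation/scripts";

// Hypothetical helper: turns a request into an argv array for `node`.
function buildCommand({ prompt, output, inputs = [], aspectRatio, imageSize }) {
  // Source images present means editing; otherwise text-to-image.
  const script = inputs.length > 0 ? "edit-image.mjs" : "generate-image.mjs";
  const args = [`${SCRIPT_DIR}/${script}`, "--prompt", prompt];
  for (const input of inputs) {
    args.push("--input", input); // one --input flag per source image
  }
  args.push("--output", output);
  // Image config flags are passed through only when the user asked for them.
  if (aspectRatio) args.push("--aspectRatio", aspectRatio);
  if (imageSize) args.push("--imageSize", imageSize);
  return args;
}

// Formats the line that makes a saved image render inline in the conversation.
function mediaLine(imagePath) {
  return `MEDIA:${imagePath}`;
}

const args = buildCommand({
  prompt: "Create a wide cinematic food photo",
  output: "outputs/gemini-wide.png",
  aspectRatio: "16:9",
  imageSize: "2K",
});
console.log(args); // argv array, safe to pass to a process spawner as-is
console.log(mediaLine("outputs/gemini-wide.png")); // prints MEDIA:outputs/gemini-wide.png
```

Keeping the arguments as an array (for example, for execFile from node:child_process) avoids shell quoting problems with prompts that contain spaces or quotes.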

Commands

node ./skills/gemini-image-generation/scripts/generate-image.mjs --prompt "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme" --output "outputs/gemini-native-image.png"

node ./skills/gemini-image-generation/scripts/generate-image.mjs --prompt "Create a wide cinematic food photo of a nano banana dish in a fancy restaurant with a Gemini theme" --output "outputs/gemini-wide.png" --aspectRatio "16:9" --imageSize "2K"

node ./skills/gemini-image-generation/scripts/edit-image.mjs --prompt "Turn this cat into a watercolor illustration eating a nano-banana in a fancy restaurant under the Gemini constellation" --input "inputs/cat.png" --output "outputs/cat-watercolor.png" --aspectRatio "5:4" --imageSize "2K"

node ./skills/gemini-image-generation/scripts/edit-image.mjs --prompt "Create an office group photo of these people making funny faces" --input "inputs/person-1.jpg" --input "inputs/person-2.jpg" --input "inputs/person-3.jpg" --output "outputs/group-photo.png"

Notes

Each script prints TEXT: lines for model text and IMAGE: lines for each saved file.
After the skill finishes, always present every generated image to the user by outputting MEDIA:<path> for each saved image path. This ensures the image is rendered inline in the conversation alongside the file path.
The final JSON summary only includes generated image paths and optional image config so prompts, model IDs, and source image paths are not echoed back into logs.
Saved file extensions follow the returned image mime type. If the requested output path uses a different suffix, the scripts keep the base name and write the file with the returned type instead.
If the model returns multiple images, the scripts save them as name-1.png, name-2.png, and so on.
edit-image.mjs supports repeated --input flags. You can also pass a comma-separated list to a single --input value.
edit-image.mjs infers the source mime type from .png, .jpg, .jpeg, or .webp. Use one --mime-type for all inputs, or repeat --mime-type so it lines up with each --input.
Both scripts accept --aspectRatio and --imageSize. They also accept the kebab-case forms --aspect-ratio and --image-size.
The scripts only send config.imageConfig when at least one of those parameters is provided.
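
The input and output rules in these notes can be illustrated with a few small functions. This is a sketch of the described behavior, not the scripts' actual code: every function name here is hypothetical, and the mapping of image/jpeg to .jpg and the .png fallback for unknown types are assumptions.

```javascript
// Suffix-to-mime table for the four suffixes the notes mention.
const MIME_BY_EXT = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
};
// Assumption: which suffix is written for each returned mime type.
const EXT_BY_MIME = { "image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp" };

// Accepts repeated --input values and comma-separated lists alike.
function splitInputs(values) {
  return values.flatMap((v) => v.split(",")).map((s) => s.trim()).filter(Boolean);
}

// Infers the source mime type from the file suffix; null when unknown.
function inferMimeType(filePath) {
  const dot = filePath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filePath.slice(dot).toLowerCase();
  return MIME_BY_EXT[ext] ?? null;
}

// Keeps the requested base name, swaps in the suffix of the returned type,
// and numbers the files when more than one image comes back.
function outputPaths(requested, mimeTypes) {
  const dot = requested.lastIndexOf(".");
  const slash = requested.lastIndexOf("/");
  const base = dot > slash + 1 ? requested.slice(0, dot) : requested;
  return mimeTypes.map((mime, i) => {
    const ext = EXT_BY_MIME[mime] ?? ".png"; // assumed fallback
    const suffix = mimeTypes.length > 1 ? `-${i + 1}` : "";
    return `${base}${suffix}${ext}`;
  });
}

// Builds the config object; imageConfig appears only when a value was given.
function imageConfigFor({ aspectRatio, imageSize } = {}) {
  const imageConfig = {};
  if (aspectRatio) imageConfig.aspectRatio = aspectRatio;
  if (imageSize) imageConfig.imageSize = imageSize;
  return Object.keys(imageConfig).length > 0 ? { imageConfig } : {};
}

console.log(splitInputs(["inputs/person-1.jpg,inputs/person-2.jpg", "inputs/person-3.jpg"]).length); // prints 3
console.log(inferMimeType("inputs/cat.png")); // prints image/png
console.log(outputPaths("outputs/cat-watercolor.png", ["image/jpeg"])[0]); // prints outputs/cat-watercolor.jpg
console.log(outputPaths("outputs/name.png", ["image/png", "image/png"]).join(" ")); // prints outputs/name-1.png outputs/name-2.png
console.log(JSON.stringify(imageConfigFor({}))); // prints {}
```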
