Gemini Image CLI
v0.1.0
Generate and edit images with a bundled Gemini native image-generation CLI. Use when the user asks Codex to create images with Gemini, use Gemini image gener...
Use ./scripts/gemini-image.sh for Gemini native image generation. Prefer this bundled script over writing one-off curl commands.
Workflow
- Run ./scripts/gemini-image.sh with the user's prompt and any requested options.
- Do not ask which endpoint to use for ordinary requests. The script auto-selects the provider: local Gemini-compatible proxy first, then Google fallback.
- Keep default settings for ordinary single-image generation: gemini-3.1-flash-image-preview, size 512, aspect 16:9.
- Use gemini-2.5-flash-image when latency matters more than latest image quality.
- Use gemini-3-pro-image-preview when the user needs stronger instruction following, text rendering, or professional-quality output.
- Confirm before multi-model batches, many retries, or other repeated calls that may consume extra quota.
- Read references/behavior.md only when explaining provider/security tradeoffs, choosing non-default models, configuring a local Gemini-compatible proxy, troubleshooting slow or failed requests, or modifying the CLI.
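The model guidance above can be captured in a small wrapper. This is a minimal sketch and not part of the bundled script: the function name pick_model and the priority labels fast, quality, and default are hypothetical. Only the model names and the --model flag come from this document.

```shell
# Hypothetical helper: map a priority label to one of the models named above.
pick_model() {
  case "${1:-default}" in
    fast)    echo "gemini-2.5-flash-image" ;;          # latency matters most
    quality) echo "gemini-3-pro-image-preview" ;;      # instruction following, text rendering
    default) echo "gemini-3.1-flash-image-preview" ;;  # ordinary single-image requests
    *)       echo "unknown priority: $1" >&2; return 1 ;;
  esac
}

# Example call (not executed here):
#   ./scripts/gemini-image.sh "A cute orange kitten" --model "$(pick_model fast)"
```

Keeping the mapping in one place means the default stays untouched for ordinary requests and a non-default model is chosen only when a priority is stated explicitly.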
Common Commands
Generate one image:
./scripts/gemini-image.sh "A cute orange kitten sitting on a soft blanket"
Generate with an explicit output path or prefix. The script chooses the final extension from the returned image MIME type:
./scripts/gemini-image.sh "Draw two kittens play-fighting" --output ./out/kittens.png
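How the script maps MIME types to extensions is not documented here. As an illustration only, a mapping of this kind might look like the sketch below. The function names ext_for_mime and with_ext, the list of MIME types, and the bin fallback are all assumptions, not the script's actual behavior.

```shell
# Hypothetical: pick a file extension from an image MIME type.
ext_for_mime() {
  case "$1" in
    image/png)  echo "png" ;;
    image/jpeg) echo "jpg" ;;
    image/webp) echo "webp" ;;
    *)          echo "bin" ;;  # assumed fallback for unrecognized types
  esac
}

# Hypothetical: swap the extension of the requested path for the detected one.
# The extension is stripped from the file name only, so a leading "./" is safe.
with_ext() {
  local dir base
  dir="$(dirname "$1")"
  base="$(basename "$1")"
  echo "$dir/${base%.*}.$(ext_for_mime "$2")"
}
```

Under these assumptions, a request for ./out/kittens.png that returns a JPEG would end up at ./out/kittens.jpg, which is why the final path should be read from the script's output rather than assumed.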
Use a faster model:
./scripts/gemini-image.sh "Draw two kittens play-fighting" --model gemini-2.5-flash-image
Force the official Google endpoint:
./scripts/gemini-image.sh "Draw two kittens play-fighting" --provider google
Force the local proxy endpoint:
./scripts/gemini-image.sh "Draw two kittens play-fighting" --provider local
Use a larger output size:
./scripts/gemini-image.sh "A cinematic poster of two kittens" --size 1K --aspect 16:9
Use an input image for image-guided generation or editing:
./scripts/gemini-image.sh "Turn this cat photo into a watercolor illustration" --image cat.jpg
Output Contract
The script prints human-readable logs to stderr and machine-readable results to stdout.
Successful stdout lines:
image=<path>
raw_json=<path>
text=<path>
duration_seconds=<seconds>
text= appears only when --with-text is enabled.
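Because results arrive as key=value lines on stdout, a caller can pick out individual fields without touching the logs on stderr. The following is a minimal sketch: get_field is a hypothetical helper, and the sample paths and duration are made up to match the documented shape.

```shell
# Hypothetical helper: read key=value lines on stdin and print the value
# of the first line whose key matches. Returns 1 if the key is absent.
get_field() {
  local key line
  key="$1"
  while IFS= read -r line; do
    case "$line" in
      "$key="*)
        printf '%s\n' "${line#*=}"  # keep everything after the first "="
        return 0
        ;;
    esac
  done
  return 1
}

# Sample stdout in the documented shape; values are invented for illustration.
sample_output='image=./out/kittens.png
raw_json=./out/kittens.json
duration_seconds=4.2'

image_path="$(printf '%s\n' "$sample_output" | get_field image)"
echo "image saved at: $image_path"
```

In real use, the sample would be replaced by captured output, for example out="$(./scripts/gemini-image.sh "prompt")". A missing text= line is expected unless --with-text is enabled, so a caller should treat a non-zero return for that key as normal.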
Safety
Do not expose full Google Gemini API keys in conversation or source files. Prefer the local proxy mode when the runtime should not have access to the real Google key.
The script masks keys in curl logs and redacts input-image base64 from printed request bodies.
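The exact masking format the script uses is not described here. As a rough illustration of the idea, a helper that hides all but the last four characters of a secret could look like this. The name mask_key and the four-character tail are assumptions.

```shell
# Hypothetical: print a secret with everything but the last four characters hidden.
mask_key() {
  local key len
  key="$1"
  len=${#key}
  if [ "$len" -le 4 ]; then
    # Too short to show a tail without revealing the whole value.
    printf '****\n'
  else
    # cut -c N- keeps characters from position N to the end.
    printf '****%s\n' "$(printf '%s' "$key" | cut -c "$((len - 3))-")"
  fi
}
```

A short visible tail is usually enough to tell two keys apart in logs while keeping the value unusable if the log is shared.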
Do not enable retries automatically for ambiguous multi-request tasks. Retries can submit additional generation requests and may incur additional cost.