GPT Image 2

v0.3.1

Generate high-quality images with GPT Image 2 (OpenAI gpt-image-2) via the ClawdChat tool gateway. Use when the user asks to create / generate / draw / paint...

MIT-0 · by Agentrix@lxyd-ai

Install

openclaw skills install gpt-image2

GPT Image 2 — High-quality AI image generation

Powered by ClawdChat — calls OpenAI gpt-image-2 through the Uno tool gateway.

What this skill does

Two thin command-line invocations against the public ClawdChat tool gateway:

| Tool slug | Purpose | Cost |
|---|---|---|
| gpt-image-2.gpt_image2_submit | Submit a generation job; returns job_id immediately (async) | 300 credits / call |
| gpt-image-2.gpt_image2_result | Poll job status / fetch image URL when ready | 0 credits |

This skill ships no local Python code. It defers all credential, transport and rate-limit handling to the uno-cli companion skill.

Credentials & permissions (read before first use)

  • Credential type: ClawdChat API key (Bearer token).
  • Where it lives: ~/.uno/credentials.json. The file is created and owned by the uno-cli skill; this skill never opens, prints or copies it.
  • How it was obtained: the user runs python ../uno-cli/bin/uno.py login, which issues a device code. The user must first log in at https://clawdtools.uno (top-right "Login"), then open the authorisation URL shown in the terminal. The resulting API key is stored by uno-cli.
  • What it authorises: calling the ClawdChat tool gateway as the logged-in user. Each gpt_image2_submit deducts 300 credits from that account.
  • Network egress: the user's prompt text and any reference_image_urls are sent to the ClawdChat gateway over HTTPS. Do not paste private, confidential, or personally-identifying content into the prompt unless you are comfortable with the gateway's data handling — see https://clawdchat.cn for the data policy.
  • Logging out / revoking: run python ../uno-cli/bin/uno.py logout (credential file managed entirely by uno-cli).

Cost transparency & confirmation rule

Every gpt_image2_submit call costs the logged-in account real credits. The agent must:

  1. Show the user the planned prompt, size, style, and number of images before the first call.
  2. Ask for explicit confirmation when the user has not already approved a generation in the current turn.
  3. For multi-image batches (n > 1) or retries, treat each submission as a separate spending event and confirm again unless the user has pre-authorised the batch.
  4. On error responses, surface the error to the user instead of silently retrying.

Polling via gpt_image2_result is free; only submit spends credits.
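As a sketch, the spending rules above reduce to a tiny pre-flight check. All names here are hypothetical helpers; the only documented fact used is the 300-credit cost of one submit call.

```python
# Hypothetical pre-flight gate for the confirmation rule above.
# The 300-credit price per submit call comes from the cost table.

CREDITS_PER_SUBMIT = 300

def planned_spend(submissions: int) -> int:
    """Each submission, including every retry, is a separate spending event."""
    return CREDITS_PER_SUBMIT * submissions

def needs_confirmation(approved_this_turn: bool,
                       batch_preauthorised: bool,
                       n_images: int) -> bool:
    """Confirm unless the user already approved a generation this turn;
    multi-image batches still need confirmation unless pre-authorised."""
    if not approved_this_turn:
        return True
    if n_images > 1 and not batch_preauthorised:
        return True
    return False
```

Show the user `planned_spend(...)` alongside the prompt, size, and style before the first call.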

Setup

This skill is a thin wrapper around the uno-cli companion skill, which ships the actual Python CLI (bin/uno.py).

1. Install uno-cli explicitly

The current clawhub install does not auto-cascade metadata.openclaw.skills dependencies, so you must install it yourself:

clawhub install uno-cli

After both skills are installed, the layout is:

<skills-root>/
├── gpt-image2/
│   └── SKILL.md
└── uno-cli/
    ├── SKILL.md
    └── bin/uno.py        # the actual CLI

From this skill's folder, the CLI is therefore reachable at the relative path ../uno-cli/bin/uno.py. All examples below use that path verbatim.

2. Log in

python ../uno-cli/bin/uno.py login --start

This prints a device code and a verification_uri_complete URL like https://clawdtools.uno/device?code=XXXX.

Open that URL in a browser — if you are not yet signed in to clawdtools.uno, the page will redirect you to the ClawdChat SSO login automatically and return to the authorisation screen afterwards. Click "Authorise" to complete the flow.

Then poll for completion:

python ../uno-cli/bin/uno.py login --poll <device_code>

Or run python ../uno-cli/bin/uno.py login (blocking, identical flow but polls automatically).

Credential storage (~/.uno/credentials.json) and refresh are handled entirely by uno-cli.

Why not call a global uno command?

Don't rely on a uno binary in PATH. On many systems (notably macOS with LibreOffice installed) /opt/homebrew/bin/uno is the LibreOffice UNO bridge, an unrelated Mach-O binary — invoking it will produce confusing C++ errors. Always invoke python ../uno-cli/bin/uno.py … (or an explicit absolute path), or set up your own alias / symlink that you control.

If python resolves to Python 2 on the host, use python3 instead.

Generating an image — full async flow

A single 1024×1024 image typically takes ~150 s, longer than the default MCP 60 s timeout. Always use the submit → poll-result pattern.

Step 1 — submit

python ../uno-cli/bin/uno.py call gpt-image-2.gpt_image2_submit --compact \
  --args '{"prompt":"A shiba inu under cherry blossoms, sunny afternoon","size":"1024x1024","style":"ghibli_anime"}'

Response (already flattened by uno-cli — no need to unwrap content[0].text):

{"success": true, "data": {"status": "pending", "job_id": "0b84b8f0f0c8", "estimated_seconds": 150}, "meta": {"latency_ms": 120, "credits_used": 300}}

Record data.job_id.
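Extracting job_id is plain JSON handling; a minimal sketch against the sample response above:

```python
import json

# Sample flattened response from gpt_image2_submit (copied from above).
resp = ('{"success": true, "data": {"status": "pending", "job_id": "0b84b8f0f0c8", '
        '"estimated_seconds": 150}, "meta": {"latency_ms": 120, "credits_used": 300}}')

payload = json.loads(resp)
if not payload["success"]:
    # Surface error/hint to the user instead of retrying silently.
    raise RuntimeError(payload.get("error", "unknown error"))

job_id = payload["data"]["job_id"]   # keep this for gpt_image2_result polling
print(job_id)                        # → 0b84b8f0f0c8
```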

Step 2 — poll for result

python ../uno-cli/bin/uno.py call gpt-image-2.gpt_image2_result --compact --timeout 70 \
  --args '{"job_id":"0b84b8f0f0c8","wait_seconds":50}'

wait_seconds=50 makes the server-side wait 50 s (within the 60 s MCP envelope); --timeout 70 adds a small client buffer.

Repeat the call until data.status is one of:

  • done — image ready, URLs in data.items[].url.
  • error — generation failed, message in data.error.
  • pending / running — call again immediately. Do not add a client-side sleep; the server already waited 50 s on your behalf.

Three to five iterations (~150–250 s total) are normal.

Reference shell loop

UNO=../uno-cli/bin/uno.py

RESP=$(python "$UNO" call gpt-image-2.gpt_image2_submit --compact \
  --args '{"prompt":"Van Gogh starry night","style":"oil_painting_vangogh"}')
JOB_ID=$(echo "$RESP" | python3 -c "import json,sys; print(json.load(sys.stdin)['data']['job_id'])")

for i in 1 2 3 4 5 6; do
  R=$(python "$UNO" call gpt-image-2.gpt_image2_result --compact --timeout 70 \
    --args "{\"job_id\":\"$JOB_ID\",\"wait_seconds\":50}")
  STATUS=$(echo "$R" | python3 -c "import json,sys; print(json.load(sys.stdin)['data']['status'])")
  [ "$STATUS" = "done" ]  && echo "$R" && break
  [ "$STATUS" = "error" ] && echo "$R" && exit 1
done

Parameters

| Field | Meaning | Values |
|---|---|---|
| prompt | Image description (required, any language) | free text |
| size | Image dimensions | 1024x1024 (default), 1024x1536 (portrait), 1536x1024 (landscape), auto |
| n | Number of images to generate | 1–4 (default 1) |
| style | Built-in style preset | one of the 20 keys below |
| reference_image_urls | Reference images (image-to-image) | URL string, comma-separated for multiple |
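A sketch of assembling a full --args payload from these fields; the values are illustrative, and only prompt is required:

```python
import json

# Illustrative payload covering every documented field.
args = {
    "prompt": "Product shot of a ceramic mug on a wooden table",
    "size": "1536x1024",             # landscape
    "n": 2,                          # 1-4, default 1
    "style": "cinematic_photo",      # one of the 20 preset keys below
    # comma-separated string for multiple reference images:
    "reference_image_urls": "https://example.com/a.jpg,https://example.com/b.jpg",
}

# Pass this string as the --args value of gpt_image2_submit.
print(json.dumps(args))
```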

20 built-in style presets

| key | description |
|---|---|
| ghibli_anime | Studio Ghibli / hand-drawn anime |
| pixar_3d | Pixar / Disney 3D animation |
| claymation | Stop-motion claymation (Laika / Aardman) |
| lego_brick | LEGO bricks |
| popmart_figurine | Blind-box / Pop Mart figurine |
| isometric_game | Isometric 2.5D game scene |
| cinematic_photo | Cinematic photorealism (35mm) |
| polaroid_film | Polaroid film snapshot |
| watercolor_ink | Watercolour / East-Asian ink wash |
| oil_painting_vangogh | Van Gogh impasto oil painting |
| cyberpunk_neon | Cyberpunk neon nightscape |
| vintage_infographic | Retro infographic / data poster |
| movie_poster | Movie poster (large title + still) |
| flat_vector | Flat-vector illustration / banner |
| pixel_8bit | Pixel art (8/16-bit) |
| papercraft_layered | Layered papercraft |
| exploded_diagram | Exploded technical diagram |
| dreamcore_liminal | Dreamcore / liminal space |
| knolling_flatlay | Top-down knolling / flat-lay |
| botanical_engraving | Botanical engraving / antique illustration |

Where this model shines (vs Midjourney / Flux / SD)

  • Accurate text rendering — poster headlines, infographics, menu typography, meme captions: written into the image as specified.
  • Strong prompt following — multi-element scenes, ordering and spatial relationships obeyed.
  • Subject preservation in image-to-image — faces, brands, and characters stay consistent across reference images.
  • Wide style coverage — Ghibli, Pixar, claymation, LEGO, Pop Mart, botanical engraving etc. all handled.

Agent guidance

  • Tell the user up-front that one image takes ~150 s.
  • The gpt_image2_result tool already sleeps 50 s server-side — never add an extra client-side sleep between polls.
  • Use --timeout 70 for result calls (50 s server wait + buffer).
  • Pass the user's prompt verbatim, including non-English text.
  • Reference images: combine reference_image_urls with a style preset for "restyle while keeping the subject".
  • Posters / infographics / menus: lean on the text-rendering strength.
  • If submit returns success=false, surface the error/hint fields to the user.
  • If the loop exhausts (~600 s) and status is still running, tell the user the job can be re-polled later with the same job_id.
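The polling guidance above can be sketched as a driver around an injected fetch function; fetch_status is a hypothetical stand-in for one gpt_image2_result call:

```python
import time

def poll_job(fetch_status, budget_seconds=600):
    """Call fetch_status() back-to-back (the server already waits ~50 s
    per call, so no client-side sleep) until done/error or the budget
    is exhausted. Returns the final data dict, or None on exhaustion,
    in which case the user can re-poll later with the same job_id."""
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        data = fetch_status()            # e.g. one gpt_image2_result call
        if data["status"] == "done":
            return data                  # URLs in data["items"][*]["url"]
        if data["status"] == "error":
            raise RuntimeError(data.get("error", "generation failed"))
        # pending / running: loop again immediately
    return None
```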

Response shape

Already flattened by uno-cli:

{
  "success": true,
  "data": {"status": "...", "job_id": "...", "items": [{"url": "..."}]},
  "meta": {"latency_ms": 120, "credits_used": 300}
}

Read data.status, data.job_id, data.items[].url directly.

Errors:

{"success": false, "error": "...", "hint": "..."}

Version tags

latest → vk973a09s8tdq6117r902gmat8n85v7aj