Install
openclaw skills install chatgpt-image-gen
Generate images using ChatGPT's DALL-E integration through OpenClaw browser automation. Use when the user wants to create images via ChatGPT's web interface with their log...
Chrome Extension Installation:
Initial Setup (one-time):
This skill uses OpenClaw's built-in browser tool with Chrome extension relay (profile="chrome") to control an already-logged-in ChatGPT tab. This bypasses ChatGPT's bot detection because it uses your real browser session.
IMPORTANT: There is NO browser act subcommand. Each action is a direct subcommand.
| Action | CLI Syntax |
|---|---|
| List tabs | openclaw browser tabs |
| Snapshot | openclaw browser snapshot --target-id <ID> |
| Click | openclaw browser click <ref> --target-id <ID> |
| Type | openclaw browser type <ref> "<text>" --target-id <ID> |
| Press key | openclaw browser press <key> --target-id <ID> |
| Navigate | openclaw browser navigate <url> --target-id <ID> |
| Screenshot | openclaw browser screenshot --target-id <ID> |
- <ref> and <text> are positional arguments (no --ref flag)
- --target-id accepts a full ID or unique prefix (e.g. 77CB instead of 77CB8A574E8A44861C5FE49388EF6ABC)
- --profile is a parent option on openclaw browser, not on subcommands
openclaw browser tabs
Look for a tab with URL containing chatgpt.com. Note the targetId.
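Picking the targetId out by eye works, but the lookup can also be scripted. The helper below is a minimal sketch, not part of OpenClaw: `find_chatgpt_tab` is a hypothetical name, and it assumes that `openclaw browser tabs` prints each tab's 32-character targetId on the same line as its URL. If your output is laid out differently, adjust the patterns.

```shell
# Print the first 32-hex-character ID found on a line mentioning chatgpt.com.
# ASSUMPTION: the tabs listing shows the targetId and the URL on one line.
# Reads the tab listing from stdin so it can be tested without a browser.
find_chatgpt_tab() {
  grep -F 'chatgpt.com' | grep -oE '[0-9A-Fa-f]{32}' | head -n 1
}
```

A possible use, under the same assumption, is `TARGET_ID="$(openclaw browser tabs | find_chatgpt_tab)"`. An empty result means no ChatGPT tab was found or the output format did not match.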
openclaw browser snapshot --target-id <ID> --format ai --efficient
This outputs a tree with refs like e23, e589, etc. Always run snapshot before interacting.
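Because refs change between snapshots, it can help to look one up by its visible label instead of hard-coding it. The following is a rough sketch under stated assumptions: `ref_for_label` is a hypothetical helper, and it assumes that a label and its ref appear on the same snapshot line and that refs are an `e` followed by digits, as in the examples above.

```shell
# Print the first ref (e.g. e589) on the first line that contains LABEL.
# ASSUMPTION: label and ref share one snapshot line; refs look like e<digits>.
# Reads the snapshot text from stdin.
ref_for_label() {
  label="$1"
  grep -F -- "$label" | head -n 1 | grep -owE 'e[0-9]+' | head -n 1
}
```

For instance, piping a snapshot into `ref_for_label "Ask anything"` would print the ref of the prompt input if the snapshot matches these assumptions. Check the result against the snapshot before clicking.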
openclaw browser click e23 --target-id <ID>
openclaw browser type e589 "Generate an image: a futuristic city at sunset" --target-id <ID>
Add --submit to press Enter after typing:
openclaw browser type e589 "Generate an image: a cat riding a skateboard" --target-id <ID> --submit
openclaw browser press Enter --target-id <ID>
Use sleep to wait for DALL-E to generate (30-60 seconds):
sleep 45
Then take a new snapshot to check the result:
openclaw browser snapshot --target-id <ID> --format ai --efficient
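A fixed sleep can be too short or needlessly long. As an alternative, the sketch below polls the snapshot until some expected text shows up. `wait_for_snapshot_text` is a hypothetical helper, and the pattern you pass (for example "Download") is an assumption about what a finished generation looks like in your snapshot, not something OpenClaw guarantees.

```shell
# Re-snapshot every INTERVAL seconds until PATTERN appears or TIMEOUT elapses.
# Returns 0 when the pattern is found, 1 on timeout, 2 if the snapshot fails.
# ASSUMPTION: PATTERN is literal text you have seen in a finished snapshot.
wait_for_snapshot_text() {
  target_id="$1"; pattern="$2"; timeout="${3:-90}"; interval="${4:-5}"
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    snapshot="$(openclaw browser snapshot --target-id "$target_id" --format ai --efficient)" || return 2
    case "$snapshot" in
      *"$pattern"*) return 0 ;;   # quoted pattern is matched literally
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}
```

Example call: `wait_for_snapshot_text 4535E "Download" 90 5`. Keep the interval at 1 second or more so the loop always advances toward the timeout.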
# 1. List tabs, find the ChatGPT tab targetId
openclaw browser tabs
# 2. Take snapshot to find element refs
openclaw browser snapshot --target-id 4535E --format ai --efficient
# 3. Click input field (check ref from snapshot, usually labeled "Ask anything")
openclaw browser click e589 --target-id 4535E
# 4. Type prompt and submit
openclaw browser type e589 "Generate an image: a futuristic city at sunset" --target-id 4535E --submit
# 5. Wait for DALL-E generation
sleep 45
# 6. Take new snapshot to see result and find download button
openclaw browser snapshot --target-id 4535E --format ai --efficient
# 7. Click download button (ref from new snapshot)
openclaw browser click e745 --target-id 4535E
"Can't reach the OpenClaw browser control service":
openclaw gateway restart
"Chrome extension relay is running, but no tab is connected":
"ref is required" error:
- Run snapshot first to get the refs
Command not found / Unknown command:
- There is no browser act subcommand; use direct subcommands: browser click, browser type, browser press
- Refs are positional: browser click e23, NOT browser click --ref e23
Image generation timeout:
- Use sleep 45, then re-snapshot to check
Bot detection / Login issues:
- Run openclaw browser tabs first to find the targetId
- --submit: The type command supports --submit to press Enter automatically
- --target-id accepts a unique prefix, no need for the full 32-char ID

This approach uses your actual Chrome browser session, so it inherits all your ChatGPT permissions and settings. No credentials are stored or transmitted; everything happens in your existing browser session.