Install
openclaw skills install @jerryxn/zm-img2-generation-direct
ZM IMG2 direct image generation. Runs text-to-image and reference-image generation through happy/gpt-image-2 and keeps the inputs, outputs, logs, and result JSON as formal evidence of visual production.
Generate real images through the configured Happy OpenAI-compatible image API. This skill is standardized for the provider/model pair happy/gpt-image-2 and has no local fake-image fallback.
Naming rule: use business-facing names first.
- text2img / 文生图: prompt-only image generation. Internally this routes to /images/generations.
- img2img / 参考图生图: prompt plus one or more reference/input images. Internally this routes to /images/edits.
Avoid using raw endpoint names (generations/edits) as the primary wording in task briefs or user-facing reports unless diagnosing the API layer.
Defaults:
- Provider: happy unless OPENCLAW_IMAGE_PROVIDER is set or --provider is explicitly supplied by a controlled caller.
- Model: gpt-image-2 unless OPENCLAW_IMAGE_MODEL is set or --model is explicitly supplied; the approved pair is happy/gpt-image-2.
- Size: 1024x1024.
- Timeout: 600000ms (10 minutes) per image request.
- Output directory: <openclaw-home>/generated-images/.
Acceptance-sensitive production tasks must stay on happy/gpt-image-2; do not switch to non-Happy providers or substitute local/mock images.
Routing:
- text2img / 文生图 routes to /images/generations (mode: generation).
- img2img / 参考图生图 routes to /images/edits (mode: edit).
A successful result.json must expose enough proof for review: provider, model, internal mode, and input_images (empty for text2img / 文生图, populated for img2img / 参考图生图).
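The routing rule can be sketched in a few lines of Python. This is an illustration only, not the skill's actual code: `route_request` and the `ENDPOINTS` table are hypothetical names, and the returned dictionary simply mirrors the mode and input_images proof fields described above.

```python
# Hypothetical sketch of the routing rule; names are illustrative, not the skill's API.
ENDPOINTS = {
    "generation": "/images/generations",  # text2img: prompt only
    "edit": "/images/edits",              # img2img: prompt plus reference images
}


def route_request(input_images):
    """Pick the internal mode and endpoint from the list of reference images."""
    images = list(input_images or [])
    mode = "edit" if images else "generation"
    return {"mode": mode, "endpoint": ENDPOINTS[mode], "input_images": images}


if __name__ == "__main__":
    print(route_request([]))
    print(route_request(["/absolute/path/reference.png"]))
```

The only input that decides the route in this sketch is whether any reference image is present, which is why an empty input_images list should always appear together with mode generation.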
python3 skills/zm-img2-generation-direct/scripts/run.py \
--prompt "A realistic photo of an orange cat sitting by a window, no text, no watermark" \
--task-name "cat-test" \
--no-send
Useful flags:
- --prompt: required.
- --task-name: output filename prefix and run directory prefix.
- --input-image / --image / --reference-image: optional reference image path for img2img / 参考图生图; repeatable.
- --images: optional JSON array or comma-separated reference image paths.
- --provider: provider key in OpenClaw config, default happy.
- --model: image model, default gpt-image-2 (approved as happy/gpt-image-2).
- --size: default 1024x1024.
- --timeout-ms: default 600000.
- --output-dir: default <openclaw-home>/generated-images.
- --max-attempts: default 3, maximum 5.
- --retry-base-delay, --retry-max-delay, --retry-jitter.
- --raw: marker for callers that intentionally keep the user prompt unchanged.
- --no-send: accepted for compatibility; this public skill always leaves delivery to the caller.
Example with a reference image / img2img 参考图生图:
python3 skills/zm-img2-generation-direct/scripts/run.py \
--prompt "Keep the cat and dog character identity, redraw them in a clean warm handbook scene, no readable text" \
--input-image /absolute/path/reference.png \
--task-name "catdog-i2i-test" \
--no-send
Example with multiple references:
python3 skills/zm-img2-generation-direct/scripts/run.py \
--prompt "Preserve the product shape and color palette from the references; create a clean studio image, no text" \
--reference-image /absolute/path/product.png \
--reference-image /absolute/path/style.png \
--images '["/absolute/path/material.png"]' \
--task-name "multi-ref-test" \
--no-send
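The command above mixes repeated --reference-image flags with an --images value. As a rough sketch of how a caller might merge the two forms before invoking run.py, the hypothetical helper below accepts a JSON array or a comma-separated string. The function name is invented for illustration, and applying the 5-image limit from the batch rules to single runs is an assumption.

```python
import json

MAX_REFERENCE_IMAGES = 5  # assumption: the batch limit of 5 also applies to single runs


def collect_reference_images(repeated_flags, images_arg=None):
    """Merge repeatable flag values with an --images value (JSON array or comma-separated)."""
    paths = list(repeated_flags)
    if images_arg:
        text = images_arg.strip()
        if text.startswith("["):
            # JSON array form, e.g. '["/absolute/path/material.png"]'
            paths.extend(str(item) for item in json.loads(text))
        else:
            # Comma-separated form, e.g. "/a.png, /b.png"
            paths.extend(part.strip() for part in text.split(",") if part.strip())
    if len(paths) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"too many reference images: {len(paths)} > {MAX_REFERENCE_IMAGES}")
    return paths


if __name__ == "__main__":
    print(collect_reference_images(
        ["/absolute/path/product.png", "/absolute/path/style.png"],
        '["/absolute/path/material.png"]',
    ))
```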
python3 skills/zm-img2-generation-direct/scripts/batch_run.py @batch.json
Example:
{
  "batch_name": "article-covers",
  "max_workers": 4,
  "timeout_ms": 600000,
  "send_to_feishu": false,
  "tasks": [
    {"task_name": "cover-1", "prompt": "Realistic shop counter photo, no readable text"},
    {"task_name": "cover-2", "prompt": "Realistic office desk photo, no readable text"},
    {"task_name": "character-redraw", "prompt": "Keep the character identity, redraw in a clean warm scene, no text", "input_image": "/absolute/path/reference.png"},
    {"task_name": "multi-ref", "prompt": "Use the character and outfit references, clean non-text scene", "images": ["/absolute/path/character.png", "/absolute/path/outfit.png"]}
  ]
}
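A batch file like this can also be produced programmatically. The sketch below is a hypothetical helper, not part of the skill: it builds the same top-level keys as the example and rejects oversized batches, assuming the 200-task figure in the batch rules is an upper bound and that every task needs a prompt.

```python
import json

MAX_BATCH_TASKS = 200  # assumption: the "200 tasks" batch rule is an upper bound


def build_batch(batch_name, tasks, max_workers=4, timeout_ms=600000):
    """Return a batch request dict shaped like the batch.json example."""
    if len(tasks) > MAX_BATCH_TASKS:
        raise ValueError(f"too many tasks: {len(tasks)} > {MAX_BATCH_TASKS}")
    for index, task in enumerate(tasks):
        if not task.get("prompt"):
            raise ValueError(f"task {index} has no prompt")
    return {
        "batch_name": batch_name,
        "max_workers": max_workers,
        "timeout_ms": timeout_ms,
        "send_to_feishu": False,
        "tasks": list(tasks),
    }


if __name__ == "__main__":
    batch = build_batch("article-covers", [
        {"task_name": "cover-1", "prompt": "Realistic shop counter photo, no readable text"},
    ])
    print(json.dumps(batch, ensure_ascii=False, indent=2))
```

Writing the printed JSON to batch.json gives a file that can be passed to batch_run.py as @batch.json.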
Batch rules:
- max_workers: 4.
- Task limit: 200 tasks.
- Each task gets state.json, stdout/stderr logs, and result.json.
- batch_result.json records success/failure per task.
- Reference images per task: input_image, image, reference_image, or images (list or comma-separated string); max 5 total.
Queue wrapper (image_queue.py)
image_queue.py is the minimal production-safe queue wrapper for happy/gpt-image-2. It does not replace run.py; it runs the same direct generator with bounded concurrency and observable task state.
Defaults:
- max_workers: 4 (hard-capped at 4)
- task_timeout_seconds: 600; on timeout the queue sends terminate first, then kill if the process ignores terminate
- max_queue_size: 100; once the number of accepted tasks reaches max_workers + max_queue_size, extra tasks are marked rejected
- State directory: <openclaw-home>/image-queue or --state-dir
- Output directory: <openclaw-home>/generated-images or --output-dir
Single task:
python3 skills/zm-img2-generation-direct/scripts/image_queue.py run \
--prompt "Clean product image, no readable text" \
--task-name product-001 \
--task-key product-001 \
--no-send
JSON batch:
python3 skills/zm-img2-generation-direct/scripts/image_queue.py run @tasks.json \
--max-workers 4 \
--task-timeout-seconds 600 \
--max-queue-size 100
tasks.json:
{
  "tasks": [
    {"task_name": "cover-1", "task_key": "cover-1", "prompt": "Realistic shop counter photo, no readable text"},
    {"task_name": "cover-2", "task_key": "cover-2", "prompt": "Clean office desk photo, no readable text"}
  ]
}
For tests only, use mock_sleep / mock_command; do not use these as image proof.
python3 skills/zm-img2-generation-direct/scripts/image_queue.py status --state-dir <openclaw-home>/image-queue
python3 skills/zm-img2-generation-direct/scripts/image_queue.py list --state-dir <openclaw-home>/image-queue
python3 skills/zm-img2-generation-direct/scripts/image_queue.py list --state-dir <openclaw-home>/image-queue --status completed
python3 skills/zm-img2-generation-direct/scripts/image_queue.py get --state-dir <openclaw-home>/image-queue <task_id_or_task_key>
python3 skills/zm-img2-generation-direct/scripts/image_queue.py history --state-dir <openclaw-home>/image-queue -n 50
get shows task metadata plus stdout/stderr tails and artifact paths.
Every accepted or rejected task gets its own directory under:
<state-dir>/tasks/<task_id>/
Important files:
- task.json: latest task metadata
- command.json: command used to invoke run.py or mock command
- stdout.txt / stderr.txt: process output
- result.json: final task result
The queue records task_id, task_key, worker_id, thread_id, and child pid in queue_state.json, task.json, and final task rows. Completed tasks verify the worker binding before accepting the result; late/orphan output is marked non-ok.
Task statuses:
- completed: child process exited successfully and returned/printed an ok result
- failed: child process exited non-zero or returned ok: false
- timed_out: exceeded task_timeout_seconds; queue sent terminate, then kill if needed
- rejected: invalid task or queue full
- skipped: duplicate task_key already active in the same submission/run
- orphan_late_output: worker binding mismatch / late result; not acceptable
- cancelled: reserved for queued-task cancellation in persisted state
summary.ok is true only when there are no failed, timed_out, rejected, skipped, orphan_late_output, orphaned, cancelled, or stuck tasks.
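The summary.ok rule can be illustrated with a small sketch. The function name and the shape of the task rows (dictionaries with a "status" key) are assumptions made for this example. The sketch also treats the stated condition as sufficient, although the text only says it is necessary.

```python
from collections import Counter

# Statuses that force summary.ok to false, as listed in the rule above.
NON_OK_STATUSES = {
    "failed", "timed_out", "rejected", "skipped",
    "orphan_late_output", "orphaned", "cancelled", "stuck",
}


def summarize(task_rows):
    """Count statuses and derive an overall ok flag (hypothetical helper)."""
    counts = Counter(row["status"] for row in task_rows)
    ok = not any(counts[status] for status in NON_OK_STATUSES)
    return {"ok": ok, "counts": dict(counts)}


if __name__ == "__main__":
    rows = [{"status": "completed"}, {"status": "skipped"}]
    print(summarize(rows))
```

A single skipped duplicate is enough to flip the flag, which matches the note that a duplicate active task_key makes summary.ok=false.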
Within a run, task_key is unique among active queued/running tasks. A duplicate active key is skipped and makes summary.ok=false.
Capacity is max_workers + max_queue_size. With defaults, up to 104 tasks can be accepted at once (4 running, 100 waiting). Additional tasks are explicitly rejected; there is no silent discard.
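The capacity arithmetic can be checked with a short sketch. `admit` is a hypothetical function, not the queue's real interface, and it ignores duplicate task_key handling: it only shows how a submission splits into running, waiting, and rejected tasks under the stated limits.

```python
MAX_WORKERS_HARD_CAP = 4  # max_workers is hard-capped at 4


def admit(tasks, max_workers=4, max_queue_size=100):
    """Split a submission into running, queued, and rejected tasks (illustrative only)."""
    workers = min(max_workers, MAX_WORKERS_HARD_CAP)
    capacity = workers + max_queue_size
    accepted, rejected = tasks[:capacity], tasks[capacity:]
    return {
        "running": accepted[:workers],
        "queued": accepted[workers:],
        "rejected": rejected,  # explicitly rejected, never silently dropped
    }


if __name__ == "__main__":
    result = admit([f"task-{i}" for i in range(110)])
    print({name: len(items) for name, items in result.items()})
```

With the defaults and 110 submitted tasks, this sketch places 4 in running, 100 in queued, and 6 in rejected.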
This is a minimal non-daemon runner. It is intended for one foreground bounded run at a time.
python3 skills/zm-img2-generation-direct/scripts/image_queue.py cancel --state-dir <openclaw-home>/image-queue <task_id_or_task_key>
Currently, cancel can mark a persisted queued task as cancelled. Running tasks are supervised inside the active run process and are terminated automatically on timeout; safe out-of-process cancellation of running tasks is intentionally not implemented in this minimal version.
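The terminate-then-kill timeout handling mentioned for running tasks follows a common subprocess pattern, sketched below with Python's standard library. This is not the queue's actual supervisor: the function name, the grace period, and the returned dictionary are assumptions, and the sketch judges success by exit code only, whereas the queue also requires an ok result.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds, grace_seconds=5.0):
    """Run a child process; on timeout send terminate, then kill if it does not exit."""
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    try:
        stdout, stderr = proc.communicate(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()  # polite request first
        try:
            stdout, stderr = proc.communicate(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # the process ignored terminate
            stdout, stderr = proc.communicate()
        return {"status": "timed_out", "returncode": proc.returncode,
                "stdout": stdout, "stderr": stderr}
    # Simplification: exit code only; the real queue also checks for an ok result.
    status = "completed" if proc.returncode == 0 else "failed"
    return {"status": status, "returncode": proc.returncode,
            "stdout": stdout, "stderr": stderr}


if __name__ == "__main__":
    print(run_with_timeout([sys.executable, "-c", "print('ok')"], timeout_seconds=30)["status"])
```

Calling communicate() again after the timeout collects whatever output the child produced before it was stopped, so the stdout/stderr tails remain available for review.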
Deferred by design to keep the tool small and safe:
- [...] stdout.txt / stderr.txt
For production use, submit bounded batches, keep task_key stable, watch status/list/get, and treat any non-completed status as requiring review before accepting images.
For each single-image run:
- Output image (.png by default).
- Run directory: <output-dir>/_runs/<task-name>-<timestamp>/.
- state.json with status, attempt, elapsed time, output, and redacted last error.
- request.json, result.json, stdout.txt, and stderr.txt.
For each batch run:
- Batch artifacts are written under content-factory/live-course-design/img2/batches/.
- batch_request.json and final batch_result.json.
- Per task: batch_task.json, state.json, stdout/stderr logs, and result.json.
Report recommendation for acceptance reviews: include the image path, run/batch directory, result.json or batch_result.json, and the visible proof fields provider, model, mode, input_images, ok, and bytes.
A result is acceptable only when:
- ok: true is present.
- provider proves Happy usage (happy).
- model proves gpt-image-2 under the Happy provider (happy/gpt-image-2 as provider/model pair).
- mode matches routing: generation for text2img / 文生图 with no reference images, edit for img2img / 参考图生图 with reference images.
- input_images is present and accurate.
A result is not acceptable if it:
For adult anthropomorphic, sexy, glamour, or similar requests, keep outputs non-explicit and non-pornographic.
If the user prompt is ambiguous, strengthen the prompt with safe constraints rather than producing explicit content.
Retries are limited and only used for retryable failures:
- HTTP status 408/429/500/502/503/504.
Non-retryable errors, such as invalid requests or auth failures, fail fast with redacted diagnostics.
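A retry policy of this kind is often implemented as capped exponential backoff with jitter. The sketch below shows one such scheme. The exact formula, the default values, and the units used by --retry-base-delay, --retry-max-delay, and --retry-jitter are not documented here, so everything except the status code list is an assumption.

```python
import random

# Status codes listed above as retryable.
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}


def is_retryable(status_code):
    """Return True when a failed request is worth another attempt."""
    return status_code in RETRYABLE_STATUS


def backoff_delay(attempt, base_delay=1.0, max_delay=30.0, jitter=0.25, rng=random):
    """Delay before retry number `attempt` (1-based): exponential, capped, plus jitter.

    All defaults here are illustrative; the skill's real values may differ.
    """
    delay = min(max_delay, base_delay * (2 ** (attempt - 1)))
    return delay + rng.uniform(0, jitter * delay)


if __name__ == "__main__":
    for attempt in range(1, 4):  # --max-attempts defaults to 3
        print(attempt, round(backoff_delay(attempt), 2))
```

Auth failures such as 401 fall outside the set, so in this sketch they would be reported immediately instead of being retried.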
Do not run bulk generation for documentation checks. Use lightweight commands:
python3 skills/zm-img2-generation-direct/scripts/run.py --help
python3 skills/zm-img2-generation-direct/scripts/batch_run.py --help
python3 -m py_compile skills/zm-img2-generation-direct/scripts/run.py skills/zm-img2-generation-direct/scripts/batch_run.py
node --check skills/zm-img2-generation-direct/scripts/generate-image.js
For actual acceptance, inspect the produced result.json and verify ok, provider, model, mode, and input_images.
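The acceptance check can be partly automated. The sketch below is a hypothetical reviewer helper, not shipped with the skill. It assumes the proof fields are top-level keys of result.json and that model may be recorded either as gpt-image-2 or as happy/gpt-image-2; adjust it if the real file is shaped differently.

```python
import json

ACCEPTED_MODELS = {"gpt-image-2", "happy/gpt-image-2"}  # assumption: either spelling may appear


def check_result(result):
    """Return a list of problems; an empty list means the proof fields look acceptable."""
    problems = []
    if result.get("ok") is not True:
        problems.append("ok is not true")
    if result.get("provider") != "happy":
        problems.append("provider is not happy")
    if result.get("model") not in ACCEPTED_MODELS:
        problems.append("model is not gpt-image-2")
    images = result.get("input_images")
    if not isinstance(images, list):
        problems.append("input_images is missing or not a list")
    else:
        # generation for text2img (no references), edit for img2img (with references)
        expected_mode = "edit" if images else "generation"
        if result.get("mode") != expected_mode:
            problems.append(f"mode should be {expected_mode}")
    return problems


if __name__ == "__main__":
    sample = json.loads(
        '{"ok": true, "provider": "happy", "model": "gpt-image-2",'
        ' "mode": "generation", "input_images": []}'
    )
    print(check_result(sample))
```

A script like this only confirms that the fields are consistent with each other; the image itself and the run directory still need a human review.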
This skill intentionally contains no private OpenClaw IDs, no hard-coded user paths, no API keys, and no channel recipient IDs. It reads provider configuration from the local OpenClaw config at runtime.