ComfyUI ImageGen (Flux2)

Generate images via ComfyUI API (localhost:8188) using Flux2 workflow. Supports structured JSON prompts sent directly as positive prompt parameter, seed/steps customization. Async watcher via sub-agent for low-latency, token-efficient polling (every 5s).

MIT-0 · Free to use, modify, and redistribute. No attribution required.

⭐ 0 · 1k · 0 current installs · 0 all-time installs

by@halr9000

MIT-0

Security Scan

VirusTotal

Suspicious

View report →

OpenClaw

Benign

high confidence

✓

Purpose & Capability

Name/description match the included script and workflow: the Python script posts a Flux2 workflow JSON to a ComfyUI host, polls history, and downloads the saved JPG. There are no unexpected required env vars, binaries, or installers that would be unrelated to image generation.

ℹ

Instruction Scope

SKILL.md stays largely within image-generation scope, but the examples include orchestration steps (sessions_spawn) that automatically send images via the agent's message tool (Telegram) and use a hardcoded example workspace path (C:\Users\hal\.openclaw\workspace). Those orchestration instructions reach into the agent's messaging/channel capabilities and the local filesystem; they are plausible but are external to core generation and deserve operator awareness.

✓

Install Mechanism

No install spec is provided (instruction-only with a small included script), so nothing is downloaded or executed at install time. This minimizes supply-chain risk.

ℹ

Credentials

The skill declares no environment variables or credentials, which matches its behavior (it talks to a ComfyUI HTTP host). However SKILL.md demonstrates auto-sending images to Telegram via the agent's message tool but does not declare Telegram credentials — this relies on the agent/runtime having messaging credentials configured. The example also references a user-specific workspace path; if followed, that grants the skill read/write in that folder.

✓

Persistence & Privilege

The skill does not request permanent/always-on inclusion (always:false) and does not modify other skills or global agent settings. It uses a spawned watcher sub-agent in examples, which is normal for async jobs but not a privileged persistent presence.

Assessment

This skill appears to do what it says: submit a Flux2 workflow to a ComfyUI HTTP server (default localhost:8188), poll for completion, and download the saved image. Before installing/using: 1) ensure you trust the ComfyUI host you point it at (default is localhost; if you change --host to a remote server, your structured prompts and any metadata will be sent to that server), 2) be aware the SKILL.md examples auto-send images via the agent's message tool (Telegram) — confirm the agent's messaging channels/targets are ones you trust, 3) the example uses a user-specific workspace path for spawned jobs; modify that to a safe directory on your system if you run the watcher, and 4) the script writes downloaded images to disk and removes them in the example — check file paths and permissions you grant the agent. If you need stricter guarantees, run ComfyUI locally and avoid using the example sessions_spawn Telegram send until you verify messaging credentials and targets.

Like a lobster shell, security has layers — review code before you run it.

Current versionv1.5.0

Download zip

latestvk978b4w17mqynppzbrqwryh1hd810vzb

License

MIT-0

Free to use, modify, and redistribute. No attribution required.

Termshttps://spdx.org/licenses/MIT-0.html

SKILL.md

ComfyUI ImageGen\n\n## Changelog\n- [2026-02-11 20:42 EST]: v1.5.0 (published) - Standalone `--structured-prompt` fix (no positional arg required); workflow updated to 1920x1080 16:9; production-ready with live tests (JSON direct-to-positive-prompt, async sub-agent delivery).\n- [2026-02-11 10:10 EST]: v1.4.0 - Refactored prompting: agent converts human prompt to structured JSON string; script sends JSON directly as ComfyUI positive prompt (no prose conversion). Added text-in-image quoting rule.

Changelog

[2026-02-11 10:10 EST]: v1.4.0 - Refactored prompting: agent converts human prompt to structured JSON string; script sends JSON directly as ComfyUI positive prompt (no prose conversion). Added text-in-image quoting rule.
[2026-02-11 09:52 EST]: v1.3.0 - Script now accepts --structured-prompt JSON directly (auto-generates prose internally). Removed agent-side prose step. Updated usage/examples.
[2026-02-11 09:20 EST]: v1.2.0 - Added structured prompt parsing: translate human requests into optional JSON schema fields, auto-generate optimized Flux.2 prose prompts. Updated description.
[2026-02-11 00:15 EST]: v1.1.0 - Added --submit-only (fast prompt_id return) + --watch prompt_id modes. SKILL.md docs async flow: submit → sessions_spawn watcher sub-agent (polls every 5s, auto-sends image to Telegram, ~10x token savings vs. main-agent block).
[2026-02-11 04:05 EST]: v1.0.3 - Added Flux.2 [klein] 9B prompting best practices section with guidelines from Black Forest Labs and fal.ai.
[2026-02-10 23:05]: v1.0.2 - Workflow v3: "Image Save with Prompt/Info (WLSH)" (node 85) for improved metadata embedding. Script updated to poll node 85.
[2026-02-10 23:00]: Added explicit instruction to always send generated image to user post-generation via message tool.
[2026-02-10]: Updated to new workflow: JPG output with embedded prompt/metadata via "Image Save with Prompt File (WLSH)". Model changed to darkBeastFeb0826Latest_dbkBlitzV15.safetensors. Script now polls node 84.

Usage

Agent converts human request to mandatory structured JSON schema → compact string.
Verify ComfyUI runs on localhost:8188.

Async Mode (Recommended):

1. exec python skills/comfyui-imagegen/scripts/generate.py --structured-prompt '{"subjects":[{"description":"fluffy tabby cat","position":"center","action":"sitting relaxed"}],"scene":"cozy room interior","lighting":"warm golden sunset rays","mood":"serene and peaceful","camera":{"angle":"low angle"}}' --submit-only → parse prompt_id
2. sessions_spawn task="Set-Location 'C:\\Users\\hal\\.openclaw\\workspace'; python skills/comfyui-imagegen/scripts/generate.py --watch '{prompt_id}' --output ./gen-{prompt_id}.jpg; message action=send channel=telegram target=595846104 media=./gen-{prompt_id}.jpg; Remove-Item ./gen-{prompt_id}.jpg" label="img-{prompt_id}" cleanup=delete runTimeoutSeconds=180

Watcher polls /history/{prompt_id} every 5s (optimal: <5s latency, ~12 polls max @60s job, isolated tokens).
Auto-sends JPG to this chat on completion (sub-agent pings back).
Timeout implicitly via spawn runTimeoutSeconds=120.

Sync Mode (blocks agent):

exec python skills/comfyui-imagegen/scripts/generate.py --structured-prompt '{"scene":"your scene"}' [--seed N] [--steps 10] [--output ./my.jpg] [--host localhost:8188]
message action=send channel=telegram media=./my.jpg

Customize:

Arg Default Notes
--seed random Repro
--steps 5 20-50 quality
--host localhost:8188 Remote
--output gen-{seed/pid}.jpg Full path

Arg	Default	Notes
`--seed`	random	Repro
`--steps`	5	20-50 quality
`--host`	localhost:8188	Remote
`--output`	gen-{seed/pid}.jpg	Full path

Structured Prompt Schema (Mandatory Format)

Agent step 1: Convert human natural language request into this exact JSON structure (all fields optional; populate only relevant; subjects array supports multiples).

Rule: For text in images (signs, logos), surround in double quotes within description/action fields, e.g., "sign reading \"STOP\"" or "logo with text \"OpenClaw\""

{
  "scene": "overall scene description",
  "subjects": [
    {
      "description": "detailed subject description",
      "position": "where in frame",
      "action": "what they're doing"
    }
  ],
  "style": "artistic style",
  "color_palette": ["#FF0000", "#00AACC"],
  "lighting": "lighting description",
  "mood": "emotional tone",
  "background": "background details",
  "composition": "framing and layout",
  "camera": {
    "angle": "camera angle",
    "lens": "lens type",
    "depth_of_field": "focus behavior"
  }
}

Agent step 2: Stringify JSON (compact, single-line for shell escaping), pass to script --structured-prompt (sent directly as ComfyUI positive prompt).

Example:

User: "A cat sitting on a windowsill at sunset"

Structured JSON string (for --structured-prompt):

'{"subjects":[{"description":"fluffy tabby cat","position":"center","action":"sitting relaxed"}],"scene":"cozy room interior","lighting":"warm golden sunset rays","mood":"serene and peaceful","camera":{"angle":"low angle"}}'

Workflow Details

Polls node 85 ("Image Save with Prompt/Info (WLSH)").
Model: darkBeastFeb0826Latest_dbkBlitzV15.safetensors
Template: workflows/flux2.json

Prompting Best Practices (Flux.2 [klein] 9B)

Prose, not keywords. Subject → Scene → Lighting → Mood.
E.g., "A serene mountain lake at dawn, mist rising, golden light piercing peaks, photorealistic."
Sources: BFL, fal.ai

Examples

# Async test (structured JSON string → direct positive prompt)
python .../generate.py --structured-prompt '{"subjects":[{"description":"fluffy tabby cat","position":"center","action":"sitting relaxed"}],"scene":"cozy room interior","lighting":"warm golden sunset rays","mood":"serene and peaceful","camera":{"angle":"low angle"}}' --submit-only --steps 10
# → prompt_id=abc123; spawn watcher sub-agent

For cron alternative (less optimal): cron add one-shot at=now+10s payload.systemEvent="Check img job {prompt_id}" but spawn > cron for this.

Files

3 total

Select a file

Select a file to preview.

Comments

Loading comments…