Install
openclaw skills install comfyui-image-videoGenerate images and videos via ComfyUI on local GPU. Supports Flux text-to-image, Wan2.1 text-to-video, and image-to-video.
openclaw skills install comfyui-image-videoUse to generate images (Flux schnell) and videos (Wan2.1 T2V/I2V) on the local RTX 5080 GPU.
~/ComfyUI (systemd user service: comfyui.service)~/comfyui-venvhttp://127.0.0.1:8188~/ComfyUI/output/{baseDir}/scripts/generate.py <mode> [options]
image — Text-to-Image (Flux schnell){baseDir}/scripts/generate.py image \
--prompt "A cat on the moon" \
--output /tmp/output.png
| Option | Default | Description |
|---|---|---|
--prompt | (required) | Text prompt |
--negative | "" | Negative prompt |
--width | 1024 | Image width |
--height | 1024 | Image height |
--steps | 4 | Sampling steps (schnell optimized) |
--seed | random | Reproducible seed |
--output | ComfyUI output dir | Copy output here |
--model | flux1-schnell.safetensors | UNET filename |
--weight-dtype | fp8_e4m3fn | Weight quantization |
--wait | 120 | Max wait seconds |
Recommended Flux schnell params: steps=4, cfg=1.0, sampler=euler, scheduler=simple
t2v — Text-to-Video (Wan2.1 T2V-1.3B){baseDir}/scripts/generate.py t2v \
--prompt "A red sports car driving on a mountain road at sunset" \
--length 49 \
--output /tmp/video_frames/
| Option | Default | Description |
|---|---|---|
--prompt | (required) | Text prompt |
--negative | "" | Negative prompt |
--width | 832 | Frame width |
--height | 480 | Frame height |
--length | 49 | Number of frames (≈3s at 16fps) |
--steps | 20 | Sampling steps |
--seed | random | Reproducible seed |
--output | ComfyUI output dir | Copy frames here |
--wait | 300 | Max wait seconds |
Recommended Wan2.1 T2V params: steps=20, cfg=5.0, sampler=uni_pc_bh2, scheduler=simple
i2v — Image-to-Video (Wan2.1 I2V using T2V-1.3B){baseDir}/scripts/generate.py i2v \
--prompt "gentle wave motion, water flowing" \
--image /path/to/input.png \
--output /tmp/video_frames/
| Option | Default | Description |
|---|---|---|
--prompt | (required) | Motion description |
--image | (required) | Path to input image |
--length | 49 | Number of frames |
--steps | 20 | Sampling steps |
--seed | random | Reproducible seed |
--output | ComfyUI output dir | Copy frames here |
--wait | 300 | Max wait seconds |
# Start (systemd user service)
systemctl --user start comfyui.service
# Check status
systemctl --user status comfyui.service
# Check API
curl -s http://127.0.0.1:8188/system_stats | python3 -m json.tool
# Manual start (if systemd not available)
cd ~/ComfyUI && LD_LIBRARY_PATH=~/comfyui-venv/lib/python3.12/site-packages/nvidia/cuda_runtime/lib:$LD_LIBRARY_PATH ~/comfyui-venv/bin/python main.py --listen 127.0.0.1 --port 8188
| File | Location | Size |
|---|---|---|
| flux1-schnell.safetensors | models/unet/ | 23.8GB |
| ae.safetensors | models/vae/ | 335MB |
| clip_l.safetensors | models/clip/ | 250MB |
| t5xxl_fp16.safetensors | models/clip/ | 9.8GB |
| File | Location | Size |
|---|---|---|
| wan2.1_t2v_1.3B_bf16.safetensors | models/diffusion_models/ | 5.3GB |
| wan2.1_vae.pth | models/vae/ | 485MB |
| umt5_xxl_fp8_e4m3fn_scaled.safetensors | models/text_encoders/ | 6.1GB |
| open_clip_xlm_roberta_large_vit_huge_14.pth | models/clip/ | 4.5GB (for I2V) |
curl http://127.0.0.1:8188/system_stats).systemctl --user start comfyui.service).generate.py with appropriate mode and options.xdg-open to view.imageio.libcudart.so not found: set LD_LIBRARY_PATH with nvidia/cuda_runtime/lib.--length for video.pkill -f "main.py" for stale processes.