Wuli Skill

v1.0.7

Generate AI images and videos with 17+ active models via the Wuli.art open platform API.

by zihao chu (sir1st-inc)

Install


Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for sir1st-inc/wuli.

Prompt preview (Install & Setup):
Install the skill "Wuli Skill" (sir1st-inc/wuli) from ClawHub.
Skill page: https://clawhub.ai/sir1st-inc/wuli
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: WULI_API_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.


CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install wuli

ClawHub CLI


npx clawhub@latest install wuli
Security Scan
Capability signals
Requires OAuth token · Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the declared requirements: only a WULI_API_TOKEN is required and python3 is listed as the runtime. The token is the expected credential for calling the Wuli.art platform API.
Instruction Scope
SKILL.md and skill.py instruct the agent to upload local files and to download remote media and re-upload it to Wuli OSS (using pre-signed URLs returned by the platform). This is expected for a media-generation skill, but means local files and any remote URLs you supply will be transmitted to the third-party service.
Install Mechanism
No install spec; the skill is instruction + a single Python script and uses only standard library modules. No external downloads or suspicious installers are present.
Credentials
Only the WULI_API_TOKEN is required (declared as primaryEnv). No unrelated secrets, config paths, or extra environment variables are requested.
Persistence & Privilege
The "always" flag is false and the skill does not request permanent platform-level privileges. It does open downloaded result files with the OS viewer (subprocess / os.startfile / xdg-open), which is normal for a CLI utility but worth noting.
Assessment
This skill appears to do what it says: it calls Wuli.art APIs, uploads any local files you give it, and may download remote URLs you provide before re-uploading them to Wuli's OSS. Before installing or using it:

  • Only set a WULI_API_TOKEN obtained from the official wuli.art site.
  • Do not upload sensitive or private media; uploads go to the third party and API calls consume credits.
  • Avoid passing untrusted remote URLs, since the skill will fetch them.
  • Review the included skill.py if you want to confirm behavior, and run the skill in an isolated environment if you are concerned about handling unknown media or automatically opening downloaded files.


Runtime requirements

🎨 Clawdis

  • OS: Linux · macOS · Windows
  • Binary: python3 (any)
  • Env: WULI_API_TOKEN (primary)
  • Latest version hash: vk97e4h3tnn5x1zn9r5gpra3xxx85ngwy
  • Stats: 488 downloads · 1 star · 8 versions · updated 19h ago
  • Release: v1.0.7 · License: MIT-0 · Platforms: Linux, macOS, Windows

Wuli Platform — AI Image & Video Generation

Generate AI images and videos via the Wuli.art open platform API. Supports text-to-image, multi-reference image editing, text-to-video, first-frame image-to-video, first-last-frame video, auto-video, and sound control on supported video models. Covers 17+ active models, including Qwen Image, Seedream, 通义万相 (Wan), 可灵 (Kling), Seedance, and MiniMax Hailuo, with automatic uploads and no-watermark downloads when available.

When to Use

  • User wants to generate images from a text description
  • User wants to edit or transform one or more existing images with AI
  • User wants to create video from a text prompt
  • User wants to animate a static image into a video
  • User wants to generate a video from a first frame and last frame
  • User wants to generate a video from reference images and/or a reference video
  • User needs high-resolution (up to 4K) AI artwork
  • User wants batch image generation
  • User needs to choose between multiple AI models for best results

Setup

export WULI_API_TOKEN="wuli-your-token-here"

Get your token: log in to wuli.art and click the API entry in the bottom-left corner.

No additional dependencies — uses only Python 3 standard library.
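
For illustration, the token is read from the environment and sent as Authorization: Bearer <token> (see Troubleshooting). Below is a minimal standard-library sketch of that pattern; the endpoint URL is a placeholder, not the real API path.

# Minimal stdlib-only sketch of the auth pattern; the URL passed in is a
# placeholder, not the real Wuli endpoint.
import json
import os
import urllib.request

token = os.environ.get("WULI_API_TOKEN")
if not token:
    raise SystemExit('WULI_API_TOKEN not set; run: export WULI_API_TOKEN="wuli-your-token"')

def api_post(url, payload):
    # POST JSON with the documented Bearer auth header
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # documented auth scheme
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)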

Quick Start

# Generate an image (simplest usage)
python3 skill.py --action image-gen --prompt "a serene mountain lake at sunrise"

# Generate a video
python3 skill.py --action txt2video --prompt "waves crashing on a golden beach at sunset"

# First-last-frame video
python3 skill.py --action flf2video --prompt "camera pushes through the scene"   --image_path ./start.jpg --end_image_path ./end.jpg

# Auto-video from a reference video
python3 skill.py --action auto-video --prompt "preserve the motion, make it cinematic"   --video_path ./input.mp4 --model "通义万相 2.6"

Complete Command Reference

python3 skill.py --action <action> --prompt "description" [options]

Actions

Action        Description                                       Reference Inputs
image-gen     Text to image                                     None
image-edit    Edit or transform one or more reference images    --image_url or --image_path
txt2video     Text to video                                     None
image2video   Animate one image into a video                    Exactly one --image_*
flf2video     First-last-frame video generation                 Start frame via --image_*, end frame via --end_image_*
auto-video    Auto-video from reference images and/or videos    --image_*, --video_*, or both

Parameters

Parameter           Default                      Description
--prompt            (required)                   Generation prompt, max 2000 chars
--model             auto-selected                Model name (see Model Selection Guide)
--aspect_ratio      1:1 (image) / 16:9 (video)   Aspect ratio
--resolution        2K (image) / 720P (video)    Output resolution. Model-dependent values include 2K, 3K, 4K, 480P, 720P, 768P, 1080P
--n                 1                            Number of images to generate (1-4, image-gen only)
--image_url         (none)                       Reference image URL(s); supports comma-separated multiple URLs
--image_path        (none)                       Local reference image path(s); supports comma-separated multiple files
--end_image_url     (none)                       End-frame image URL for flf2video
--end_image_path    (none)                       End-frame local image path for flf2video
--video_url         (none)                       Reference video URL(s) for auto-video, comma-separated
--video_path        (none)                       Local reference video path(s) for auto-video, comma-separated
--duration          5                            Video length in seconds
--negative_prompt   (none)                       Exclude unwanted elements
--sound             backend default              Enable sound for supported video models; if omitted, backend default behavior is used
--no-sound          (none)                       Disable sound for supported video models
--optimize          true                         Prompt optimization is enabled by default and recommended for most prompts
--no-optimize       (none)                       Disable prompt optimization when you need fully raw prompt behavior
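
For illustration, the flag semantics above (comma-separated multi-value inputs, the tri-state --sound / --no-sound pair, and --optimize defaulting to on) could be modeled with argparse as in the sketch below; skill.py's actual parser may differ.

# Illustrative argparse model of the documented flag semantics (Python 3.9+).
import argparse

def comma_list(value):
    # "--image_path ./a.jpg,./b.jpg" -> ["./a.jpg", "./b.jpg"]
    return [item.strip() for item in value.split(",") if item.strip()]

parser = argparse.ArgumentParser()
parser.add_argument("--prompt", required=True)
parser.add_argument("--image_path", type=comma_list, default=[])
parser.add_argument("--video_path", type=comma_list, default=[])
# Tri-state: omitted -> None (backend default), --sound -> True, --no-sound -> False
parser.add_argument("--sound", action=argparse.BooleanOptionalAction, default=None)
# On by default; --no-optimize turns it off
parser.add_argument("--optimize", action=argparse.BooleanOptionalAction, default=True)

args = parser.parse_args(["--prompt", "a cat", "--image_path", "./a.jpg,./b.jpg", "--no-sound"])
assert args.image_path == ["./a.jpg", "./b.jpg"]
assert args.sound is False and args.optimize is True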

Input Rules

  • image-edit accepts one or more reference images
  • image2video requires exactly one reference image
  • flf2video requires exactly two images: start frame + end frame
  • auto-video accepts reference images, reference videos, or both
  • --sound / --no-sound only affect video tasks, and only take effect on models that support sound control
  • Local files and remote URLs are auto-uploaded to Wuli OSS before submission
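
These rules can be pre-checked locally before a task is submitted. A hedged sketch (not skill.py's actual code), with error strings mirroring the Troubleshooting section:

# Illustrative validation of the per-action reference-input rules above.
def validate_refs(action, images, end_images, videos):
    if action == "image2video" and len(images) != 1:
        raise ValueError("image2video requires exactly one reference image")
    if action == "flf2video" and (len(images) + len(end_images)) != 2:
        raise ValueError("flf2video requires exactly two reference images")
    if action == "auto-video" and not (images or videos):
        raise ValueError("auto-video requires at least one reference image or video")
    if action == "image-edit" and not images:
        raise ValueError("image-edit requires at least one reference image")

validate_refs("flf2video", ["./start.jpg"], ["./end.jpg"], [])  # ok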

Aspect Ratios

Ratio   Use Case
1:1     Square — social media posts, avatars, icons
16:9    Widescreen — videos, desktop wallpapers, presentations
9:16    Vertical — phone wallpapers, stories, reels
4:3     Classic — photos, prints
3:2     Photography — DSLR-style landscape shots
21:9    Ultra-wide — cinematic banners (image only on supported models)

Model Selection Guide

Image Models — Which to Choose

Model                      Best For                                            Resolution   Ref Images   Cost
通义万相 2.7               Newest Wan image model, up to 9 ref images          2K, 4K       9            1 credit (2K) / 3 credits (4K)
Qwen Image 2.0 (default)   General purpose, fast, versatile                    2K, 4K       4            1 credit
Qwen Image Turbo           Quick drafts, iterations                            2K, 4K       4            1 credit
Seedream 5.0 Lite          Higher detail at lower cost than premium Seedream   2K, 3K       8            4 credits
Seedream 4.5               Photorealism, high-fidelity detail                  2K, 4K       8            4 credits
Seedream 4.0               Photorealism with broad API-side support            2K, 4K       8            4 credits

Recommendations:

  • Fastest & cheapest: Qwen Image Turbo
  • Best all-rounder: Qwen Image 2.0
  • Best detail at mid tier: Seedream 5.0 Lite
  • Best quality for photos: Seedream 4.5
  • Most ref images (up to 9): 通义万相 2.7
  • Need 4K: 通义万相 2.7, Qwen Image 2.0, Qwen Image Turbo, or Seedream 4.5

Video Models — Which to Choose

Model                          Best For                                           Resolution   Duration             Key Modes
通义万相 2.2 Turbo (default)   Quick videos, low cost                             720P         5s                   TXT, FF
通义万相 2.7                   Newest Wan video model with stronger narrative     720P-1080P   5-15s (AUTO 5-10s)   TXT, FF, FLF, AUTO
通义万相 2.6 Flash             Fast image-to-video                                720P-1080P   5-15s                FF
通义万相 2.6                   Best all-rounder with auto-video                   720P-1080P   5-15s                TXT, FF, AUTO
Happy Horse 1.0                Long-duration TXT/FF/AUTO video, premium quality   720P-1080P   5-15s                TXT, FF, AUTO
可灵 3.0 Omni                  Richest multi-input video workflow                 720P-1080P   5-15s                TXT, FF, FLF, AUTO
可灵 O1                        Premium omni video quality                         720P-1080P   5-10s                TXT, FF, FLF, AUTO
可灵 3.0                       High-quality first-last-frame video                720P-1080P   5-15s                TXT, FF, FLF
可灵 2.6                       1080P-focused Kling generation                     1080P        5-10s                TXT, FF, FLF
可灵 2.5 Turbo                 Lower-cost Kling first-last-frame workflows        1080P        5-10s                TXT, FF, FLF
Seedance 1.5 Pro               Motion-heavy videos, up to 12s                     480P-720P    5-12s                TXT, FF, FLF
Seedance 1.0 Pro               Broad resolution coverage                          480P-1080P   5-10s                TXT, FF, FLF
MiniMax Hailuo 2.3             Cinematic text/video generation                    768P-1080P   6-10s                TXT, FF
MiniMax Hailuo 2.3 Fast        Faster image-to-video                              768P-1080P   6-10s                FF

Modes: TXT = text-to-video, FF = first-frame image-to-video, FLF = first-last-frame, AUTO = auto-video.

Recommendations:

  • Fastest & cheapest: 通义万相 2.2 Turbo
  • Best all-rounder: 通义万相 2.7 or 通义万相 2.6
  • Best multi-input / auto-video: 可灵 3.0 Omni or 通义万相 2.7
  • Best premium quality: Happy Horse 1.0, 可灵 O1, or 可灵 3.0
  • Best for first-last-frame: 通义万相 2.7, 可灵 3.0, 可灵 2.6, or Seedance 1.5 Pro
  • Best for 1080P-only workflows: 可灵 2.6 or 可灵 2.5 Turbo

Examples

Text to Image

# Simple generation
python3 skill.py --action image-gen --prompt "anime girl with blue hair in a garden"

# Batch generate 4 images to pick the best
python3 skill.py --action image-gen --prompt "cyberpunk cityscape at night" --n 4

# Photorealistic with premium model
python3 skill.py --action image-gen --prompt "photorealistic mountain landscape, golden hour"   --model "Seedream 4.5" --resolution 4K --aspect_ratio 16:9

# Higher-detail model
python3 skill.py --action image-gen --prompt "luxury product photography, studio lighting"   --model "Seedream 5.0 Lite" --resolution 3K

# Prompt optimization is enabled by default
python3 skill.py --action image-gen --prompt "a cat"

# Disable optimization only when you need exact raw prompting
python3 skill.py --action image-gen --prompt "a cat" --no-optimize

Image Editing

# Edit a local image
python3 skill.py --action image-edit --prompt "add sunglasses and a hat"   --image_path ./photo.jpg

# Edit with multiple reference images
python3 skill.py --action image-edit --prompt "merge these references into one unified brand illustration"   --image_path "./ref1.jpg,./ref2.jpg"

# Edit a remote image
python3 skill.py --action image-edit --prompt "change background to sunset beach"   --image_url "https://example.com/photo.jpg"

# Style transfer
python3 skill.py --action image-edit --prompt "convert to oil painting style"   --image_path ./landscape.jpg --model "Seedream 4.5"

Text to Video

# Simple video
python3 skill.py --action txt2video --prompt "waves crashing on a golden beach at sunset"

# Longer duration with higher quality
python3 skill.py --action txt2video --prompt "a cat playing piano in a jazz club"   --model "通义万相 2.6" --duration 10 --resolution 1080P

# Explicitly enable sound on a supported model
python3 skill.py --action txt2video --prompt "a singer performing on a neon stage"   --model "通义万相 2.6" --duration 10 --sound

# Cinematic quality
python3 skill.py --action txt2video --prompt "slow-motion rain drops on a window"   --model "可灵 O1" --resolution 1080P --aspect_ratio 16:9

# Disable sound when you want silent output or a lower-credit tier on some models
python3 skill.py --action txt2video --prompt "slow aerial shot over a mountain lake"   --model "可灵 3.0 Omni" --duration 10 --no-sound

Image to Video (Animate)

# Animate a landscape photo
python3 skill.py --action image2video --prompt "slow zoom in with gentle wind blowing"   --image_path ./landscape.jpg --duration 5

# Animate from URL with premium model
python3 skill.py --action image2video --prompt "character turns head and smiles"   --image_url "https://example.com/portrait.jpg" --model "可灵 O1"

First-Last-Frame Video

# Use start and end frame to control motion
python3 skill.py --action flf2video --prompt "a smooth cinematic transition between these two shots"   --image_path ./start.jpg --end_image_path ./end.jpg --model "可灵 3.0"

# Mix local and remote references
python3 skill.py --action flf2video --prompt "the flower blooms naturally from frame one to frame two"   --image_path ./flower_closed.jpg   --end_image_url "https://example.com/flower_open.jpg"

Auto-Video

# Auto-video from a reference video
python3 skill.py --action auto-video --prompt "keep the action, upgrade it to a cinematic sci-fi look"   --video_path ./input.mp4 --model "通义万相 2.6" --duration 10 --sound

# Auto-video from multiple reference images
python3 skill.py --action auto-video --prompt "turn these keyframes into a coherent product reveal video"   --image_path "./frame1.jpg,./frame2.jpg,./frame3.jpg"   --model "可灵 3.0 Omni" --resolution 1080P

# Auto-video with both images and video
python3 skill.py --action auto-video --prompt "preserve the motion, but restyle everything as an anime sequence"   --image_url "https://example.com/style_ref.jpg"   --video_url "https://example.com/source.mp4"   --model "可灵 O1"

Workflow

1. (Optional) Upload reference images/videos
   --image_path ./photo.jpg             → auto-uploaded to cloud storage
   --image_url https://...              → auto-downloaded and re-uploaded
   --video_path ./clip.mp4              → auto-uploaded to cloud storage
   --image_path "./a.jpg,./b.jpg"       → uploads multiple image refs

2. Submit generation task
   → Returns recordId for tracking

3. Auto-poll for completion
   → Images: polls every 5s (up to 5 min)
   → Videos: polls every 10s (up to 20 min)

4. Auto-download results
   → Fetches no-watermark version when available
   → Saves to current directory
   → Auto-opens on macOS/Linux/Windows
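
For illustration, steps 3 and 4 might look like the sketch below. fetch_status is a hypothetical stand-in for the real status-API call; only the intervals, timeouts, and auto-open behavior are taken from this document.

# Sketch of the documented polling loop and cross-platform auto-open.
import os
import subprocess
import sys
import time

POLL = {"image": (5, 300), "video": (10, 1200)}  # (interval s, max wait s)

def wait_for_result(record_id, kind, fetch_status):
    interval, timeout = POLL[kind]
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(record_id)  # hypothetical helper
        if status.get("state") == "SUCCESS":
            return status  # carries the result URL (no-watermark when available)
        if status.get("state") == "REVIEW_FAILED":
            raise RuntimeError("content moderation rejected the prompt")
        time.sleep(interval)
    raise TimeoutError("try a faster model or a shorter duration")

def open_with_viewer(path):
    # Auto-open with the OS viewer, as noted in the security scan
    if sys.platform == "win32":
        os.startfile(path)
    elif sys.platform == "darwin":
        subprocess.run(["open", path])
    else:
        subprocess.run(["xdg-open", path])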

Troubleshooting

  • "WULI_API_TOKEN not set": Run export WULI_API_TOKEN="wuli-your-token". Get your token from wuli.art bottom-left corner → API. The token is sent as Authorization: Bearer <token>.
  • "HTTP 401": Token is invalid or expired. Regenerate it on the wuli.art platform.
  • "HTTP 429": Rate limited. Wait a few seconds and retry.
  • "Error code 2001": Insufficient credits. Top up at wuli.art.
  • "REVIEW_FAILED": Content moderation rejected the prompt. Rephrase to avoid sensitive content.
  • "TIMEOUT": Generation took too long. Try a faster model or shorter duration.
  • "image2video requires exactly one reference image": Use only one --image_* input for image2video.
  • "flf2video requires exactly two reference images": Pass a start frame with --image_* and an end frame with --end_image_*, or provide exactly two image references total.
  • "auto-video requires at least one reference image or video": Provide --image_*, --video_*, or both.
  • Media upload fails: Supported image formats are jpg, jpeg, png, webp. Supported video formats are mp4, mov, avi, webm.
  • Sound flag seems ignored: --sound / --no-sound only affect video models that expose sound control. Some models or modes may keep backend default behavior.
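
As an illustrative aid (not part of skill.py), a local pre-flight check against the supported formats above can catch a bad file before it costs an upload and API credits:

# Illustrative format check using the extensions listed above.
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".webm"}

def check_media(path, kind):
    allowed = IMAGE_EXTS if kind == "image" else VIDEO_EXTS
    ext = Path(path).suffix.lower()
    if ext not in allowed:
        raise ValueError(f"{path}: unsupported {kind} format {ext or '(none)'}")

check_media("./photo.jpg", "image")  # passes silently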

Tips

  • Prompt optimization is enabled by default and is recommended for most prompts, especially short or vague ones.
  • Use --no-optimize only when you need the model to follow your raw prompt as directly as possible.
  • Start with the default models (Qwen Image 2.0 / 通义万相 2.2 Turbo) when you want the cheapest and simplest path.
  • Generate multiple images with --n 4 and pick the best.
  • Use --negative_prompt to exclude unwanted elements, e.g. --negative_prompt "blurry, low quality, watermark".
  • For video, start with 5-second duration to preview, then re-generate at longer duration once you're happy with the style.
  • If a video model supports sound control, --no-sound can be useful for silent exports and may reduce credits on some models.
  • For flf2video, keep the start and end frames visually consistent for smoother motion.
  • For auto-video, start with one video or 2-3 images before scaling up to more references.
  • All results are auto-downloaded without watermarks when available.

For complete API documentation, including current model tables and the upload flow, see references/【呜哩Wuli】开放平台 API 文档.md (the Wuli open platform API reference, in Chinese).
