Wan Image and Video Generation and Editing

v1.0.2

Image and video generation and editing with Wan series models. It offers text2image, image editing (with prompt), text2video, image2video and reference(imag...

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The skill describes image/video generation and editing with Wan models and the included script calls Dashscope endpoints (dashscope.aliyuncs.com). Declared requirements (python3 and DASHSCOPE_API_KEY) are appropriate and necessary for the functionality.
Instruction Scope
SKILL.md instructs running the bundled Python script which base64-encodes local image/video files and posts them to the Dashscope API. Reading local media files is required for image-edit / image2video tasks, but the script will accept arbitrary local paths you supply — that can lead to accidental upload/exfiltration of sensitive files if misused. The script also prints full API responses to stdout.
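One way to reduce the accidental-upload risk described above is to check a path before handing it to the script. The following is a minimal sketch, not part of the skill: the helper name `is_safe_media_path`, the extension allowlist, and the idea of a dedicated media folder are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: file extensions considered acceptable to upload; adjust as needed.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp", ".mp4", ".mov"}


def is_safe_media_path(path: str, allowed_root: str) -> bool:
    """Return True only for media files that resolve to a location under allowed_root."""
    candidate = Path(path).expanduser().resolve()
    root = Path(allowed_root).expanduser().resolve()
    if candidate.suffix.lower() not in ALLOWED_SUFFIXES:
        return False
    # resolve() collapses ".." segments, so traversal out of the root is caught here.
    return root in candidate.parents
```

A wrapper could call this check on every local argument and refuse to invoke wan-magic.py when it returns False.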
Install Mechanism
There is no install step or remote download. The skill ships a single Python script and relies on an existing python3 binary; this is low-risk compared to fetching/executing remote archives.
Credentials
Only one environment variable is required: DASHSCOPE_API_KEY (the declared primary credential). That is expected for a hosted-model API client. The key is sent as a Bearer token to Dashscope endpoints and thus grants the remote service the ability to consume your account and incur usage.
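As a rough illustration of how a client typically turns this environment variable into a Bearer token, consider the sketch below. It is not taken from the bundled script: the function name `build_headers` and the `Content-Type` header are assumptions.

```python
import os


def build_headers(env=os.environ) -> dict:
    """Build request headers from DASHSCOPE_API_KEY, failing fast when it is missing."""
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        # Assumption: JSON request bodies.
        "Content-Type": "application/json",
    }
```

Reading the key from the environment at call time, rather than hard-coding it, keeps it out of source files and makes rotation a matter of changing one variable.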
Persistence & Privilege
The skill does not request permanent/always-on inclusion, does not modify other skills or system-wide settings, and contains no installation hooks that would persist beyond executing the provided script.
Assessment
This skill is internally consistent and appears to do what it says: it calls Alibaba Dashscope (Wan) APIs and needs your DASHSCOPE_API_KEY. Before installing or using it:

  • Confirm you trust the Dashscope endpoint and the skill author; the homepage appears to point at an Alibaba console.
  • Treat DASHSCOPE_API_KEY as a secret: do not share it, and expect that calls will consume your account quota and may incur charges.
  • When supplying local file paths, only provide non-sensitive image/video files. The script will base64-encode and upload whatever path you give it, so avoid pointing it at system or private files.
  • For extra safety, run the script in a sandboxed environment, or inspect and run the script manually to see its network activity (it posts to https://dashscope.aliyuncs.com and related regional hostnames).
  • Rotate or revoke the API key if you stop using the skill.




Runtime requirements

🔍 Clawdis
Bins: python3
Env: DASHSCOPE_API_KEY
Primary env: DASHSCOPE_API_KEY

SKILL.md

Wan Models

Wan models, created by Alibaba Group, are image and video generation and editing models that are widely adopted around the world. This skill integrates with the Wan model APIs on ModelStudio (Bailian, the Alibaba Model Service Platform).

text2image generation

Generate images from a text prompt.

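python3 {baseDir}/scripts/wan-magic.py text2image --prompt "A girl stands on a rooftop balcony, the setting sun shining on her face"
python3 {baseDir}/scripts/wan-magic.py text2image --prompt "A long-haired girl sits at a desk with her back to the camera, wearing headphones. Sunlight streams through the window into the room, lighting up her and the books and clutter scattered around her" --size "1280*1280"
python3 {baseDir}/scripts/wan-magic.py text2image --prompt "A woman leans gracefully against a car door in a red pleated long dress, slowly turning toward the camera in a retro-toned interior. Neon light spots drift across the glass window with a slight shake, the background furniture gradually blurs to bring out the character's monologue, and the frame has a film-grain texture, with hazy Hong Kong-style lighting that conveys a faint sadness" --quantity 1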

Options

  • --quantity: Number of images to generate (default: 1, max: 4)
  • --prompt: Text prompt for image generation
  • --size: Image resolution as width*height (default: 1280*1280). Supports widths and heights from 512 to 1440 pixels, provided the total pixel count does not exceed 1440*1440. Common resolutions: 1280*1280, 1104*1472, 1472*1104, 960*1696, 1696*960
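The width*height format and the total-pixel budget can be checked before a request is sent. The sketch below is illustrative only; `parse_size` is a hypothetical helper, not part of wan-magic.py. It enforces only the 1440*1440 pixel budget and leaves out the per-side bound, because some of the listed common resolutions (for example 1696*960) exceed 1440 on one side.

```python
import re
from typing import Tuple

MAX_PIXELS = 1440 * 1440  # total pixel budget stated for --size


def parse_size(size: str) -> Tuple[int, int]:
    """Parse a 'width*height' string and enforce the total pixel budget."""
    match = re.fullmatch(r"(\d+)\*(\d+)", size)
    if match is None:
        raise ValueError(f"expected width*height, got {size!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width * height > MAX_PIXELS:
        raise ValueError(f"{size} exceeds the {MAX_PIXELS}-pixel budget")
    return width, height
```

For instance, `parse_size("1280*1280")` yields the pair (1280, 1280), while `parse_size("2000*2000")` raises a ValueError.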

image2image editing

Generate images from one or more input images (image editing).

python3 {baseDir}/scripts/wan-magic.py image-edit --prompt "Using the style of image 1 and the background of image 2, generate a brand-new image" \
  --images 'https://cdn.wanx.aliyuncs.com/tmp/pressure/umbrella1.png' \
  'https://img.alicdn.com/imgextra/i3/O1CN01SfG4J41UYn9WNt4X1_!!6000000002530-49-tps-1696-960.webp' \
  --size "1280*1280"
python3 {baseDir}/scripts/wan-magic.py image-edit --prompt "Using the style of image 1 and the background of image 2, generate a brand-new image" \
  --images '/Users/yejianhongali/workDir/pic1.png' \
  '/Users/yejianhongali/workDir/pic2.webp'
python3 {baseDir}/scripts/wan-magic.py image-edit --prompt "Using the style of image 1 and the background of image 2, generate a brand-new image" --images 'https://cdn.wanx.aliyuncs.com/tmp/pressure/umbrella1.png' 'https://img.alicdn.com/imgextra/i3/O1CN01SfG4J41UYn9WNt4X1_!!6000000002530-49-tps-1696-960.webp' --quantity 1

Options

  • --quantity: Number of images to generate (default: 1, max: 4)
  • --prompt: Text prompt for image editing
  • --images: Images to edit (min: 1, max: 4). Each can be an image URL or a local image file (the wan-magic.py script converts local images to base64 and passes them to the model API)
  • --size: Image resolution as width*height (default: 1280*1280). Supports widths and heights from 512 to 1440 pixels, provided the total pixel count does not exceed 1440*1440. Common resolutions: 1280*1280, 1104*1472, 1472*1104, 960*1696, 1696*960
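The URL-or-local-file handling described for --images might look roughly like the sketch below. This is a guess at the general technique rather than the script's actual code: the function name `to_image_input` and the data-URI output format are assumptions.

```python
import base64
import mimetypes
from pathlib import Path


def to_image_input(source: str) -> str:
    """Pass URLs through unchanged; turn a local file into a base64 data URI."""
    if source.startswith(("http://", "https://")):
        return source
    data = Path(source).expanduser().read_bytes()
    # Assumption: the MIME type is inferred from the file extension.
    mime = mimetypes.guess_type(source)[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Base64 inflates the payload by about one third, so hosting large images at a URL keeps request bodies smaller.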

text2video generation

Generate a video from a text prompt.

text2video task-submit

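python3 {baseDir}/scripts/wan-magic.py text2video-gen --prompt "An epically cute scene. A small, adorable cartoon kitten general, wearing finely detailed golden armor and a slightly oversized helmet, stands bravely on a cliff. He rides a small but valiant warhorse and says: “青海长云暗雪山,孤城遥望玉门关。黄沙百战穿金甲,不破楼兰终不还。” Below the cliff, a vast, seemingly endless army of mice charges forward with makeshift weapons. It is a dramatic, large-scale battle scene inspired by ancient Chinese war epics. Dark clouds gather in the sky above the distant snow-capped mountains. The overall mood is a funny, epic blend of “cute” and “mighty”." --duration 10 --size "1920*1080"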

Options

  • --duration: Video duration in seconds (default: 5, max: 15)
  • --prompt: Text prompt for video generation
  • --size: Video resolution as width*height (default: 1920*1080). Supports any 720p or 1080p resolution. Required: pass pixel dimensions (e.g. 1280*720), not a label such as 720p

text2video tasks-get (polling)

python3 {baseDir}/scripts/wan-magic.py text2video-get --task-id "<TASK_ID_FROM_VIDEO_GEN>"
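Video generation is asynchronous, so the get command usually has to be repeated until the task finishes. The loop below is a generic polling sketch under stated assumptions: `wait_for_task` is a hypothetical helper, the status names are assumed to follow the usual Dashscope task states, and `fetch_status` is a caller-supplied function (for example, one that runs text2video-get and extracts the status from its output, whose exact format is not documented here).

```python
import time

# Assumption: task states that mean "stop polling".
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def wait_for_task(fetch_status, interval=10.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status or the timeout passes."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout} seconds")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; the same helper would apply to the image2video and reference2video get commands.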

image2video generation

Generate a video that uses an image as its first frame.

image2video task-submit

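python3 {baseDir}/scripts/wan-magic.py image2video-gen --prompt "An urban fantasy art scene. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He performs an English rap at breakneck speed while striking a classic, energetic rapper pose. The scene is set at night under an urban railway bridge. The light comes from a single lone streetlamp, creating a cinematic atmosphere full of high energy and striking detail. The video's audio consists entirely of his rap, with no other dialogue or noise." --image "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250925/wpimhv/rap.png" --duration 10 --resolution "720P"
python3 {baseDir}/scripts/wan-magic.py image2video-gen --prompt "An urban fantasy art scene. A dynamic graffiti art character. A boy painted with spray paint comes to life from a concrete wall. He performs an English rap at breakneck speed while striking a classic, energetic rapper pose. The scene is set at night under an urban railway bridge. The light comes from a single lone streetlamp, creating a cinematic atmosphere full of high energy and striking detail. The video's audio consists entirely of his rap, with no other dialogue or noise." --image "/Users/yejianhongali/workDir/rap.png"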

Options

  • --duration: Video duration in seconds (default: 5, max: 15)
  • --prompt: Text prompt for video generation
  • --image: Image used as the first frame of the generated video. Can be an image URL or a local image file (the wan-magic.py script converts a local image to base64 and passes it to the model API)
  • --resolution: Video resolution (default: 1080P; supports 720P and 1080P). Required: pass 720P or 1080P, not pixel dimensions
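When calling the script from Python, building the argument list explicitly avoids shell-quoting problems with long prompts and lets the documented limits be checked first. The sketch below uses only the flags listed above; the helper name `image2video_argv` and the assumption that the duration must be positive are illustrative.

```python
VALID_RESOLUTIONS = {"720P", "1080P"}
MAX_DURATION = 15  # seconds, per the --duration option


def image2video_argv(script, prompt, image, duration=5, resolution="1080P"):
    """Return an argv list for image2video-gen after validating the options."""
    resolution = resolution.upper()
    if resolution not in VALID_RESOLUTIONS:
        raise ValueError(f"resolution must be 720P or 1080P, got {resolution!r}")
    # Assumption: durations are positive whole seconds.
    if not 0 < duration <= MAX_DURATION:
        raise ValueError(f"duration must be between 1 and {MAX_DURATION} seconds")
    return [
        "python3", script, "image2video-gen",
        "--prompt", prompt,
        "--image", image,
        "--duration", str(duration),
        "--resolution", resolution,
    ]
```

The resulting list can be passed to `subprocess.run(argv, check=True)`; because no shell is involved, quotes and punctuation inside the prompt need no escaping.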

image2video tasks-get (polling)

python3 {baseDir}/scripts/wan-magic.py image2video-get --task-id "<TASK_ID_FROM_VIDEO_GEN>"

reference2video generation

Generate a video from reference images and/or videos.

reference2video task-submit

python3 {baseDir}/scripts/wan-magic.py reference2video-gen --prompt "character1 strolls along the beach, a light breeze blowing through their hair" --reference-files "https://example.com/person.mp4"
python3 {baseDir}/scripts/wan-magic.py reference2video-gen --prompt "character1 reads a book in a café" --reference-files "https://example.com/person.mp4/person.jpg" --duration 5
python3 {baseDir}/scripts/wan-magic.py reference2video-gen --prompt "Character2 sits in a chair by the window, holding character3, and plays a soothing American country folk song beside character4. Character1 says to Character2: “Sounds good”" --reference-files "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260129/hfugmr/wan-r2v-role1.mp4" "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260129/qigswt/wan-r2v-role2.mp4" "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260129/qpzxps/wan-r2v-object4.png" "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260129/wfjikw/wan-r2v-backgroud5.png" --duration 10
python3 {baseDir}/scripts/wan-magic.py reference2video-gen --prompt "character2 sits by the window playing guitar while character1 listens nearby. character1 says: 'That sounds lovely.'" --reference-files "https://example.com/listener.mp4" "https://example.com/guitarist.mp4" --shot-type "multi" --duration 10 --size "1920*1080"

Options

  • --duration: Video duration in seconds (default: 5, max: 10)
  • --prompt: Text prompt for video generation. NOTICE: Use 'character1' to refer to the first image/video in --reference-files, 'character2' to refer to the second, and so on.
  • --reference-files: Reference images and/or videos (reference_urls) for video generation. The generated video usually takes its characters, voices, and scenery from these references. Reference images and videos must be URLs; each URL can point to an image or a video. Image quantity: 0 to 5; video quantity: 0 to 3; images plus videos in total: no more than 5.
  • --resolution: Video resolution as width*height (default: 1920*1080). Supports any 720P or 1080P resolution, such as: 720*1280, 1280*720, 960*960, 1088*832, 832*1088, 1920*1080, 1080*1920, 1440*1440, 1632*1248, 1248*1632
  • --shot-type: Shot type of the video: "single" for one continuous shot, "multi" for intelligent multi-shot video (default: single)
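The characterN naming rule and the reference-file limits lend themselves to a quick pre-flight check. The following sketch is an illustration only: `check_references` is a hypothetical helper, the extension sets are assumptions, and the total limit is read as at most 5 files.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}  # assumed
VIDEO_SUFFIXES = {".mp4", ".mov"}  # assumed


def check_references(prompt, reference_urls):
    """Validate reference counts and characterN indices; return (images, videos)."""
    images = videos = 0
    for url in reference_urls:
        suffix = PurePosixPath(urlparse(url).path).suffix.lower()
        if suffix in IMAGE_SUFFIXES:
            images += 1
        elif suffix in VIDEO_SUFFIXES:
            videos += 1
        else:
            raise ValueError(f"cannot tell whether {url!r} is an image or a video")
    if images > 5 or videos > 3 or images + videos > 5:
        raise ValueError("too many reference files")
    # characterN is 1-based and must point at an existing reference file.
    indices = {int(n) for n in re.findall(r"character(\d+)", prompt, flags=re.IGNORECASE)}
    missing = sorted(i for i in indices if i < 1 or i > len(reference_urls))
    if missing:
        raise ValueError(f"prompt mentions characters without a reference: {missing}")
    return images, videos
```

Running it on a prompt that mentions character3 with only two reference URLs raises a ValueError before any request is made.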

reference2video tasks-get (polling)

python3 {baseDir}/scripts/wan-magic.py reference2video-get --task-id "<TASK_ID_FROM_VIDEO_GEN>"
