Install
    openclaw skills install xpilot-ad-maker

Generate a 30-second cinematic ad video with consistent character, AI narration, brand overlays, and ambient music. Uses Vidu reference-to-video for characte...

End-to-end pipeline that produces a polished 30-second medical-tourism ad video.
Given a destination (e.g. "Nanning, China"), a procedure (e.g. "dental implants"), and a brand name, this skill generates a complete 30-second ad with:
- Consistent character: Vidu reference2video, so the same person shows up in every scene without per-shot drift.
- AI narration: Kokoro TTS (af_bella voice) generates per-shot voiceover, time-aligned to each scene.

The pipeline runs in six steps:

    Step 1: Wavespeed (Seedream 4.5) → 1 protagonist portrait → R2
    Step 2: Vidu reference2video × 4 (parallel) → 4 shot clips → R2
    Step 3: Replicate Kokoro TTS × 4 → 4 narration clips
    Step 4: ffmpeg concat → 30s silent video
    Step 5: ffmpeg filter_complex → drawtext overlays + audio mix
    Step 6: Upload final to R2
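
As an illustration of Step 4, here is a minimal sketch of how four clips could be joined with ffmpeg's concat demuxer. It is an assumption that the script works this way; the helper names `buildConcatList` and `buildConcatArgs` and the file names are hypothetical, and only the ffmpeg flags themselves are standard.

```typescript
// Hypothetical helpers: a sketch of Step 4, not the skill's actual code.

/** Build the text of a concat-demuxer list file: one `file '<path>'` line per clip. */
function buildConcatList(clipPaths: string[]): string {
  return (
    clipPaths
      // Inside the single-quoted path, a literal ' is written as '\''.
      .map((p) => `file '${p.replace(/'/g, "'\\''")}'`)
      .join("\n") + "\n"
  );
}

/** Build the ffmpeg arguments that join the listed clips without re-encoding. */
function buildConcatArgs(listFile: string, outFile: string): string[] {
  return [
    "-y",            // overwrite the output file if it exists
    "-f", "concat",  // use the concat demuxer
    "-safe", "0",    // allow arbitrary paths in the list file
    "-i", listFile,
    "-c", "copy",    // stream copy, no re-encode
    "-an",           // drop audio so the result is a silent video
    outFile,
  ];
}
```

Stream copy only works when all clips share the same codec parameters, which is plausible for four clips from the same model but worth verifying. The argument array can be passed to `child_process.spawn` together with the binary path exported by ffmpeg-static.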
Per run (one full 30s ad):
| Item | Cost |
|---|---|
| Wavespeed Seedream 4.5 (1 portrait) | ~$0.04 |
| Vidu viduq2-pro reference2video × 4 | ~$2.50 (250 credits) |
| Replicate Kokoro TTS × 4 | ~$0.001 |
| Total | ~$2.55 |
End-to-end runtime: ~3 minutes; most of that is Vidu video generation, with the four shots rendered in parallel.
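
The per-run cost can be checked with a few lines of arithmetic. This sketch only re-adds the approximate figures from the table; the constant and function names are hypothetical.

```typescript
// Per-run line items copied from the cost table (USD, approximate).
const COSTS_USD = {
  seedreamPortrait: 0.04, // Wavespeed Seedream 4.5, 1 portrait
  viduClips: 2.5,         // Vidu viduq2-pro reference2video x 4 (250 credits)
  kokoroTts: 0.001,       // Replicate Kokoro TTS x 4
};

/** Sum all line items. */
function totalCostUsd(costs: Record<string, number>): number {
  return Object.values(costs).reduce((sum, c) => sum + c, 0);
}

const totalUsd = totalCostUsd(COSTS_USD);
const viduShare = COSTS_USD.viduClips / totalUsd; // fraction of the run spent on Vidu
const usdPerCredit = COSTS_USD.viduClips / 250;   // implied price of one Vidu credit

console.log(`total ≈ $${totalUsd.toFixed(3)}, Vidu share ≈ ${(viduShare * 100).toFixed(0)}%`);
```

The items sum to about $2.54, in line with the rounded ~$2.55 total, and Vidu accounts for nearly all of it. That is why re-using the existing clips during post-production iteration saves almost the entire cost of a run.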
Requirements:

- VIDU_API_KEY: Vidu Platform API key (https://platform.vidu.com)
- WAVESPEED_API_KEY: Wavespeed.ai API key (for the protagonist image)
- REPLICATE_API_KEY: Replicate token (for Kokoro TTS)
- R2_ACCOUNT_ID, R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_BUCKET_NAME, R2_PUBLIC_URL: Cloudflare R2 (S3-compatible) for storage
- node (≥ 18)

ffmpeg is bundled via the ffmpeg-static npm package, so no system install is needed.

Customize the SHOTS array in make-xpilot-ad.ts with your storyboard, then run:

    npx tsx make-xpilot-ad.ts
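
Because a missing key would otherwise surface only after a paid step has already run, it can help to check the environment up front. The following is a minimal sketch of such a pre-flight check; `REQUIRED_ENV` and `missingEnv` are hypothetical names, and nothing here is claimed to be part of the actual script.

```typescript
// Environment variables the pipeline expects, as listed in the requirements.
const REQUIRED_ENV = [
  "VIDU_API_KEY",
  "WAVESPEED_API_KEY",
  "REPLICATE_API_KEY",
  "R2_ACCOUNT_ID",
  "R2_ACCESS_KEY_ID",
  "R2_SECRET_ACCESS_KEY",
  "R2_BUCKET_NAME",
  "R2_PUBLIC_URL",
] as const;

/** Return the names of required variables that are unset or blank. */
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => (env[name] ?? "").trim() === "");
}

// Possible usage at the top of a script:
//   const missing = missingEnv(process.env);
//   if (missing.length > 0) {
//     console.error(`Missing environment variables: ${missing.join(", ")}`);
//     process.exit(1);
//   }
```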
The script prints the final R2 URL at the end. To iterate on post-production (captions, narration, music) without re-spending Vidu credits, run:

    npx tsx xpilot-ad-finalize.ts

This pulls the existing 4 video clips from R2, regenerates narration, and re-composites the final video. It is nearly free (only the ~$0.001 of Kokoro TTS is spent again) and fast (~45 seconds).
Final 30-second ad (8 MB MP4) — narration, ambient music, brand overlays: https://pub-22e3d3e3f43e400493bbd71306cae6bb.r2.dev/demo/medical-tourism-ad/v2/medtravel-final.mp4
Behind-the-scenes assets (all publicly hosted on R2):
Notice the same protagonist appears in all 4 shots — that's the power of
Vidu's reference2video mode, which this skill encapsulates.
To make this skill work for a different brand/vertical (e.g., "Mexican dental tourism", "Thai cosmetic surgery", "Korean LASIK"), edit:
- REFERENCE_PROMPT: describe your protagonist
- SHOTS[*].prompt: describe each scene
- SHOTS[*].narration: what the voiceover says
- SHOTS[*].brandText: bottom brand caption
- SHOTS[*].topCaption: top descriptive caption

The pipeline (parallel submission, polling, R2 mirroring, ffmpeg composition) stays the same.
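
To make the editable fields concrete, here is a sketch of what a customized storyboard might look like. The field names come from the list above; the `Shot` interface, the `validateShots` helper, the brand name, and all text values are hypothetical placeholders, and the real script may carry additional fields.

```typescript
// Assumed shape of one storyboard entry, using the four fields named above.
interface Shot {
  prompt: string;     // scene description for the video model
  narration: string;  // what the voiceover says
  brandText: string;  // bottom brand caption
  topCaption: string; // top descriptive caption
}

// Placeholder content for a hypothetical "Korean LASIK" ad.
const REFERENCE_PROMPT =
  "Portrait of a woman in her thirties, warm smile, natural light, neutral background";

const SHOTS: Shot[] = [
  { prompt: "She arrives at a modern airport terminal", narration: "Your journey starts here.", brandText: "ExampleBrand", topCaption: "Arrive in Seoul" },
  { prompt: "She is greeted at a bright clinic reception", narration: "Meet a team that cares.", brandText: "ExampleBrand", topCaption: "Trusted clinics" },
  { prompt: "She talks with a doctor in a consultation room", narration: "A plan made for you.", brandText: "ExampleBrand", topCaption: "Personal consultation" },
  { prompt: "She walks through a city park, looking around", narration: "See the world clearly.", brandText: "ExampleBrand", topCaption: "A clear new view" },
];

/** Return a list of problems; an empty list means the storyboard looks usable. */
function validateShots(shots: Shot[], expectedCount = 4): string[] {
  const problems: string[] = [];
  if (shots.length !== expectedCount) {
    problems.push(`expected ${expectedCount} shots, got ${shots.length}`);
  }
  shots.forEach((shot, i) => {
    for (const [field, value] of Object.entries(shot)) {
      if (String(value).trim() === "") problems.push(`shot ${i + 1}: ${field} is empty`);
    }
  });
  return problems;
}
```

The default count of four mirrors the four clips the pipeline generates per run.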
Vidu has three video generation modes:
| Mode | Pros | Cons |
|---|---|---|
| text2video | Simple | Each shot's character looks different |
| img2video | Visual continuity | Hard to change scenes (just continues motion) |
| reference2video | Same character across scenes | Slightly more setup |
For multi-shot ads with a recurring protagonist, reference2video is the
only one of the three modes that keeps the character consistent while
letting the scene change. This skill encapsulates that workflow.
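
The core idea can be sketched as follows: every shot gets its own prompt, but all requests carry the same reference portrait. This is an illustration only. The field names `images` and `prompt`, the `ReferenceRequest` type, and the `buildReferenceRequests` helper are assumptions rather than the documented Vidu request schema, so check Vidu's API reference before relying on them; `viduq2-pro` is the model named in the cost table.

```typescript
// Assumed request shape; field names are illustrative, not the confirmed Vidu schema.
interface ReferenceRequest {
  model: string;    // model named in the cost table
  images: string[]; // assumed: URL(s) of the reference portrait
  prompt: string;   // assumed: per-shot scene description
}

/** Build one request per shot, all sharing the same reference portrait. */
function buildReferenceRequests(
  referenceImageUrl: string,
  shotPrompts: string[],
): ReferenceRequest[] {
  return shotPrompts.map((prompt) => ({
    model: "viduq2-pro",
    images: [referenceImageUrl],
    prompt,
  }));
}

// Hypothetical usage: four scenes, one protagonist.
const exampleRequests = buildReferenceRequests("https://example.com/protagonist.png", [
  "She arrives at the airport",
  "She checks in at the clinic",
  "She talks with the doctor",
  "She walks through the city",
]);
```

Since the requests are independent of each other, they can be submitted in parallel, for example with `Promise.all`, which matches the parallel Step 2 of the pipeline.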
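- Signed output URLs can contain ;. Don't URL-encode it, because that breaks the signature. Mirror the file to R2 immediately.
- In ffmpeg drawtext, % is parsed as a variable: escape it or use words ("60 percent").
- In ffmpeg filter_complex, where the , separator causes problems, use ; + intermediate labels instead.

To install tsx and ffmpeg-static globally:

    npm i -g tsx
    npm i -g ffmpeg-static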
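
The two ffmpeg notes can be illustrated with a short sketch. It takes the "use words" route for `%` and builds a filter graph in which each filter is its own chain, linked by labels and separated by `;`. The helper names `spellOutPercent` and `chainWithLabels` and the label scheme `v1`, `v2`, ... are hypothetical, not taken from the script.

```typescript
/** Replace "%" with the word "percent" so drawtext never sees a bare %. */
function spellOutPercent(text: string): string {
  return text.replace(/\s*%/g, " percent");
}

/**
 * Link filters with ";" and intermediate labels instead of ",":
 * [in]f1[v1];[v1]f2[v2];...;[vN-1]fN[out]
 */
function chainWithLabels(inputLabel: string, filters: string[], outLabel: string): string {
  return filters
    .map((filter, i) => {
      const src = i === 0 ? inputLabel : `v${i}`;
      const dst = i === filters.length - 1 ? outLabel : `v${i + 1}`;
      return `[${src}]${filter}[${dst}]`;
    })
    .join(";");
}

// Hypothetical usage with two caption overlays.
const caption = spellOutPercent("Save 60% today");
const graph = chainWithLabels(
  "0:v",
  [`drawtext=text='${caption}':x=40:y=40`, "drawtext=text='ExampleBrand':x=40:y=h-80"],
  "vout",
);
console.log(graph);
```

The resulting string can be passed to `-filter_complex`, with `-map "[vout]"` selecting the final video stream. Captions containing other special characters such as `'` or `:` would still need escaping, which this sketch does not attempt.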