Install
openclaw skills install lipsync

Lip-sync a face to a specific audio track on RunComfy via the `runcomfy` CLI. Routes across ByteDance OmniHuman (audio-driven full-body avatar from a portrait + audio), Sync Labs sync v2 / Pro (state-of-the-art mouth sync onto an existing video), Kling lipsync (audio-to-video and text-to-video with synced speech), and Creatify lipsync. The skill picks the right endpoint for the user's actual intent: portrait still + audio (avatar-style), source video + audio (mouth-swap on existing footage), or generate-and-sync from a script. It ships the documented prompts plus the exact runcomfy run invocation. Triggers on "lip sync", "lipsync", "make this video speak", "match audio to mouth", "dub video", "sync lips to voice", "Sync Labs", "voiceover sync", or any explicit ask to drive a face's mouth from an audio track.
runcomfy.com · Sync Labs models · CLI docs
# 1. Install (see runcomfy-cli skill for details)
npm i -g @runcomfy/cli # or: npx -y @runcomfy/cli --version
# 2. Sign in
runcomfy login # or in CI: export RUNCOMFY_TOKEN=<token>
# 3. Lipsync
runcomfy run <vendor>/<model> \
--input '{"video_url": "...", "audio_url": "..."}' \
--output-dir ./out
CLI deep dive: runcomfy-cli skill.
Driving a real person's mouth from a separate audio track is dual-use. Refuse user requests that target real public figures without consent, or that aim at defamatory or sexually explicit synthetic media. The skill itself does not gate inputs; the responsibility rests with the operator.
Listed newest first within each subtype. The agent picks one route based on: input shape (portrait still + audio vs source video + audio vs script-only), quality tier, and budget.
Sync Labs sync v2 Pro: sync/sync/lipsync/v2/pro (default for premium)
Sync Labs' premium lip-sync: state-of-the-art mouth motion onto an existing video. Preserves the rest of the frame untouched. Pick for: hero-quality dubs, lipsync on professionally-shot video, foreign-language dubbing where mouth fidelity matters most. Avoid for: cost-sensitive batch jobs; drop to sync v2.
Sync Labs sync v2: sync/sync/lipsync/v2
Standard Sync Labs tier, same workflow as Pro. Pick for: scaled / batch lipsync jobs, drafts. Avoid for: hero delivery; use v2 Pro.
Kling Lipsync (audio-to-video): kling/lipsync/audio-to-video
Kling's lip-sync onto a source video, driven by an audio track. Pick for: Kling-pipeline integration; an alternative to Sync Labs. Avoid for: top-tier mouth fidelity; Sync Labs Pro is the industry benchmark.
Creatify Lipsync: creatify/lipsync
Creatify's lipsync endpoint. Pick for: Creatify-ecosystem workflows. Avoid for: comparison shopping unless cost / latency favors it.
OmniHuman: bytedance/omnihuman/api (default for avatar-style)
ByteDance's audio-driven full-body avatar. One portrait + one audio → a video where the subject speaks and gestures naturally. Listed under RunComfy's /feature/lip-sync as the curated default. Pick for: UGC voiceover, virtual presenter, dubbed product demo from a single portrait. Avoid for: lip-sync onto an existing video (no portrait; you want to preserve the original motion); use Sync Labs v2 instead.
Wan 2-7 with audio_url: wan-ai/wan-2-7/text-to-video
Open-weights t2v with an audio_url field: the prompt describes the scene, the audio drives the mouth. Pick for: full scene control (not just a portrait) with a specific voiceover MP3 + an open-weights pipeline. Avoid for: the simplest "portrait talks" case; use OmniHuman.
Kling Lipsync (text-to-video): kling/lipsync/text-to-video
Generates speech audio in-pass from a script and syncs it to the resulting video. Pick for: "write a script → get a video with synced speech", no audio file needed. Avoid for: precise lip-sync to a specific MP3 (audio is regenerated each call, not locked).
HappyHorse 1.0: happyhorse/happyhorse-1-0/text-to-video (also /image-to-video)
Arena #1 t2v / i2v with in-pass audio generated from the prompt. Quote the spoken line inside the prompt with says clearly: "…". Pick for: a written script, in-pass audio with strong overall quality, social/UGC clips. Avoid for: locking the mouth to a pre-recorded voiceover.
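The routing above can be sketched as a small shell helper. This is an illustration only: the model IDs come from the catalog entries in this document, but the classification is reduced to input shape alone, while the real skill also weighs quality tier and budget.

```shell
# Pick a lip-sync route from the inputs the user supplied.
# Simplified sketch: input shape only (no quality-tier / budget logic).
pick_lipsync_model() {
  local video="$1" image="$2" audio="$3" script="$4"
  if [ -n "$video" ] && [ -n "$audio" ]; then
    echo "sync/sync/lipsync/v2/pro"      # source video + audio: Sync Labs
  elif [ -n "$image" ] && [ -n "$audio" ]; then
    echo "bytedance/omnihuman/api"       # portrait still + audio: avatar-style
  elif [ -n "$script" ]; then
    echo "kling/lipsync/text-to-video"   # script only: generate-and-sync
  else
    return 64                            # mirrors the CLI's bad-args exit code
  fi
}
```

For example, `pick_lipsync_model source.mp4 '' voice.mp3 ''` selects the Sync Labs Pro route.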
Model: sync/sync/lipsync/v2/pro (or sync/sync/lipsync/v2)
Catalog: sync v2 Pro · sync v2
runcomfy run sync/sync/lipsync/v2/pro \
--input '{
"video_url": "https://your-cdn.example/source-video.mp4",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
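For the cost-sensitive batch case that the standard v2 tier targets, a loop can template the --input body per clip. A sketch, assuming jq is available for safe JSON quoting; the clip names and CDN URLs are placeholders, and the echo makes it a dry run so you can inspect the commands before removing it.

```shell
# Build the --input JSON for one clip; jq handles the quoting safely.
lipsync_input() {
  jq -cn --arg v "$1" --arg a "$2" '{video_url: $v, audio_url: $a}'
}

# Dry-run print of a batch over the cheaper v2 tier.
for clip in clip-01 clip-02 clip-03; do
  echo runcomfy run sync/sync/lipsync/v2 \
    --input "$(lipsync_input "https://your-cdn.example/${clip}.mp4" \
                             "https://your-cdn.example/voiceover.mp3")" \
    --output-dir "./out/${clip}"
done
```

Per-clip output directories keep the downloaded results from overwriting each other.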
Model: bytedance/omnihuman/api
Catalog: omnihuman
runcomfy run bytedance/omnihuman/api \
--input '{
"image_url": "https://your-cdn.example/portrait.jpg",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
See the ai-avatar-video skill for the full avatar treatment.

Model: kling/lipsync/audio-to-video (existing video + audio) or kling/lipsync/text-to-video (script-only)
Catalog: Kling lipsync a2v · Kling lipsync t2v
runcomfy run kling/lipsync/audio-to-video \
--input '{
"video_url": "https://your-cdn.example/source-video.mp4",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
Schema details on the model page.
Related: community/wan-2-2-animate/video-to-video (see ai-avatar-video) · the kling collection, including the Kling lipsync variants.

Exit codes:

| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
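Exit code 75 is the only retryable one in the table above, so a thin wrapper can key off it. A minimal sketch; the retry count and backoff values are arbitrary choices, not documented CLI behavior.

```shell
# Retry a command only when it exits 75 (timeout / 429);
# every other exit code surfaces to the caller unchanged.
run_with_retry() {
  tries=0
  while :; do
    rc=0
    "$@" || rc=$?                  # capture the exit code without tripping set -e
    [ "$rc" -ne 75 ] && return "$rc"
    tries=$((tries + 1))
    [ "$tries" -ge 3 ] && return "$rc"
    sleep $((tries * 2))           # simple backoff: 2s, then 4s
  done
}
```

Usage: prefix any invocation, e.g. `run_with_retry runcomfy run sync/sync/lipsync/v2 --input '…' --output-dir ./out`.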
The skill classifies user intent (source video + audio? portrait still + audio? script only?), picks the matching route, and invokes runcomfy run with the JSON body. The CLI POSTs to the Model API, polls request status, fetches the result, and downloads any .runcomfy.net / .runcomfy.com URLs into --output-dir.
Install: npm i -g @runcomfy/cli or npx -y @runcomfy/cli. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf.
Auth: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600. Set the RUNCOMFY_TOKEN env var in CI / containers.
Inputs: user content is passed as JSON via --input. The CLI does not shell-expand prompt content; no shell-injection surface.
Network: model-api.runcomfy.net and *.runcomfy.net / *.runcomfy.com. No telemetry.
Permissions: Bash(runcomfy *) only.

Links: the kling collection (including the Kling lipsync variants) · /feature/lip-sync (RunComfy's curated lip-sync capability tag)