UGC Manual

v1.0.2

Generate lip-sync video from image + user's own audio recording.

✅ USE WHEN:

  • User provides their OWN audio file (voice recording)
  • Want to sync image to specific audio/voice
  • User recorded the script themselves
  • Need exact audio timing preserved

❌ DON'T USE WHEN:

  • User provides text script (not audio) → use veed-ugc
  • Need AI to generate the voice → use veed-ugc
  • Don't have audio file yet → use veed-ugc with script

INPUT: Image + audio file (user's recording)
OUTPUT: MP4 video with lip-sync to provided audio

KEY DIFFERENCE:

  • veed-ugc = script → AI voice → video
  • ugc-manual = user audio → video (no voice generation)

by Paul de Lavallaz (@pauldelavallaz)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for pauldelavallaz/ugc-manual.

Install the skill "UGC Manual" (pauldelavallaz/ugc-manual) from ClawHub.
Skill page: https://clawhub.ai/pauldelavallaz/ugc-manual
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Canonical install target

openclaw skills install pauldelavallaz/ugc-manual

ClawHub CLI

npx clawhub@latest install ugc-manual

Security Scan

VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)

Purpose & Capability
The script implements exactly what the description says: it uploads an image and converted audio to ComfyDeploy, queues a specific deployment, polls for completion, and downloads the MP4. Requiring an API key for ComfyDeploy (COMFY_DEPLOY_API_KEY) is coherent with this purpose. However, the registry metadata claims 'Required env vars: none' and 'Required binaries: none' while the code requires COMFY_DEPLOY_API_KEY and ffmpeg; that metadata mismatch is unexpected.
Instruction Scope
Runtime instructions and code stay within the expected scope: they may download audio from a provided URL, convert audio locally with ffmpeg, upload files to https://api.comfydeploy.com, queue a workflow using a fixed deployment ID, poll status, and download the resulting video. The agent is instructed to send user-provided image/audio to a third-party service (ComfyDeploy) — this is necessary for the stated workflow but is an important privacy/transfer decision and should be made explicit to users.
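
The "poll status" step described above could look roughly like the following. This is a minimal sketch, not the skill's actual code: `wait_for_run`, its parameters, and the terminal status names are all hypothetical, and the status lookup is passed in as a function so that no ComfyDeploy response format is assumed.

```python
import time


def wait_for_run(get_status, interval_s=5.0, timeout_s=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it reports a terminal state or time runs out.

    get_status is a caller-supplied function returning a status string.
    The status names below are assumptions, not confirmed ComfyDeploy values.
    """
    terminal = {"success", "failed", "cancelled"}
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in terminal:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout_s} s")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without waiting in real time. The 600-second default is only a guess at a safe upper bound.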
Install Mechanism
There is no install spec (instruction-only with included scripts). A pyproject.toml lists only the 'requests' dependency. No third-party binary download or obscure URL extraction is used. Running the script executes local ffmpeg and Python requests calls; the install surface is low but the script will perform network I/O at runtime.
Credentials
The code requires a single environment variable COMFY_DEPLOY_API_KEY (used to authenticate file uploads and workflow queueing), which is proportionate to the functionality. However, the skill's declared requirements in the registry do not list this credential (metadata states 'none'), creating an inconsistency. The missing declaration makes it harder for users to know what secrets they must provide and trust.
Persistence & Privilege
The skill is not configured as always: true and does not request persistent/system-level privileges. It only uses one environment variable and does not modify other skills or global agent configuration.
What to consider before installing
Key things to check before installing or running:

  • Metadata mismatch: the package metadata claims no required env vars/binaries, but the script requires COMFY_DEPLOY_API_KEY and ffmpeg. Expect to supply an API key and have ffmpeg available.
  • Data privacy: the script uploads user-provided images and audio to https://api.comfydeploy.com and queues a fixed deployment ID. If your audio or image is sensitive or private, do not use this skill unless you trust ComfyDeploy and understand their retention/privacy policy.
  • Verify API key provenance: only provide COMFY_DEPLOY_API_KEY if you obtained it from a trusted ComfyDeploy account; otherwise the key could be misused to upload/queue jobs under your account.
  • Source provenance: the skill lists no homepage and the owner is not human-readable. If provenance is important, request the upstream source or inspect the repo before use.
  • Testing suggestion: run the script in a sandbox or with non-sensitive sample media first. Confirm network endpoints, deployment ID, and resulting behavior match expectations.
  • If you need offline or local-only processing (no upload), do not use this skill. It is designed to use ComfyDeploy and will transmit media off your machine.

If you want, I can extract the exact places the code will contact the network and show the minimal set of commands/requests it will make, or help you modify the script to use a different endpoint or to run locally if available.
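
Because the registry metadata does not declare the API key or ffmpeg, a small preflight check before running the script may save a failed run. The sketch below is illustrative only: `missing_prerequisites` is a hypothetical helper, not part of the skill. It uses only the two requirements named in the scan (the COMFY_DEPLOY_API_KEY variable and an ffmpeg binary on PATH).

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # The scan reports that the script reads this variable for authentication.
    if not env.get("COMFY_DEPLOY_API_KEY"):
        problems.append("COMFY_DEPLOY_API_KEY is not set")
    # The scan reports that the script shells out to ffmpeg for conversion.
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments checks the real environment. The `env` and `which` parameters exist only so the check can be exercised without changing the machine's configuration.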

Like a lobster shell, security has layers — review code before you run it.

latest: vk97dc7hc362vc77s9wpdzdrwj9811407
987 downloads
2 stars
2 versions
Updated 1mo ago
v1.0.2
MIT-0

UGC-Manual

Generate lip-sync videos by combining an image with a custom audio file using ComfyDeploy's UGC-MANUAL workflow.

Overview

UGC-Manual takes:

  1. An image (person/character with visible face)
  2. An audio file (user's voice recording)

And produces a video where the person in the image lip-syncs to the audio.

API Details

Endpoint: https://api.comfydeploy.com/api/run/deployment/queue
Deployment ID: 075ce7d3-81a6-4e3e-ab0e-7a25edf601b5

Required Inputs

| Input | Description | Formats |
| --- | --- | --- |
| image | Image with a visible face | JPG, PNG |
| input_audio | Audio file to lip-sync | MP3, WAV, OGG |
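
To make the endpoint, deployment ID, and input names above concrete, here is a hedged sketch of how a queue request might be assembled. Only the URL, the deployment ID, and the two input names come from this page. The Bearer authorization header, the `deployment_id` and `inputs` JSON field names, and the idea that inputs are passed as URLs of already-uploaded files are assumptions that should be checked against ComfyDeploy's documentation. `build_queue_request` is a hypothetical helper and sends nothing.

```python
import json

QUEUE_URL = "https://api.comfydeploy.com/api/run/deployment/queue"
DEPLOYMENT_ID = "075ce7d3-81a6-4e3e-ab0e-7a25edf601b5"


def build_queue_request(api_key, image_url, audio_url):
    """Return (url, headers, body) for queueing one run; nothing is sent."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    payload = {
        "deployment_id": DEPLOYMENT_ID,  # assumed field name
        "inputs": {                      # assumed field name
            "image": image_url,          # input name from the table above
            "input_audio": audio_url,    # input name from the table above
        },
    }
    return QUEUE_URL, headers, json.dumps(payload)
```

Keeping request construction separate from sending makes it possible to inspect exactly what would leave the machine before any network call is made.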

Usage

uv run ~/.clawdbot/skills/ugc-manual/scripts/generate.py \
  --image "path/to/image.jpg" \
  --audio "path/to/audio.mp3" \
  --output "output-video.mp4"

With URLs:

uv run ~/.clawdbot/skills/ugc-manual/scripts/generate.py \
  --image "https://example.com/image.jpg" \
  --audio "https://example.com/audio.mp3" \
  --output "result.mp4"
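
Since --image and --audio accept either a local path or a URL, the script has to tell the two apart somehow. How generate.py actually does this is not shown here. One plausible approach, with the hypothetical helper name `is_remote`, is to look at the URL scheme:

```python
from urllib.parse import urlparse


def is_remote(source):
    """True for http(s) URLs, False for anything treated as a local path."""
    return urlparse(source).scheme in ("http", "https")
```

Under this sketch, a value such as "path/to/image.jpg" would be read from disk, while "https://example.com/image.jpg" would be downloaded first.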

Workflow Integration

Typical Use Cases

  1. Custom voice recordings - User records their own audio via Telegram/WhatsApp
  2. Pre-generated TTS - Audio generated externally (ElevenLabs, etc.)
  3. Music/sound sync - Sync mouth movements to any audio

Example Pipeline

# 1. Convert Telegram voice message to MP3 (if needed)
ffmpeg -i voice.ogg -acodec libmp3lame -q:a 2 voice.mp3

# 2. Generate lip-sync video
uv run ugc-manual... --image face.jpg --audio voice.mp3 --output video.mp4

Difference from VEED-UGC

| Feature | UGC-Manual | VEED-UGC |
| --- | --- | --- |
| Audio source | User provides | Generated from brief |
| Script | N/A | Auto-generated |
| Voice | User's recording | ElevenLabs TTS |
| Use case | Custom audio | Automated content |

Notes

  • Image should have a clearly visible face (frontal or 3/4 view)
  • Audio quality affects output quality
  • Processing time: ~2-5 minutes depending on audio length
  • Audio auto-conversion: The script automatically converts any audio format (MP3, OGG, M4A, etc.) to WAV PCM 16-bit mono 48kHz before sending to FabricLipsync
  • Requires ffmpeg installed on the system
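
The audio auto-conversion noted above (WAV, PCM 16-bit, mono, 48 kHz) corresponds to a short ffmpeg invocation. The sketch below shows one way to build that command. `wav_conversion_command` is a hypothetical helper, and the exact flags the script uses are not confirmed by this page.

```python
def wav_conversion_command(src, dst):
    """Build an ffmpeg argv that writes 16-bit PCM, mono, 48 kHz WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", src,            # input file in any format ffmpeg can decode
        "-vn",                # ignore any video or cover-art stream
        "-ac", "1",           # one channel (mono)
        "-ar", "48000",       # 48 kHz sample rate
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        dst,
    ]
```

The list can be passed directly to `subprocess.run(..., check=True)`, which avoids shell quoting issues with file names that contain spaces.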
