Install
openclaw skills install video-post-production

End-to-end short-video post-production from one raw talking-head video: transcribe speech, build timed subtitle phrases, highlight key words, place sound effects, mix in background music, and render the final cut.
Use this skill when the user gives you one raw video and wants a finished short video with timed subtitles, keyword highlights, sound effects, and background music.
The default workflow is single-input automation: the user provides the video, and you handle the rest.
Required:

- The raw input video

Optional:

- Style preferences
- BGM and SFX audio files
If optional inputs are missing, proceed with defaults.
Create a working directory next to the input video:
<video_name>_output/

Expected output files:
- alignment.json
- production_plan.json
- subtitles.ass
- final.mp4

If the user does not specify a style, use the skill's defaults.
Do not block on missing BGM or SFX. A valid delivery can still be a cleanly subtitled video with no music or sound effects.
Confirm these are available:
- ffmpeg
- ffprobe
- python3
- faster-whisper (if transcription is needed)

Use:
ffmpeg -version | head -1
ffprobe -version | head -1
python3 -c "from faster_whisper import WhisperModel; print('faster_whisper OK')" 2>/dev/null || echo "Need: pip3 install faster-whisper"
If faster-whisper is missing:
pip3 install faster-whisper
Run:
python3 <skill-path>/scripts/align_speech.py \
--video "<input_video>" \
--output "<workdir>/alignment.json" \
--model "medium" \
--language "zh"
This produces word-level timing and segment-level timing.
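A minimal sketch of how word-level timings might be grouped into subtitle phrases. The record shape (`{"word", "start", "end"}`) and the length/pause thresholds are assumptions for illustration, not the script's actual format:

```python
def group_words(words, max_chars=14, max_gap=0.6):
    """Group word-level timings into short subtitle phrases.

    words: list of {"word": str, "start": float, "end": float}.
    Start a new group when the phrase would grow too long or a
    pause longer than max_gap seconds separates two words.
    """
    groups, current = [], []
    for w in words:
        if current:
            too_long = len("".join(x["word"] for x in current + [w])) > max_chars
            gap = w["start"] - current[-1]["end"]
            if too_long or gap > max_gap:
                groups.append(current)
                current = []
        current.append(w)
    if current:
        groups.append(current)
    return [
        {
            "text": "".join(w["word"] for w in g),
            "start": g[0]["start"],
            "end": g[-1]["end"],
        }
        for g in groups
    ]

words = [
    {"word": "坚持", "start": 0.0, "end": 0.4},
    {"word": "就是", "start": 0.45, "end": 0.8},
    {"word": "胜利", "start": 1.8, "end": 2.2},
]
print(group_words(words))  # the 1.0 s pause splits this into two phrases
```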
Read alignment.json and produce production_plan.json.
The plan must contain:
- subtitle_groups
- sfx
- bgm

Minimum shape:
{
"subtitle_groups": [],
"sfx": [],
"bgm": {
"mood": "inspirational",
"tempo": "medium",
"volume": -18
}
}
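Before moving on, the plan can be sanity-checked. A sketch (`validate_plan` is a hypothetical helper, not part of the skill's scripts):

```python
def validate_plan(plan):
    """Return a list of problems found in a production plan dict."""
    problems = []
    for key in ("subtitle_groups", "sfx", "bgm"):
        if key not in plan:
            problems.append(f"missing key: {key}")
    bgm = plan.get("bgm")
    # BGM volume is in dB relative to the voice track; 0 or above would bury it.
    if isinstance(bgm, dict) and bgm.get("volume", -18) >= 0:
        problems.append("bgm.volume should be negative")
    return problems

plan = {
    "subtitle_groups": [],
    "sfx": [],
    "bgm": {"mood": "inspirational", "tempo": "medium", "volume": -18},
}
print(validate_plan(plan))  # → []
```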
Subtitle grouping: split the transcript into short, timed phrases that follow natural speech pauses.
Keyword highlighting: emphasize at most one or two key words per subtitle group.
SFX placement:
Do not place SFX on every subtitle. Keep it sparse enough to feel intentional.
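One way to enforce that sparseness. In this sketch the candidate shape and the 4-second minimum spacing are assumptions, not skill requirements:

```python
def thin_sfx(candidates, min_spacing=4.0):
    """Drop SFX candidates that land too close to the previous kept one.

    candidates: list of {"time": float, "sound": str}, in any order.
    """
    kept = []
    for c in sorted(candidates, key=lambda c: c["time"]):
        if not kept or c["time"] - kept[-1]["time"] >= min_spacing:
            kept.append(c)
    return kept

hits = [
    {"time": 1.0, "sound": "ding"},
    {"time": 2.5, "sound": "whoosh"},  # too close to the ding, gets dropped
    {"time": 7.0, "sound": "pop"},
]
print(thin_sfx(hits))  # → [{'time': 1.0, 'sound': 'ding'}, {'time': 7.0, 'sound': 'pop'}]
```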
BGM choice: pick a mood and tempo that match the content, and keep the level low (around -18 dB) so the voice stays on top.
Run:
python3 <skill-path>/scripts/generate_subtitles.py \
--plan "<workdir>/production_plan.json" \
--output "<workdir>/subtitles.ass" \
--video-width 720 \
--video-height 1280 \
--font "Heiti SC" \
--font-size 48
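Two building blocks such a generator needs, sketched under standard ASS conventions: centisecond timestamps (H:MM:SS.cc) and inline colour override tags. The yellow `&H00FFFF&` code is an illustrative choice, not the skill's fixed palette:

```python
def ass_time(seconds):
    """Format seconds as an ASS timestamp: H:MM:SS.cc (centiseconds)."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"

def highlight(text, keyword, colour="&H00FFFF&"):
    """Wrap a keyword in ASS primary-colour override tags ({\\1c...}...{\\r})."""
    return text.replace(keyword, rf"{{\1c{colour}}}{keyword}{{\r}}")

print(ass_time(83.5))                   # → 0:01:23.50
print(highlight("坚持就是胜利", "胜利"))  # the keyword turns yellow on screen
```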
If you need deeper ASS styling guidance, read:
<skill-path>/references/ass_effects.md

Preferred order for BGM:

- A BGM file supplied by the user
- A track from the audio resources reference
- No BGM (acceptable fallback)
If BGM exists, prepare it:
ffmpeg -stream_loop -1 -i "<bgm_file>" -t <video_duration> \
-af "volume=-18dB,afade=t=in:d=2,afade=t=out:st=<duration-3>:d=3" \
-y "<workdir>/bgm_prepared.wav"
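The `<duration-3>` placeholder must be computed from the clip length. A sketch (`bgm_args` is a hypothetical helper) that assembles the same argument list:

```python
def bgm_args(bgm_file, duration, out="bgm_prepared.wav", volume_db=-18):
    """Build the ffmpeg argv for looping, trimming, and fading the BGM."""
    fade_out_start = max(0, duration - 3)  # start the 3 s fade-out near the end
    afilter = (
        f"volume={volume_db}dB,"
        f"afade=t=in:d=2,"
        f"afade=t=out:st={fade_out_start}:d=3"
    )
    return [
        "ffmpeg", "-stream_loop", "-1", "-i", bgm_file,
        "-t", str(duration), "-af", afilter, "-y", out,
    ]

print(" ".join(bgm_args("music.mp3", 42.0)))
```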
Preferred order for SFX:

- SFX files supplied by the user
- Files from the audio resources reference
- No SFX (acceptable fallback)
If no SFX files are available, read:

<skill-path>/references/audio_resources.md

Run:
python3 <skill-path>/scripts/render_video.py \
--input "<input_video>" \
--subtitles "<workdir>/subtitles.ass" \
--plan "<workdir>/production_plan.json" \
--bgm "<workdir>/bgm_prepared.wav" \
--sfx-dir "<workdir>/sfx" \
--output "<workdir>/final.mp4" \
--resolution "720x1280"
If no BGM exists, omit --bgm.
If no SFX directory exists, omit --sfx-dir.
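The two omissions can be handled mechanically. A sketch (`render_args` is a hypothetical helper) that skips flags whose inputs are absent:

```python
from pathlib import Path

def render_args(workdir, input_video, skill_path="<skill-path>"):
    """Build the render_video.py argv, omitting flags whose inputs are absent."""
    wd = Path(workdir)
    args = [
        "python3", f"{skill_path}/scripts/render_video.py",
        "--input", str(input_video),
        "--subtitles", str(wd / "subtitles.ass"),
        "--plan", str(wd / "production_plan.json"),
    ]
    if (wd / "bgm_prepared.wav").exists():
        args += ["--bgm", str(wd / "bgm_prepared.wav")]
    if (wd / "sfx").is_dir():
        args += ["--sfx-dir", str(wd / "sfx")]
    args += ["--output", str(wd / "final.mp4"), "--resolution", "720x1280"]
    return args

print(render_args("/tmp/demo_output", "demo.mp4"))
```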
The result should feel like a polished short-form knowledge video:

- Subtitles readable and tightly synced to speech
- Key words that stand out without clutter
- Sparse, intentional SFX
- BGM that sits under the voice
If the raw video is usable but supporting resources (BGM, SFX) are missing, deliver the subtitled video without them.
If transcription quality is poor, retry with a larger faster-whisper model or verify the --language setting.
Scripts:

- scripts/align_speech.py
- scripts/generate_subtitles.py
- scripts/render_video.py

References:

- references/ass_effects.md
- references/audio_resources.md

Before finishing, ensure:

- alignment.json exists
- production_plan.json exists
- subtitles.ass exists
- final.mp4 exists
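The checklist can be automated with a small helper (`missing_deliverables` is a hypothetical name):

```python
from pathlib import Path

def missing_deliverables(workdir):
    """Return the expected output files that do not exist yet in workdir."""
    expected = ["alignment.json", "production_plan.json", "subtitles.ass", "final.mp4"]
    wd = Path(workdir)
    return [name for name in expected if not (wd / name).exists()]

# A directory that does not exist is missing everything:
print(missing_deliverables("/tmp/does_not_exist_output"))
# → ['alignment.json', 'production_plan.json', 'subtitles.ass', 'final.mp4']
```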