{"skill":{"slug":"llm-video-generator","displayName":"llm-video-generator","summary":"Generate videos from text descriptions using ZhipuAI CogVideoX-3 model. Supports text-to-video, image-to-video, and first/last frame-to-video generation. Aut...","description":"---\nname: llm-video-generator\ndescription: >\n  Generate videos from text descriptions using ZhipuAI CogVideoX-3 model.\n  Supports text-to-video, image-to-video, and first/last frame-to-video generation.\n  Automatically handles long videos (over 5s) by chaining multiple generation calls\n  with last-frame continuation. Use when the user asks to create/generate a video\n  from text, make a video, text-to-video, 文生视频, 生成视频, 做个视频, or any\n  request involving converting text/images into a video. Supports configuring video\n  content, style, resolution (up to 4K), frame rate (30/60fps), audio, and duration.\n---\n\n# LLM Video Generator\n\nGenerate videos via ZhipuAI CogVideoX-3. Each API call produces ~5s of video.\nFor longer videos, chain multiple calls using last-frame continuation, then concatenate.\n\n## Scripts\n\nAll scripts use `/opt/anaconda3/bin/python3`. Resolve `<skill-dir>` to this skill's directory.\n\n| Script | Purpose |\n|--------|---------|\n| `scripts/video_gen.py` | Core generation (3 modes: text2video, image2video, frames2video) |\n| `scripts/extract_last_frame.py` | Extract last frame from a video (for continuation) |\n| `scripts/concat_videos.py` | Concatenate multiple video segments into one |\n\n## Workflow\n\n### Step 1: Assess Request & Clarify\n\n**Clear request** → proceed to Step 2. 
A request is clear when:\n- Video content/scene is described with enough detail\n- Style or visual tone is specified or implied\n- Duration is stated (if it is not, default to 5s)\n\n**Vague request** → propose a plan first:\n\n```\n基于你的需求，我拟定了以下视频方案：\n\n📹 **视频内容**: [detailed scene description with key moments]\n🎨 **视频风格**: [e.g., 写实/动画/电影感/温馨...]\n⏱️ **视频时长**: [Xs, note: will be generated in 5s segments]\n🔊 **背景音乐**: 有/无\n📐 **分辨率**: 1920x1080\n🎞️ **帧率**: 30fps\n\n你觉得这个方案可以吗？需要调整哪些部分？\n```\n\nIterate with the user until confirmed.\n\n### Step 2: Estimate Time & Notify User\n\nBefore starting generation, calculate and report the estimated time:\n\n**Time estimation formula:**\n- Base: 1 minute per second of video (e.g., 20s video ≈ 20 minutes)\n- High-definition (4K or 60fps): add 30% to the base time (e.g., 20s 4K video ≈ 26 minutes)\n- Additional overhead: ~2 minutes for frame extraction, concatenation, and compression\n- Segment count: `ceil(target_duration / 5)`\n\n**MUST send this message to the user before starting generation:**\n\n```\n⏳ **视频生成预估**\n\n📊 分段计划：{N} 段（每段约5秒）\n⏱️ 预计总耗时：约 {estimated_minutes} 分钟\n📐 分辨率：{resolution}\n\n视频生成是一个耗时过程，请耐心等待。我会在每段完成后实时汇报进度。\n```\n\nExample for a 30s 1080p video:\n- 6 segments, base time = 30 minutes, +2 min overhead → ~32 minutes\n- Message: \"预计总耗时：约 32 分钟\"\n\nExample for a 20s 4K video:\n- 4 segments, base time = 20 * 1.3 = 26 min, +2 min overhead → ~28 minutes\n\n### Step 3: Plan Generation Segments\n\nEach API call produces ~5 seconds. Calculate segments: `ceil(target_duration / 5)`\n\nFor multi-segment videos, plan how the content evolves across segments. 
Write a prompt for each segment describing what happens in that 5-second window, maintaining visual continuity.\n\n### Step 4: Execute Generation with Progress Reports\n\n**CRITICAL: After each segment completes, IMMEDIATELY send a progress message to the user before starting the next segment.** Do not wait until all segments are done.\n\n**Progress message format (send via message tool or inline reply after each segment):**\n\n```\n✅ 进度：{completed}/{total} 段完成（第{N}段已生成）\n📝 内容：{brief segment description}\n⏱️ 本段耗时：{minutes}分钟\n📊 预计剩余：约 {remaining_minutes} 分钟\n```\n\n**Generation process:**\n\n**Segment 1 — Text-to-Video:**\n\n```bash\n/opt/anaconda3/bin/python3 <skill-dir>/scripts/video_gen.py text2video \\\n  --prompt \"<segment_1_prompt>\" \\\n  --quality quality --audio true --size 1920x1080 --fps 30 \\\n  --output-dir <output-dir> --max-wait 900\n```\n\n→ **Send progress message to user**\n\n**Segments 2+ — Image-to-Video (last-frame continuation):**\n\nFor each subsequent segment:\n\n1. Extract last frame from the previous segment's video:\n```bash\n/opt/anaconda3/bin/python3 <skill-dir>/scripts/extract_last_frame.py \\\n  <previous_video.mp4> --output <output-dir>/frame_segN.png\n```\n\n2. Generate next segment using the last frame as input:\n```bash\n/opt/anaconda3/bin/python3 <skill-dir>/scripts/video_gen.py image2video \\\n  --prompt \"<segment_N_prompt>\" \\\n  --image-url <output-dir>/frame_segN.png \\\n  --quality quality --audio true --size 1920x1080 --fps 30 \\\n  --output-dir <output-dir> --max-wait 900\n```\n\n3. 
→ **Send progress message to user**\n\nRepeat for all segments.\n\n**Alternative — Frames-to-Video mode:**\n\nIf you have both a starting and ending image for a segment:\n```bash\n/opt/anaconda3/bin/python3 <skill-dir>/scripts/video_gen.py frames2video \\\n  --prompt \"<description>\" \\\n  --first-frame <first.png> --last-frame <last.png> \\\n  --quality quality --audio true --size 1920x1080 --fps 30 \\\n  --output-dir <output-dir>\n```\n\n### Step 5: Concatenate Segments\n\nAfter all segments are generated, combine them:\n\n```bash\n/opt/anaconda3/bin/python3 <skill-dir>/scripts/concat_videos.py \\\n  --inputs <seg1.mp4> <seg2.mp4> ... \\\n  --output <output-dir>/final_video.mp4\n```\n\nIf the final file exceeds 25MB (Feishu upload limit), compress with ffmpeg:\n```bash\nffmpeg -i <input> -c:v libx264 -crf 32 -c:a aac -b:a 96k -vf \"scale=1280:720\" -y <output>\n```\n\n### Step 6: Deliver\n\n- Share the final video file with the user\n- For Feishu delivery: use feishu-send-file skill to send the .mp4 file\n- Final report:\n\n```\n🎬 **视频生成完成！**\n\n⏱️ 总时长：{duration}秒\n📦 文件大小：{size}MB\n📊 共 {N} 段，总耗时 {total_minutes} 分钟\n```\n\n## Prompt Tips\n\n- Use **English prompts** for best quality (translate Chinese descriptions)\n- Be specific: scene, camera angle, lighting, motion, atmosphere\n- Include style keywords: cinematic, realistic, cartoon, watercolor, etc.\n- For continuation segments, describe the **action progression**, not the full scene from scratch\n- Keep each segment prompt concise (1-3 sentences)\n\n## Parameters Reference\n\n| Parameter | Flag | Default | Options |\n|-----------|------|---------|---------|\n| Prompt | `--prompt` | (required) | Descriptive text |\n| Quality | `--quality` | `quality` | `quality` / `speed` |\n| Audio | `--audio` | `true` | `true` / `false` |\n| Resolution | `--size` | `1920x1080` | `1280x720`, `1920x1080`, `3840x2160` |\n| Frame rate | `--fps` | `30` | `30` / `60` |\n| Output dir | `--output-dir` | `.` | Any writable path |\n| 
Poll interval | `--poll-interval` | `10` | Seconds |\n| Max wait | `--max-wait` | `900` | Seconds (default raised from 600 to 900 for reliability) |\n\n## Error Handling\n\n- **Missing ZHIPU_API_KEY**: Ask the user to set the `ZHIPU_API_KEY` environment variable\n- **Missing zai-sdk**: `pip install zai-sdk` (under anaconda)\n- **Missing ffmpeg**: Install ffmpeg; it is required for frame extraction and concatenation\n- **Task timeout**: Increase `--max-wait` or retry; check task status manually via API\n- **Task failed**: Simplify the prompt and retry\n- **File too large for Feishu**: Compress with ffmpeg (reduce resolution or increase CRF)\n","tags":{"latest":"1.0.1"},"stats":{"comments":0,"downloads":566,"installsAllTime":0,"installsCurrent":0,"stars":0,"versions":2},"createdAt":1773370576520,"updatedAt":1778491872769},"latestVersion":{"version":"1.0.1","createdAt":1773374136541,"changelog":"**This update introduces user progress notifications and time estimates for multi-segment video generation.**\n\n- Added required time estimation and user notification before starting video generation, with detailed guidelines.\n- Introduced progress messages after each segment completes, including completion count, segment description, elapsed and remaining time.\n- Increased default timeout for segment generation from 600s to 900s for improved reliability.\n- Included new steps for file size handling: recommend compressing final video if it exceeds messaging platform limits.\n- Improved instructions for user communication and error handling throughout the workflow.","license":"MIT-0"},"metadata":null,"owner":{"handle":"baokui","userId":"s173j04vn0a1j8xfw7ksgfdkhd843ze0","displayName":"baokui","image":"https://avatars.githubusercontent.com/u/23089857?v=4"},"moderation":null}