Video Subtitle Auto-Generator: Free Is Best (视频字幕自动生成器——免费的才是最好的)

v2.1.1

Automatically extracts a video's audio, transcribes it into a timestamped transcript, outputs SRT/VTT subtitles and a subtitled video, and distills a catchy title from the content.


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw to install chall2015/video-processor.

Prompt preview: Install & Setup
Install the skill "视频字幕自动生成器——免费的才是最好的" (chall2015/video-processor) from ClawHub.
Skill page: https://clawhub.ai/chall2015/video-processor
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install video-processor

ClawHub CLI

Package manager switcher

npx clawhub@latest install video-processor
Security Scan
VirusTotal
Benign
OpenClaw
Benign
high confidence
Purpose & Capability
The name/description (video→subtitles→burn-in→title extraction) align with the included scripts and SKILL.md. The code uses FFmpeg and Whisper/faster-whisper which are appropriate. Minor inconsistency: SKILL.md and README mention several module files (speech_recognition.py, subtitle_generator.py, title_extractor.py, video_renderer.py) in the example file tree that are not present in the provided manifest — only video_processor.py and subtitle_processor.py are included. Documentation also lists optional features (yt-dlp, stable-diffusion) that are not required by the included scripts. These are likely incomplete/overdocumented rather than malicious.
Instruction Scope
Runtime instructions and the scripts instruct extracting audio, running ASR (faster-whisper or simulated mode), generating SRT/VTT, optionally converting Traditional→Simplified, and calling ffmpeg to burn subtitles — all within the stated purpose. The code reads user-provided video/transcript/style files and writes output files; it does not reference unrelated system paths or unexpected remote endpoints. Documentation suggests optionally setting HF_ENDPOINT for model downloads, but the runtime actions that would reach networks are limited to downloading model files (expected for Whisper).
Install Mechanism
No install spec in the registry (instruction-only). Dependencies are installed via pip/OS package manager according to docs (faster-whisper, openai-whisper, ffmpeg, optionally yt-dlp, stable-diffusion). No arbitrary binary downloads or obscure URLs were found in the provided files. Model weights will be downloaded by the ASR library (faster-whisper/Whisper) from standard model hosting (Hugging Face) which is expected but may fetch large files.
Credentials
The skill declares no required environment variables or credentials (good). Documentation mentions HF_ENDPOINT as an optional mirror variable for model downloads; this is optional and not required. There are no requests for unrelated secrets or cloud credentials. Be aware that the ASR library will perform network downloads for model weights and may use the user's home cache directories (e.g., ~/.cache/huggingface/hub).
Persistence & Privilege
The skill does not request persistent/always-on privileges, does not declare always:true, and does not attempt to modify other skills or system-wide agent settings. It runs as invoked and writes output into user-specified output directories only.
Assessment
This skill appears internally consistent for generating subtitles: it uses FFmpeg and Whisper/faster-whisper (expected). Before installing/running: 1) Review the two included scripts yourself (they are the main runtime) and run them on non-sensitive sample videos first. 2) Expect large model downloads (tens to hundreds of MB or more) when using real ASR models — this requires network access and disk space in your user cache (e.g., ~/.cache/huggingface). 3) The docs reference extra modules and optional features (yt-dlp, stable-diffusion) that are NOT present in the package; treat those as optional planned features. 4) No credentials are requested, and the code does not contain obvious exfiltration; nonetheless run it in a sandbox or isolated environment if you are concerned. 5) If you only want offline/safe testing, use the simulated/mock mode (the scripts include a mock transcript path) rather than downloading models.


Tags: ffmpeg, latest, speech-to-text, subtitle, video, whisper
230 downloads
1 star
2 versions
Updated 1mo ago
v2.1.1
MIT-0

Short-Video Auto-Processing Skill

Created: 2026-03-23
Version: v1.0


✨ Skill Overview

Automatically processes short videos to provide:

  • 🎙️ Speech recognition - extracts the video's audio and transcribes it to text
  • 📝 Subtitle generation - produces SRT/VTT subtitle files
  • 🎯 Title extraction - distills catchy titles from the content
  • 📺 Subtitle burn-in - burns the subtitles into the video
  • 🎬 Video export - outputs a new video with subtitles

🎯 Use Cases

1. Repurposing creator content

  • Extract the video's content
  • Add Chinese subtitles
  • Republish across platforms

2. Course video processing

  • Add subtitles automatically
  • Extract chapter titles
  • Generate study materials

3. Interview/meeting records

  • Speech-to-text
  • Generate meeting minutes
  • Add timestamps

4. Short-video localization

  • Recognize subtitles in the original language
  • Translate into the target language
  • Burn in the new subtitles

🚀 Quick Start

Basic usage

# Process a single video
python scripts/video_processor.py \
  --input "video.mp4" \
  --output "./output"

# Process and add subtitles
python scripts/video_processor.py \
  --input "video.mp4" \
  --output "./output" \
  --add-subtitles

# Process and extract a title
python scripts/video_processor.py \
  --input "video.mp4" \
  --output "./output" \
  --generate-title

Full pipeline

# Full processing pipeline
python scripts/video_processor.py \
  --input "input/video.mp4" \
  --output "./output" \
  --extract-audio \
  --recognize-speech \
  --generate-subtitles \
  --add-subtitles \
  --generate-title \
  --export-video

📋 Processing Flow

1. Video analysis

Input video
  ↓
Extract audio
  ↓
Analyze video info (duration, resolution, etc.)
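In practice, the extraction step above is a single FFmpeg invocation. A minimal sketch of how a script might assemble it (the helper name and defaults are illustrative, not taken from the package):

```python
def build_extract_audio_cmd(video_path, audio_path, sample_rate=16000):
    """Assemble an ffmpeg command that strips the video stream and
    writes mono 16 kHz WAV, the input format Whisper expects."""
    return [
        "ffmpeg", "-y",           # overwrite output without asking
        "-i", video_path,
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample
        audio_path,
    ]

cmd = build_extract_audio_cmd("input/video.mp4", "output/audio.wav")
```

Run it with `subprocess.run(cmd, check=True)` once FFmpeg is on PATH.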

2. Speech recognition

Audio file
  ↓
Speech recognition (Whisper or another ASR)
  ↓
Timestamped transcript
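Whichever engine runs (faster-whisper, or the scripts' simulated mode), the result is a list of timestamped segments. A sketch of normalizing them and rendering transcript lines (the field names and helpers are illustrative assumptions):

```python
def normalize_segments(raw):
    """raw: iterable of (start_seconds, end_seconds, text) from the ASR engine."""
    return [
        {"start": float(start), "end": float(end), "text": text.strip()}
        for start, end, text in raw
    ]

def format_transcript(segments):
    """Render '[HH:MM:SS] text' lines for the full transcript."""
    lines = []
    for seg in segments:
        total = int(seg["start"])
        h, rem = divmod(total, 3600)
        m, s = divmod(rem, 60)
        lines.append(f"[{h:02d}:{m:02d}:{s:02d}] {seg['text']}")
    return "\n".join(lines)
```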

3. Subtitle generation

Transcript + timestamps
  ↓
Generate SRT/VTT subtitle files
  ↓
Apply subtitle styling

4. Title extraction

Full transcript
  ↓
AI distills the key points
  ↓
Generate several title options

5. Video export

Original video + subtitle file
  ↓
FFmpeg burns in the subtitles
  ↓
Output the new video
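The burn-in step maps onto FFmpeg's `subtitles` filter (libass). A sketch of the command a script might build (the helper name is illustrative):

```python
def build_burn_cmd(video_path, srt_path, out_path):
    """Burn srt_path into video_path via the libass `subtitles` filter;
    the audio stream is copied through untouched."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vf", f"subtitles={srt_path}",  # re-encodes video with burned-in text
        "-c:a", "copy",
        out_path,
    ]
```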

⚙️ Tech Stack

Core dependencies

  • FFmpeg - video processing, subtitle burn-in
  • OpenAI Whisper - speech recognition (optional)
  • Python - scripting

Optional dependencies

  • yt-dlp - video downloading
  • moviepy - video editing
  • stable-diffusion - cover-image generation

📁 File Structure

skills/video-processor/
├── SKILL.md                 # Skill description
├── README.md                # Usage documentation
├── scripts/
│   ├── video_processor.py   # Main processing script
│   ├── speech_recognition.py # Speech recognition module
│   ├── subtitle_generator.py # Subtitle generation module
│   ├── title_extractor.py   # Title extraction module
│   └── video_renderer.py    # Video rendering module
├── references/
│   ├── subtitle_styles.md   # Subtitle style configs
│   └── title_templates.md   # Title templates
├── output/                  # Output directory
│   ├── subtitles/           # Subtitle files
│   ├── videos/              # Output videos
│   └── titles/              # Title files
└── examples/                # Example files

🎨 Configuration

Speech recognition config

speech_recognition:
  # Recognition engine
  engine: "whisper"  # whisper | google | azure
  
  # Language
  language: "zh"     # zh | en | ja | ko
  
  # Model size
  model: "base"      # tiny | base | small | medium | large
  
  # Output format
  output_format: "srt"  # srt | vtt | txt

Subtitle style config

subtitle_style:
  # Font
  font: "Arial"
  font_size: 24
  font_color: "white"
  
  # Border
  border_color: "black"
  border_width: 2
  
  # Position
  position: "bottom"  # top | bottom | center
  margin: 50
  
  # Background
  background: true
  background_color: "black@0.5"  # 50% transparent black
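When burning subtitles with FFmpeg, a style config like the one above is typically passed through libass `force_style` overrides. A partial mapping sketch (the key names follow the ASS style format; the mapping itself is an illustration, not the package's code):

```python
def to_force_style(style):
    """Translate a few fields of the subtitle_style config into
    a libass force_style string for the ffmpeg subtitles filter."""
    parts = [
        f"FontName={style['font']}",
        f"FontSize={style['font_size']}",
        f"Outline={style['border_width']}",
        f"MarginV={style['margin']}",
    ]
    return ",".join(parts)

style = {"font": "Arial", "font_size": 24, "border_width": 2, "margin": 50}
filter_arg = f"subtitles=subtitles.srt:force_style='{to_force_style(style)}'"
```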

Title extraction config

title_generation:
  # Title style
  style: "clickbait"  # normal | clickbait | professional
  
  # Max length
  max_length: 30
  
  # Number of candidates
  count: 5
  
  # Extras
  include_emoji: true
  include_hashtags: true

📊 Output Examples

Generated subtitle file (SRT)

1
00:00:01,000 --> 00:00:03,500
Hello everyone, today we're sharing an interesting topic

2
00:00:03,500 --> 00:00:06,000
about how to use AI to process video content

3
00:00:06,000 --> 00:00:09,000
It will greatly improve our productivity
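Producing a file like the one above from timestamped segments is mostly timestamp formatting. A minimal, self-contained sketch:

```python
def srt_timestamp(seconds):
    """Format seconds as the SRT 'HH:MM:SS,mmm' timestamp."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments):
    """segments: iterable of (start, end, text) tuples; returns SRT text."""
    blocks = [
        f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"
        for i, (start, end, text) in enumerate(segments, 1)
    ]
    return "\n\n".join(blocks) + "\n"
```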

Extracted titles

Title option 1:
🔥 Process videos automatically with AI and 10x your efficiency!

Title option 2:
📹 Auto-subtitle short videos: this method is amazing!

Title option 3:
💡 Use AI to distill video titles and free up your hands!

Title option 4:
🎬 A video-processing powerhouse: subtitles + titles in one click!

Title option 5:
✨ A must-have for creators: the full AI video pipeline!

Output file structure

output/
└── video_20260323_153300/
    ├── original.mp4              # Original video
    ├── audio.wav                 # Extracted audio
    ├── transcript.txt            # Full transcript
    ├── subtitles.srt             # SRT subtitles
    ├── subtitles.vtt             # VTT subtitles
    ├── titles.txt                # Extracted titles
    ├── video_with_subtitles.mp4  # Subtitled video
    └── thumbnail.jpg             # Thumbnail

🔧 Installation

1. Install FFmpeg

Windows:

# Using winget
winget install ffmpeg

# Or download from: https://ffmpeg.org/download.html

macOS:

brew install ffmpeg

Linux:

sudo apt install ffmpeg

2. Install optional Python dependencies

pip install moviepy
pip install yt-dlp

3. Install a speech-recognition engine (optional; the scripts also ship a simulated mode)

# OpenAI Whisper
pip install openai-whisper

# Or faster-whisper (faster)
pip install faster-whisper

🎯 Usage Examples

Example 1: Basic processing

# Extract subtitles
python scripts/video_processor.py \
  --input "interview.mp4" \
  --extract-subtitles \
  --output "./output"

Example 2: Full pipeline

# Full processing
python scripts/video_processor.py \
  --input "course.mp4" \
  --output "./output" \
  --all

Example 3: Batch processing

# Batch-process videos
python scripts/video_processor.py \
  --input "./videos/" \
  --output "./output" \
  --batch
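Batch mode presumably just walks the input folder for supported formats. A sketch of that collection step (the helper is illustrative, not the script's actual code):

```python
from pathlib import Path

# Container formats the skill documents as supported
SUPPORTED = {".mp4", ".mov", ".avi", ".mkv", ".webm"}

def find_videos(folder):
    """Collect supported video files in folder, sorted for stable ordering."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.suffix.lower() in SUPPORTED
    )
```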

⚠️ Notes

Video formats

  • ✅ Supported: MP4, MOV, AVI, MKV, WebM
  • ⚠️ Recommended: MP4

Audio quality

  • ✅ Clear speech recognizes well
  • ⚠️ Background noise degrades recognition

Subtitle length

  • No more than 30 characters per line
  • No more than 2 lines per cue
  • Display each cue for 1-7 seconds
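Those length limits can be enforced with a simple character-based wrapper, which is adequate for CJK text where characters are uniform width (a sketch, not the package's logic):

```python
def wrap_subtitle(text, max_chars=30, max_lines=2):
    """Split text into at most max_lines lines of max_chars characters;
    overflow beyond the line limit is simply dropped in this sketch."""
    lines = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    return lines[:max_lines]
```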

Processing time

  • 1 minute of video ≈ 1-3 minutes of processing
  • Depends on video length and configuration

📚 Related Docs

  • README.md - full usage documentation
  • references/subtitle_styles.md - subtitle style guide
  • references/title_templates.md - title templates

✅ Feature Checklist

  • Audio extraction from video
  • Speech-to-text transcription
  • SRT/VTT subtitle generation
  • Smart title extraction
  • Subtitle burn-in
  • Batch processing
  • Multi-language support
  • Thumbnail generation
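Of the checklist items, SRT-to-VTT conversion is nearly mechanical: VTT prepends a `WEBVTT` header and uses `.` instead of `,` before the milliseconds. A minimal sketch:

```python
import re

def srt_to_vtt(srt_text):
    """Convert SRT cue timestamps to VTT form and prepend the header."""
    body = re.sub(r"(\d{2}:\d{2}:\d{2}),(\d{3})", r"\1.\2", srt_text)
    return "WEBVTT\n\n" + body
```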

Version: v1.0
Created: 2026-03-23
Status: ⏸️ framework complete, implementation pending
