Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

VoxCPM中文配音 (VoxCPM Chinese Dubbing)

v1.0.0

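🎯 **The only Chinese dubbing skill that uses VoxCPM** - One-click Chinese dubbing for foreign-language videos, with hard-subtitle detection, resumable runs, and smart BGM. Trigger scenarios: (1) the user needs to dub a foreign-language video (2) video translation requests (3) multilingual content localization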


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for newaiguy/video-dubbing.

Prompt Preview: Install & Setup
Install the skill "VoxCPM中文配音" (newaiguy/video-dubbing) from ClawHub.
Skill page: https://clawhub.ai/newaiguy/video-dubbing
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install video-dubbing

ClawHub CLI

npx clawhub@latest install video-dubbing

Security Scan

VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)

Purpose & Capability (flagged)
The declared purpose (VoxCPM-based Chinese dubbing) matches the core code (whisper transcribe, translate, VoxCPM TTS, ffmpeg processing). However the repository contains an upload_bilibili.py script that reads Bilibili credentials from a hard-coded local path and references Windows-specific tools/paths. Uploading is not documented in SKILL.md as requiring local credential files at that path, and the registry metadata declared no required env vars while SKILL.md and code require TRANSLATE_API_KEY and VOXCPM_DIR — an inconsistency.
Instruction Scope (flagged)
SKILL.md instructs the agent to call translation and vision APIs and to clone VoxCPM, which aligns with the code. But the code does extra things not clearly documented in the runtime instructions: scripts/upload_bilibili.py will load credentials from D:/openclaw_workspace/credentials/bilibili.json and call Bilibili upload code; this reads a local secret file and invokes external network operations unrelated to core dubbing flow. detect_hard_subtitle posts a base64 image to a vision API (expected), and the translation/vision endpoints default to a third-party (SiliconFlow). These behaviors are not fully documented in the top-level registry metadata.
Install Mechanism
No install spec; recommended pip installs and git clone of VoxCPM are reasonable and use known hosts (PyPI, GitHub, ModelScope). No arbitrary binary downloads or obscure URL extracts were specified.
Credentials (flagged)
The core skill reasonably needs a translation API key and a local VoxCPM model dir. However the upload script accesses a hard-coded local credential file (D:/openclaw_workspace/credentials/bilibili.json) and uses bilibili_api credentials that are neither declared in SKILL.md's required credentials nor in the registry metadata. That gives the package the ability to access unrelated local secrets if the user runs the uploader and the file exists. Using a single API key for both translate and vision calls (code passes the same key) is another subtle mismatch with the doc's separate endpoints.
Persistence & Privilege
The skill is not marked 'always: true' and has no special platform privileges. It does not appear to modify other skills. The main concern is runtime behavior (reading a hard-coded local credential file and performing uploads) rather than persistent elevated privileges.
What to consider before installing
This package implements a plausible VoxCPM-based dubbing pipeline, but inspect and/or remove the uploader before running. Specific actions to consider:

  • Review scripts/upload_bilibili.py: it reads credentials from a hard-coded path (D:/openclaw_workspace/credentials/bilibili.json) and will attempt to upload videos. If you don't want uploads or don't trust that behavior, delete or sandbox this file.
  • Confirm which API endpoints you will use for translation/vision (the defaults point to a third party, SiliconFlow). Only provide API keys to endpoints you trust; the code will send base64-encoded frames to the vision endpoint.
  • Fix the metadata mismatch: the registry lists no required env vars, but SKILL.md and the code expect TRANSLATE_API_KEY and VOXCPM_DIR; ensure you supply only necessary secrets.
  • Run the tool in an isolated environment (container or VM) and avoid placing sensitive credentials at the hard-coded path. Replace hard-coded paths with explicit, documented config or environment variables.
  • If you plan to publish/upload, audit the uploader and any third-party libraries (bilibili_api) and adjust paths (ImageMagick/ffmpeg) to match your system.

If you want, I can point to the exact lines that read the hard-coded credential path and suggest a safe patch to remove or parameterize the uploader.
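The advice to replace the hard-coded credential path with explicit configuration could look roughly like the sketch below. It is not the uploader's actual code: the environment variable name `BILIBILI_CREDENTIALS_FILE` and the function `load_bilibili_credentials` are hypothetical, and the sketch assumes the credential file is plain JSON.

```python
import json
import os
from pathlib import Path


def load_bilibili_credentials(env_var: str = "BILIBILI_CREDENTIALS_FILE") -> dict:
    """Load uploader credentials from a path the user sets explicitly.

    BILIBILI_CREDENTIALS_FILE is a hypothetical variable name, not one the
    skill defines. Nothing is read unless the user opts in by setting it.
    """
    path_value = os.environ.get(env_var)
    if not path_value:
        raise RuntimeError(f"{env_var} is not set; refusing to guess a credential path")
    path = Path(path_value).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"credential file not found: {path}")
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Failing loudly when the variable is unset means the uploader cannot silently pick up an unrelated secret that happens to sit at a default location.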

Like a lobster shell, security has layers — review code before you run it.

latest vk97akd0s8nf99mxf89t9xpkgax83n5qq
126 downloads
0 stars
2 versions
Updated 1mo ago
v1.0.0
MIT-0

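🎬 VoxCPM Chinese Video Dubbing

The only Chinese dubbing skill that uses the open-source VoxCPM model

Production-verified ✅ | Resumable runs ✅ | Smart BGM ✅

🌟 Key Selling Points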

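| Feature | Description |
| --- | --- |
| 🎯 VoxCPM exclusive | The only Chinese dubbing skill that integrates the open-source VoxCPM TTS model |
| Production-verified | 4 videos already published successfully on Bilibili |
| 🔄 Resumable runs | Continues after an interruption without regenerating earlier output |
| 🔍 Hard-subtitle detection | AI automatically detects and covers the original subtitles |
| 🎵 Smart BGM | Automatic looping with crossfades |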

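📋 Full Pipeline

1. Whisper transcription   → medium model transcription + timestamps
2. AI translation          → Tencent Hunyuan-MT translation model
3. Grouped TTS             → VoxCPM dubbing (generated per group to stay coherent)
4. Audio matching          → smart stretching / silence padding
5. Hard-subtitle detection → AI automatically detects whether masking is needed
6. Subtitle generation     → Chinese subtitles (automatic line wrapping)
7. Video merging           → GPU-accelerated encoding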

🚀 Quick Start

1. Install dependencies

# Python dependencies
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install openai-whisper soundfile scipy librosa requests

# VoxCPM (obtain from the official source)
git clone https://github.com/modelscope/VoxCPM.git

2. Configure

Copy the configuration template:

cp config.example.json config.json

Edit config.json:

{
  "work_dir": "./workspace",
  "voxcpm_dir": "./VoxCPM",
  "ffmpeg_path": "ffmpeg",
  "translate": {
    "api_url": "https://api.siliconflow.cn/v1/chat/completions",
    "api_key": "YOUR_API_KEY",
    "model": "tencent/Hunyuan-MT-7B"
  },
  "vision": {
    "api_url": "https://api.siliconflow.cn/v1/chat/completions",
    "model": "Qwen/Qwen2.5-VL-72B-Instruct"
  },
  "tts": {
    "reference_audio": "./reference_audio/speaker.wav",
    "reference_text": "Transcript of the reference audio"
  }
}

Note: every configuration option can be overridden by an environment variable. Precedence: environment variable > config.json > default value.
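The precedence rule above (environment variable, then config.json, then default) can be sketched as follows. This is a minimal illustration, not the skill's actual loader: `resolve`, `load_config`, and the `CONFIG_KEYS` mapping are hypothetical names, and only a few top-level keys are covered.

```python
import json
import os
from pathlib import Path

# Defaults copied from the environment-variable table in this document.
DEFAULTS = {
    "VOXCPM_DIR": "./VoxCPM",
    "WORK_DIR": "./workspace",
    "WHISPER_MODEL": "medium",
    "FFMPEG_PATH": "ffmpeg",
}

# Hypothetical mapping from environment variable to top-level config.json key.
CONFIG_KEYS = {
    "VOXCPM_DIR": "voxcpm_dir",
    "WORK_DIR": "work_dir",
    "FFMPEG_PATH": "ffmpeg_path",
}


def load_config(path="config.json"):
    """Read config.json if it exists, otherwise return an empty dict."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.is_file() else {}


def resolve(name, config, environ=os.environ):
    """Return the value for `name`: environment first, then config.json, then default."""
    if environ.get(name):
        return environ[name]
    key = CONFIG_KEYS.get(name)
    if key and config.get(key) is not None:
        return config[key]
    return DEFAULTS.get(name)
```

For example, `resolve("VOXCPM_DIR", load_config())` returns the `VOXCPM_DIR` environment value when it is set, falls back to `voxcpm_dir` from config.json, and finally to `./VoxCPM`.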


3. Run

python scripts/dubbing.py your_video.mp4

Output:

  • workspace/output/your_video_dubbed.mp4 - dubbed video
  • workspace/output/your_video.srt - subtitle file
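For readers unfamiliar with the .srt output, the sketch below shows how timed text is typically rendered in SubRip format. It is an illustration only: the `(start, end, text)` tuple layout and both function names are assumptions, and the skill's own subtitle writer may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) tuples as SRT blocks numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Blocks are separated by a blank line, as the format requires.
    return "\n".join(blocks)
```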

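⚙️ Parameters

Whisper parameters

| Parameter | Default | Description |
| --- | --- | --- |
| whisper.model | medium | Whisper model size |
| whisper.language | en | Source language |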

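TTS parameters

| Parameter | Default | Description |
| --- | --- | --- |
| tts.max_group_duration | 15.0 | Maximum duration per group (seconds) |
| tts.inference_timesteps | 10 | Number of inference steps |
| tts.cfg_value | 2.0 | CFG value |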
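To illustrate what tts.max_group_duration controls, here is one way consecutive transcript segments could be packed into groups for TTS. This is a greedy sketch under assumptions: the `(start, end, text)` segment shape and the name `group_segments` are hypothetical, and the skill's actual grouping logic may use other criteria (for example sentence boundaries).

```python
def group_segments(segments, max_group_duration=15.0):
    """Greedily pack consecutive (start, end, text) segments into groups.

    A group is closed when adding the next segment would make its span
    (first start to last end) exceed max_group_duration seconds. A single
    segment longer than the limit still forms its own group.
    """
    groups, current = [], []
    for seg in segments:
        if current and seg[1] - current[0][0] > max_group_duration:
            groups.append(current)
            current = []
        current.append(seg)
    if current:
        groups.append(current)
    return groups
```

With segments ending at 5 s, 12 s, 18 s, and 20 s, the first two fit within 15 s and form one group, and the remaining two form a second group.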

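Subtitle parameters

| Parameter | Default | Description |
| --- | --- | --- |
| subtitle.fontsize | 16 | Font size |
| subtitle.fontname | SimHei | Font name |
| subtitle.outline | 2 | Outline width |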

🎵 Adding BGM

python scripts/add_bgm.py <video> [BGM file] [output file]

Features:

  • BGM loops automatically (3-second crossfade)
  • Volume control (default 12%)
  • Automatic fade-in and fade-out
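The looping and volume behaviour listed above can be sketched on raw sample lists. This is an assumption-laden illustration, not the code in add_bgm.py: both function names are hypothetical, a linear crossfade is assumed, and plain Python lists are used only for clarity (a real script would more likely work on arrays or let ffmpeg do the mixing). For a 3-second crossfade, `fade_len` would be `3 * sample_rate`.

```python
def loop_with_crossfade(samples, target_len, fade_len):
    """Repeat `samples` until `target_len`, blending each seam linearly.

    At every seam the last `fade_len` samples of the output fade out while
    the first `fade_len` samples of the next copy fade in.
    """
    if not samples or fade_len < 0 or fade_len >= len(samples):
        raise ValueError("fade_len must be non-negative and shorter than the clip")
    out = list(samples)
    while len(out) < target_len:
        tail_start = len(out) - fade_len
        for i in range(fade_len):
            w = (i + 1) / (fade_len + 1)  # fade-in weight of the new copy
            out[tail_start + i] = out[tail_start + i] * (1 - w) + samples[i] * w
        out.extend(samples[fade_len:])
    return out[:target_len]


def mix_bgm(voice, bgm, bgm_gain=0.12):
    """Add BGM under the voice track at `bgm_gain` (12% by default)."""
    return [v + bgm_gain * b for v, b in zip(voice, bgm)]
```

Usage would be along the lines of `mix_bgm(voice, loop_with_crossfade(bgm, len(voice), 3 * sample_rate))`, so the music always covers the full length of the dubbed track.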

🔧 Advanced Usage

Test mode

Process only the first 30 seconds:

python scripts/dubbing.py video.mp4 --test 30

Specify the output name

python scripts/dubbing.py video.mp4 --output my_video

Custom configuration

python scripts/dubbing.py video.mp4 --config my_config.json

📁 File Structure

video-dubbing/
├── SKILL.md              # This document
├── config.example.json   # Configuration template
├── scripts/
│   ├── dubbing.py       # Main pipeline script
│   ├── add_bgm.py       # Adds BGM
│   └── upload_bilibili.py # Bilibili upload
└── reference_audio/      # TTS reference audio
    └── speaker.wav

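🔑 Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| TRANSLATE_API_KEY | Translation API key (required) | - |
| VOXCPM_DIR | VoxCPM directory | ./VoxCPM |
| WORK_DIR | Working directory | ./workspace |
| REFERENCE_AUDIO | Path to the TTS reference audio | ./reference_audio/speaker.wav |
| REFERENCE_TEXT | Text spoken in the reference audio | - |
| TRANSLATE_API_URL | Translation API endpoint | SiliconFlow |
| TRANSLATE_MODEL | Translation model | tencent/Hunyuan-MT-7B |
| VISION_API_URL | Hard-subtitle detection API endpoint | SiliconFlow |
| VISION_MODEL | Vision model | Qwen/Qwen2.5-VL-72B-Instruct |
| WHISPER_MODEL | Whisper model | medium |
| WHISPER_LANGUAGE | Source language | en |
| FFMPEG_PATH | Path to ffmpeg | ffmpeg |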

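📊 Audio Matching Quality

| Ratio range | Method | Quality |
| --- | --- | --- |
| < 0.85 | Pad with silence | ✅ Lossless |
| 0.85-1.15 | resample | ✅ Slight adjustment |
| > 1.15 | librosa speed-up | ⚠️ Slight distortion |

Measured: more than 60% of groups keep lossless audio quality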
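The threshold table can be read as a small decision function. The sketch below assumes that the ratio is the generated TTS duration divided by the duration of the original time slot; the source gives only the thresholds, so that definition, the function names, and the returned labels are all assumptions.

```python
def choose_matching_method(tts_duration, slot_duration, low=0.85, high=1.15):
    """Pick a strategy from the ratio of generated audio to the available slot.

    ratio = tts_duration / slot_duration is an assumed definition; the
    source table gives only the thresholds.
    """
    if slot_duration <= 0:
        raise ValueError("slot_duration must be positive")
    ratio = tts_duration / slot_duration
    if ratio < low:
        return ratio, "pad_silence"   # audio is short: keep it and append silence
    if ratio <= high:
        return ratio, "resample"      # close enough: small stretch by resampling
    return ratio, "speed_up"          # audio is long: time-compress it


def silence_padding(tts_duration, slot_duration, sample_rate):
    """Number of zero samples to append so short audio fills its slot."""
    return max(0, round((slot_duration - tts_duration) * sample_rate))
```

Under this reading, padding with silence leaves the synthesized samples untouched, which is consistent with the table marking that branch as lossless.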

⚠️ Notes

AV1-encoded video

AV1-encoded videos must be re-encoded:

# GPU encoding
-c:v h264_nvenc

# Or CPU encoding
-c:v libx264
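The encoder flags above are fragments of a full ffmpeg command. As a rough sketch, the codec check and re-encode could be wrapped like this; the function names are hypothetical, the audio stream is assumed to be copyable into the target container, and `video_codec` requires ffprobe to be installed.

```python
import subprocess


def video_codec(path, ffprobe="ffprobe"):
    """Return the codec name of the first video stream, e.g. 'av1' or 'h264'."""
    result = subprocess.run(
        [ffprobe, "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def reencode_command(src, dst, use_gpu=True, ffmpeg="ffmpeg"):
    """Build an ffmpeg command that re-encodes video to H.264 and copies audio."""
    encoder = "h264_nvenc" if use_gpu else "libx264"
    return [ffmpeg, "-y", "-i", src, "-c:v", encoder, "-c:a", "copy", dst]
```

A caller could run `subprocess.run(reencode_command(src, dst), check=True)` only when `video_codec(src) == "av1"`, and pass `use_gpu=False` on machines without NVENC support.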

VoxCPM model

The VoxCPM model must be obtained from ModelScope:

# Download the model to the target directory
modelscope download --model modelscope/VoxCPM --local_dir ./VoxCPM

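📜 License

MIT License

🙏 Acknowledgements


🎯 Reasons to choose VoxCPM Chinese Dubbing:

  1. Open source and free, with no commercial restrictions
  2. Best results for Chinese, natural and fluent
  3. Supports voice cloning (from reference audio)
  4. Runs locally, keeping data secure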
