Universal Video Analyzer Zh

v1.1.0

Universal Video Analyzer (Chinese edition): analyzes video content with multimodal large models, supporting frame recognition and speech-to-text, and generates a structured Chinese report. Works with Doubao, Zhipu, Tongyi Qianwen, and other models; users configure their own API key.

1 star · 125 downloads · 0 current · 0 all-time

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for lantianbaicai/universal-video-analyzer-zh.

Prompt preview: Install & Setup
Install the skill "Universal Video Analyzer Zh" (lantianbaicai/universal-video-analyzer-zh) from ClawHub.
Skill page: https://clawhub.ai/lantianbaicai/universal-video-analyzer-zh
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install universal-video-analyzer-zh

ClawHub CLI


npx clawhub@latest install universal-video-analyzer-zh
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name/description (video analyzer using multimodal models) matches the code and SKILL.md: the script extracts frames, transcribes audio with Whisper, and calls a multimodal /chat/completions endpoint. One inconsistency: the registry metadata reported "Required env vars: none", but both SKILL.md and doubao_video_analyzer.py require VIDEO_ANALYZER_API_KEY (and allow optional vars). This mismatch in registry metadata should be corrected but does not change the skill's purpose.
Instruction Scope
The runtime instructions are narrowly scoped to: install dependencies, install ffmpeg, run the script on a video file. The script reads environment variables (VIDEO_ANALYZER_API_KEY, MODEL, BASE_URL, WHISPER-related vars, FRAME_COUNT, etc.) and will upload base64-encoded keyframe images and the transcribed audio text to the configured model API. SKILL.md explicitly warns about this data flow. There is no evidence of hidden or unrelated data collection, but users should note that video frames and transcripts are transmitted to the external model endpoint.
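The upload flow described above can be sketched as follows. This is an illustrative reconstruction, not the skill's actual code: the function name is hypothetical, and the message shape assumes an OpenAI-compatible `/chat/completions` payload with `image_url` data URIs, as SKILL.md describes.

```python
import base64

def build_frame_message(frame_bytes: bytes, transcript: str) -> dict:
    """Pair one keyframe (as a base64 data URI) with the audio transcript
    in an OpenAI-compatible chat message. Illustrative sketch only."""
    data_uri = "data:image/jpeg;base64," + base64.b64encode(frame_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": data_uri}},
            {"type": "text", "text": f"Transcript:\n{transcript}"},
        ],
    }
```

Every frame sent this way leaves your machine, which is why the privacy notice below matters.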
Install Mechanism
There is no automated install spec (instruction-only for OpenClaw) and the code is included. The SKILL.md instructs installing Python packages (torch, openai-whisper, etc.) and ffmpeg; these are expected for local transcription and image processing. Whisper model downloads (~150MB) are performed at runtime—this is large but expected. No suspicious or remote arbitrary binary downloads were found in the package itself.
Credentials
The only required secret is an API key for the multimodal model (VIDEO_ANALYZER_API_KEY), which is appropriate for a skill that calls third-party model APIs. Optional environment variables control model name, base URL, and local whisper model location. The skill does not request unrelated credentials.
Persistence & Privilege
The skill does not request always:true, does not modify other skills or system-wide settings, and does not try to persist elevated privileges. It does recommend ways to store environment variables (including persistent environment variables), which is a convenience suggestion but not a dangerous privilege escalation.
Assessment
This skill will extract video keyframes and audio text and send them (images as base64 data URIs and the transcript) to whatever model API you configure with VIDEO_ANALYZER_API_KEY and VIDEO_ANALYZER_BASE_URL. If your videos contain sensitive or private content, don't use a third-party hosted API key (or use a self-hosted/private model endpoint). Note the repository's registry metadata omitted the required API key even though SKILL.md and the script require it — double-check you set VIDEO_ANALYZER_API_KEY correctly. Installing dependencies (torch, openai-whisper) can download large packages and Whisper models; ensure you have disk space and bandwidth. Finally, verify the BASE_URL and provider you point the API key to (default is a Volcengine endpoint) before sending sensitive data, and prefer temporary or scoped API keys where possible.

Like a lobster shell, security has layers — review code before you run it.

Latest: vk97bh2y2h4x44jvg4ympdqene184k7nx
125 downloads · 1 star · 2 versions
Updated 2w ago · v1.1.0 · MIT-0

Install Prerequisites

# 1. Install Python dependencies
pip install requests openai-whisper torch tenacity Pillow python-dotenv

# 2. Install ffmpeg (required)
# Windows: winget install Gyan.FFmpeg
# macOS: brew install ffmpeg
# Linux: sudo apt install ffmpeg

# 3. The Whisper model (~150 MB) downloads automatically on first run; make sure your network connection is working

⚠️ Data Privacy Notice

This skill sends video keyframe images (base64-encoded) and the speech transcript to the multimodal model API you configure. Please confirm that:

  • You accept sending video content to your chosen model provider
  • You treat sensitive or private videos with caution, or use a privately deployed model
  • Analysis results are saved locally only and are never uploaded elsewhere

Trigger Condition

This skill triggers automatically when the user sends a video file (.mp4, .mov, etc.) and wants its content analyzed.

Execution Command

python doubao_video_analyzer.py "{{video_path}}"

Configuration

Required Configuration

Set the environment variable VIDEO_ANALYZER_API_KEY to your multimodal model API key.
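Failing fast when the key is missing avoids confusing downstream API errors. A small sketch (the helper name is illustrative, not from the script):

```python
import os

def require_api_key() -> str:
    """Read the required API key, raising a clear error when it is unset."""
    key = os.environ.get("VIDEO_ANALYZER_API_KEY")
    if not key:
        raise RuntimeError("Set VIDEO_ANALYZER_API_KEY to your multimodal model API key")
    return key
```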

Optional Configuration

| Environment Variable | Description | Default |
| --- | --- | --- |
| VIDEO_ANALYZER_MODEL | Model name to use | doubao-seed-2-0-pro-260215 |
| VIDEO_ANALYZER_BASE_URL | API base URL | https://ark.cn-beijing.volces.com/api/v3 |
| WHISPER_MODEL_DIR | Local path to the Whisper model | downloaded automatically |
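The optional variables above fall back to documented defaults when unset. A sketch of that lookup pattern (the `setting` helper is illustrative; defaults are taken from the table):

```python
import os

# Documented defaults from the configuration table above.
DEFAULTS = {
    "VIDEO_ANALYZER_MODEL": "doubao-seed-2-0-pro-260215",
    "VIDEO_ANALYZER_BASE_URL": "https://ark.cn-beijing.volces.com/api/v3",
}

def setting(name: str) -> str:
    """Return the environment variable if set, otherwise its documented default."""
    return os.environ.get(name, DEFAULTS[name])
```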

Supported Model Examples

| Provider | MODEL value | BASE_URL |
| --- | --- | --- |
| Doubao | doubao-seed-2-0-pro-260215 | https://ark.cn-beijing.volces.com/api/v3 |
| Zhipu GLM-4V | glm-4v-plus | https://open.bigmodel.cn/api/paas/v4 |
| Tongyi Qianwen VL | qwen-vl-plus | https://dashscope.aliyuncs.com/compatible-mode/v1 |
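Switching providers only means changing the two environment variables. A hypothetical preset helper mirroring the table above:

```python
import os

# Presets copied from the supported-model table; the helper itself is a sketch.
PROVIDERS = {
    "doubao": ("doubao-seed-2-0-pro-260215", "https://ark.cn-beijing.volces.com/api/v3"),
    "glm": ("glm-4v-plus", "https://open.bigmodel.cn/api/paas/v4"),
    "qwen": ("qwen-vl-plus", "https://dashscope.aliyuncs.com/compatible-mode/v1"),
}

def use_provider(name: str) -> None:
    """Point the analyzer at one of the known providers via env vars."""
    model, base_url = PROVIDERS[name]
    os.environ["VIDEO_ANALYZER_MODEL"] = model
    os.environ["VIDEO_ANALYZER_BASE_URL"] = base_url
```

Remember that each provider needs its own API key in VIDEO_ANALYZER_API_KEY.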

Features

✅ Dual-track analysis: analyzes the video frames and the speech transcript together to produce a complete report
✅ Model-agnostic: works with many multimodal models; you choose which
✅ Structured output: automatically generates structured content such as scenes, key information, and highlights
✅ HTML visual report: automatically generates a well-formatted HTML report with keyframes, analysis results, and the transcript
✅ Usable in mainland China: supports domestic models such as Doubao, Zhipu, and Tongyi; no VPN required
✅ Robust error handling: ffmpeg error checks, API timeout protection, cross-platform path compatibility

Output Files

Each analysis automatically generates the following files:

  • {视频名}_分析报告.md: Markdown report
  • {视频名}_分析报告.html: HTML visual report (can be opened directly in a browser)
  • {视频名}_frames/: extracted keyframe images
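The output names follow the input video's base name. A sketch of that derivation, assuming the naming pattern listed above (the helper is illustrative, not the script's exact code):

```python
from pathlib import Path

def output_paths(video_path: str) -> dict:
    """Derive the three output locations from the input video's name."""
    p = Path(video_path)
    stem, parent = p.stem, p.parent
    return {
        "markdown": parent / f"{stem}_分析报告.md",
        "html": parent / f"{stem}_分析报告.html",
        "frames": parent / f"{stem}_frames",
    }
```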
