Uses vision models to analyze video, generate reports, and highlight frames. Provided by the Vidu API.

v1.0.1

Extract and analyze keyframes from MP4, MOV, and AVI videos to identify themes, generate reports, and provide 3 representative screenshots.

by Vidu AI @x-jihua

Security Scan

VirusTotal: Benign

OpenClaw: Suspicious (medium confidence)

Purpose & Capability

The SKILL.md and shipped script rely on ffmpeg/ffprobe for extraction, reference a Feishu inbound path, and send output via Feishu, but the skill metadata lists no required binaries, env vars, or config paths. Using ffmpeg/ffprobe is legitimate for video processing, and Feishu integration can be reasonable, but the omission is an inconsistency that could lead to runtime failures or hidden assumptions about platform integrations.

Instruction Scope

The runtime instructions stay within the stated purpose (download video, extract keyframes, analyze images, send report). They reference specific agent filesystem paths (~/.openclaw/media/inbound and ~/.openclaw/media/keyframes) and instruct the agent to send results via Feishu. The instructions do not ask to read unrelated files or export data to unknown network endpoints, but they implicitly rely on platform-level Feishu messaging capabilities and an 'image' vision tool (which will send frames to whatever model backend the agent uses).

Install Mechanism

This is instruction-only with a small helper script — no install spec, which reduces supply-chain risk. However, the skill requires ffmpeg/ffprobe to be present on PATH; that dependency is not declared in metadata. Because extract_keyframes.sh invokes ffmpeg directly, the operator should ensure ffmpeg is installed from a trusted source.
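A preflight check along these lines could surface the undeclared dependency before the script runs. This is a minimal sketch, not part of the shipped skill; the function name `missing_tools` is hypothetical, and only the tool names come from the scan text.

```shell
#!/usr/bin/env bash
# Preflight sketch: report which required tools are missing from PATH.
# missing_tools is a hypothetical helper, not part of the shipped skill.
missing_tools() {
  local tool missing=()
  for tool in "$@"; do
    # `command -v` succeeds only if the shell can resolve the name.
    command -v "$tool" >/dev/null 2>&1 || missing+=("$tool")
  done
  # Print one missing tool per line; print nothing when all are present.
  if [ "${#missing[@]}" -gt 0 ]; then
    printf '%s\n' "${missing[@]}"
  fi
}
```

Calling `missing_tools ffmpeg ffprobe` prints nothing when both binaries resolve, so an empty result can be treated as a pass.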

Credentials

The skill declares no required environment variables or credentials, yet the SKILL.md mentions sending output via Feishu. If sending via Feishu requires credentials or tokens on the agent, those are not declared here. Additionally, the analysis step uses an 'image' vision tool — processing keyframes will transmit image data to the configured model backend, which may expose sensitive visual content; this risk is expected for this kind of skill but should be acknowledged and matched to declared policies/credentials.

Persistence & Privilege

The skill does not request always:true, does not modify other skills or system-wide settings, and only writes to its own output directory (~/.openclaw/media/keyframes). The script clears keyframe_*.jpg files in its output directory but does not attempt to alter other config files or credentials.

What to consider before installing

Before installing or enabling this skill:
- Confirm ffmpeg/ffprobe are installed and from a trusted package (the script depends on these but the metadata does not declare them).
- Verify how Feishu integration is handled on your agent: if the skill expects to send messages via Feishu, ensure appropriate credentials/tokens are present and intentional. The skill metadata does not list any Feishu env vars.
- Understand that the 'image' vision analysis will send extracted frames to whatever model/backend the agent is configured to use; do not analyze sensitive or private video content unless you trust that backend.
- Review and, if desired, run the included extract_keyframes.sh in a safe test environment to confirm it behaves as expected (it appears benign: it validates input, creates an output dir, clears keyframe files in that dir, and invokes ffmpeg).
- Consider asking the skill author (or the registry owner) to update metadata to list required binaries (ffmpeg/ffprobe) and to clarify any required platform credentials (Feishu) before trusting it in production.

Like a lobster shell, security has layers — review code before you run it.

latest vk975784m9eyvqwfrs0jvmk9411839frw

137 downloads

0 stars

1 version

Updated 4w ago

MIT-0

Video Analyzer

Overview

Extract keyframes from videos, analyze content with vision models, and generate comprehensive reports with 3 representative screenshots. Optimized for token efficiency using I-frame detection.

Workflow

Video Input → Extract Keyframes → Vision Analysis → Select Top 3 → Generate Report → Send Output

Step-by-Step Process

1. Download Video (if from Feishu)

When a user sends a video via Feishu, the file is auto-saved to:

~/.openclaw/media/inbound/<filename>.mp4
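If the exact filename is not known, the newest file in the inbound directory can be located with a small helper. The sketch below assumes filenames contain no newlines and that inbound files use the MP4, MOV, or AVI extensions listed in the skill description; `latest_inbound_video` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch: print the most recently modified video in an inbound directory.
# Assumes filenames contain no newlines; the function name is hypothetical.
latest_inbound_video() {
  local dir="${1:-$HOME/.openclaw/media/inbound}"
  # ls -t sorts newest first; patterns that match nothing only write to stderr.
  ls -t "$dir"/*.mp4 "$dir"/*.mov "$dir"/*.avi 2>/dev/null | head -n 1
}
```

The result is empty when the directory holds no matching files, so callers should check for an empty string before continuing.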

2. Extract Video Metadata

ffmpeg -i <video_path> 2>&1 | grep -E "(Duration|Video)"

Returns: duration, resolution, bitrate, codec info.
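The grep approach above scrapes ffmpeg's human-readable banner. As an alternative sketch, ffprobe can emit the same fields as key=value lines that are easier to parse. The function names `probe_video` and `probe_field` are hypothetical and not part of the skill.

```shell
#!/usr/bin/env bash
# Sketch: read basic metadata with ffprobe as key=value lines.
# Function names are hypothetical; this is not part of the shipped skill.
probe_video() {
  # Query the first video stream plus the container duration.
  ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name,width,height,bit_rate:format=duration \
    -of default=noprint_wrappers=1 "$1"
}

# Pull one field out of key=value text on stdin (first match wins).
probe_field() {
  awk -F= -v key="$1" '$1 == key { print $2; exit }'
}
```

For example, `probe_video <video_path> | probe_field duration` would print the duration in seconds. Some containers report `bit_rate=N/A` for the stream, so that field may need a fallback.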

3. Extract Keyframes

Use the provided script for optimal keyframe extraction:

bash ~/.openclaw/workspace/skills/video-analyzer/scripts/extract_keyframes.sh <video_path> [output_dir]

Parameters:

video_path: Path to video file (required)
output_dir: Output directory (optional, defaults to ~/.openclaw/media/keyframes/)

Output: JPEG images at 640px width, named keyframe_XX.jpg

Token efficiency: Uses I-frame detection to extract only meaningful frames, reducing token consumption by ~7% vs uniform sampling.
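For readers who want to see what I-frame extraction can look like, here is a minimal sketch using ffmpeg's `select` and `scale` filters. It is an assumption about the general technique, and the shipped extract_keyframes.sh may work differently; the function names and the JPEG quality value are hypothetical choices.

```shell
#!/usr/bin/env bash
# Sketch of I-frame extraction; the shipped extract_keyframes.sh may differ.
# Build the filter: keep only I-frames, then scale to the given width.
# A height of -2 preserves aspect ratio and keeps the height even.
keyframe_filter() {
  printf "select='eq(pict_type,I)',scale=%s:-2" "${1:-640}"
}

extract_iframes() {
  local video="$1" out_dir="${2:-$HOME/.openclaw/media/keyframes}"
  mkdir -p "$out_dir"
  # -vsync vfr writes each selected frame once instead of duplicating frames
  # to hold a constant rate; newer ffmpeg releases prefer -fps_mode vfr.
  ffmpeg -hide_banner -loglevel error -i "$video" \
    -vf "$(keyframe_filter 640)" -vsync vfr -q:v 3 \
    "$out_dir/keyframe_%02d.jpg"
}
```

Because encoders insert I-frames at scene changes and at fixed intervals, the number of files produced depends on how the video was encoded as well as on its content.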

4. Analyze with Vision Model

Use the image tool with all extracted keyframes:

prompt: "Analyze these keyframes from a video. Please:
1. Describe the video's theme and content
2. Select 3 most representative frames (explain why)"

5. Generate Report

Structure the analysis report:

## 📌 Video Theme
[Description]

## 🖼️ Representative Screenshots
| Frame | Reason |
|-------|--------|
| keyframe_XX | [Why representative] |
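The template above could be filled in by a small helper once the vision model has returned a theme and its chosen frames. This is a sketch only; `render_report` and its pipe-separated `frame|reason` argument format are hypothetical conventions, not part of the skill.

```shell
#!/usr/bin/env bash
# Sketch: render the report template from a theme and "frame|reason" pairs.
# The function name and the pipe-separated argument format are hypothetical.
render_report() {
  local theme="$1" pair
  shift
  printf '## 📌 Video Theme\n%s\n\n' "$theme"
  printf '## 🖼️ Representative Screenshots\n'
  printf '| Frame | Reason |\n|-------|--------|\n'
  for pair in "$@"; do
    # Split on the first "|": left side is the frame, right side the reason.
    printf '| %s | %s |\n' "${pair%%|*}" "${pair#*|}"
  done
}
```

A call such as `render_report "A cooking demo" "keyframe_01|Shows the title card"` prints the two headings followed by one table row per pair.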

6. Send Output

Send via Feishu:

Analysis report (text message)
3 representative screenshots (image messages)

Token Consumption Reference

| Video Length | Keyframes | Estimated Tokens |
|--------------|-----------|------------------|
| 5 seconds | 5-8 | ~8,000-14,000 |
| 15 seconds | 12-16 | ~20,000-28,000 |
| 30 seconds | 20-30 | ~35,000-50,000 |

Optimization tips:

Images account for 95%+ of tokens
Shorter videos = fewer tokens
Low-motion videos produce fewer keyframes
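Dividing each token figure in the reference table by its keyframe count gives roughly 1,600 to 1,750 tokens per frame (for example, 8,000 / 5 = 1,600 and 14,000 / 8 = 1,750). That per-frame range is inferred from the table rather than stated by the author, so the sketch below is only a rough planning aid; `estimate_tokens` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: rough token range for a given keyframe count.
# The per-frame bounds are inferred from the reference table (assumption).
estimate_tokens() {
  local frames="$1" low_per_frame=1600 high_per_frame=1750
  printf '%d-%d\n' "$((frames * low_per_frame))" "$((frames * high_per_frame))"
}
```

Actual usage will vary with the vision backend and its image tokenization, so treat the output as an order-of-magnitude estimate.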

Resources

scripts/

extract_keyframes.sh - Extract keyframes using ffmpeg I-frame detection

references/

ffmpeg_reference.md - Advanced ffmpeg commands for video processing
