Video to Operation Guide (视频转操作指南)

v1.0.0

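Automatically converts operation videos into illustrated Word operation guide documents, with support for smart screenshots, speech transcription, LLM content refinement, and flowchart generation.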

by @siyou315

MIT-0

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.

Security Scan

VirusTotal

Benign

OpenClaw

Benign

high confidence

✓

Purpose & Capability

Name/description (video → Word guide) match the included scripts: frame extraction, transcription, frame analysis, LLM refinement, and doc generation. The SKILL.md and code reference only relevant tools (ffmpeg, whisper/faster-whisper, python-docx, read_image) consistent with the stated capability.

✓

Instruction Scope

Runtime instructions focus on extracting frames, transcribing audio, calling the platform's read_image for visual analysis, merging results, and generating a Word doc. They instruct the main agent to read files under the frames directory and run read_image on those images — this is expected for the stated task and does not ask for unrelated system data or arbitrary file paths.
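The frame-extraction step can be sketched as follows. This is a minimal illustration, not the skill's actual script: the function name `build_frame_extraction_cmd`, the 5-second default interval, and the `frame_%04d.png` naming pattern are all assumptions. Only the ffmpeg options `-i` and `-vf fps=...` are standard ffmpeg usage.

```python
def build_frame_extraction_cmd(
    video: str, frames_dir: str, every_seconds: int = 5
) -> list[str]:
    """Build an ffmpeg argv that saves one frame every `every_seconds` seconds.

    Hypothetical helper: the interval and the output naming pattern are
    illustrative assumptions, not values taken from the skill.
    """
    if every_seconds <= 0:
        raise ValueError("every_seconds must be positive")
    # ffmpeg expands %04d into a zero-padded frame counter.
    output_pattern = f"{frames_dir}/frame_%04d.png"
    return [
        "ffmpeg",
        "-i", video,                      # input video
        "-vf", f"fps=1/{every_seconds}",  # sample one frame per interval
        output_pattern,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building the argv as a list rather than a shell string avoids quoting problems with file names that contain spaces.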

✓

Install Mechanism

No install spec in the registry (instruction-only), but the repository includes a sensible install script (scripts/install_deps.sh) that uses apt/brew and pip to install the expected dependencies. There are no downloads from unknown hosts or URL shorteners, and no arbitrary archives are extracted. Installation behavior is proportionate to the task.

ℹ

Credentials

The skill declares no required env vars, which matches registry metadata. Several scripts optionally use third-party API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY) for improved LLM refinement, and SKILL.md mentions optional OpenAI Whisper API and local faster-whisper. These optional keys are reasonable for higher-quality processing but are not required for basic local operation.
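The optional-key behavior described above might look like the sketch below. It is an assumption about how such a fallback could be structured, not the skill's code: the function name `pick_refinement_backend`, the returned labels, and the priority order (Anthropic before OpenAI) are hypothetical. Only the environment variable names come from the review.

```python
import os
from typing import Mapping, Optional


def pick_refinement_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Choose a refinement backend from whichever optional API key is set.

    Hypothetical helper: the priority order and the labels returned are
    illustrative assumptions.
    """
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # No key supplied (or only empty values): stay fully local.
    return "local"
```

Accepting the environment as a parameter keeps the decision easy to test and makes it explicit that leaving both keys unset results in no remote calls.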

✓

Persistence & Privilege

Skill is not always-enabled (always: false) and uses normal autonomous invocation settings. It does not request system-wide configuration changes or other skills' credentials. The skill operates on local files it creates (frames, transcripts) and does not attempt to modify agent configuration.

Assessment

This package appears coherent for converting tutorial videos into Word guides. Before installing:
- Ensure you have ffmpeg and Python available; the included install script uses apt/brew and pip to add expected packages.
- read_image is referenced as a platform built-in: confirm your agent environment provides that tool before relying on the main-dialog analysis step.
- Optional API keys (ANTHROPIC_API_KEY, OPENAI_API_KEY, and OpenAI Whisper API key) improve LLM/refinement performance; do not provide keys unless you trust the runtime environment. Keys are used by scripts to call remote LLMs; providing them grants those scripts access to those services.
- The scripts process local files (extracted frames, audio, transcripts). Make sure you run the skill only on videos you are allowed to process (they may contain sensitive UI or PII).
- If you need stricter isolation, run the pipeline locally without supplying external API keys and avoid enabling remote model calls.
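The first check in the assessment (ffmpeg and Python available) can be automated with a small pre-flight helper. This is a sketch under assumptions: `missing_tools` is a hypothetical name, and the tool list is illustrative. It cannot verify read_image, which is a platform tool rather than an executable on PATH.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper; `which` is injectable so the check can be
    exercised without depending on the local machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # Illustrative tool list; adjust to the interpreter name on your system.
    absent = missing_tools(["ffmpeg", "python3"])
    if absent:
        print("Missing tools:", ", ".join(absent))
```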

latest: vk973h149hrtr06hxhnxh17c5w584mz7e

License terms: https://spdx.org/licenses/MIT-0.html
