Video Captions
Pass
Audited by ClawScan on May 1, 2026.
Overview
This is a coherent captioning skill. The main points to weigh before installing it are optional cloud transcription uploads, optional API keys, and user-directed package installs.
Safe to consider for local caption generation. Before installing or using it, decide whether you want fully local processing or cloud transcription, protect any API keys, and verify file paths before running ffmpeg or whisper commands.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If used on the wrong file or output path, local media files could be processed or overwritten unintentionally.
The skill instructs the agent/user to run local media-processing commands that read video/subtitle files and create rendered outputs.
ffmpeg -i video.mp4 -vf "subtitles=video.srt:force_style='FontName=Arial..." output.mp4
Confirm input and output paths before running ffmpeg or whisper commands, and keep original media backups for important projects.
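One way to act on this recommendation is a small pre-flight check that runs before any render. The sketch below is illustrative only: `check_render_paths` is a hypothetical helper name, not part of the skill, and the file names in the usage comment are placeholders.

```shell
# Hypothetical helper (not part of the skill): verify paths before rendering.
# Succeeds only if the input is an existing regular file and the output
# path does not exist yet, so nothing can be overwritten by accident.
check_render_paths() {
    input=$1
    output=$2
    if [ ! -f "$input" ]; then
        echo "input not found: $input" >&2
        return 1
    fi
    if [ -e "$output" ]; then
        echo "output already exists, refusing to overwrite: $output" >&2
        return 1
    fi
    return 0
}

# Example use (placeholder file names); ffmpeg's -n flag also refuses
# to overwrite an existing output file:
#   check_render_paths video.mp4 output.mp4 &&
#       ffmpeg -n -i video.mp4 -vf "subtitles=video.srt" output.mp4
```

Because the output check uses `-e`, it also catches the case where the input and output paths point to the same existing file.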
Installing packages can introduce normal dependency supply-chain risk, even though the packages are relevant to the captioning purpose.
The documentation recommends installing external Python packages, but the examples do not pin versions or provide lockfiles.
pip install openai-whisper ... pip install mlx-whisper ... pip install whisper-timestamped ... pip install stable-ts
Install dependencies in a virtual environment, prefer trusted package sources, and pin versions for repeatable production workflows.
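As a rough illustration of the pinning advice, the sketch below checks that every entry in a requirements file carries an exact `==` version before it is installed. `all_pinned` is a hypothetical helper name and `requirements.txt` is an assumed file name; neither comes from the skill. The check is deliberately simple and would also flag option lines such as `-r other.txt`.

```shell
# Hypothetical helper (not part of the skill): succeed only when every
# non-blank, non-comment line of a requirements file contains '=='.
all_pinned() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 2; }
    # Drop blank and comment lines, then look for any line without '=='.
    ! grep -v -E '^[[:space:]]*(#|$)' "$1" | grep -q -v '=='
}

# Example workflow (assumed file names), run inside a virtual environment:
#   python3 -m venv .venv
#   . .venv/bin/activate
#   pip install openai-whisper
#   pip freeze > requirements.txt      # records the exact installed versions
#   all_pinned requirements.txt && pip install -r requirements.txt
```

Committing the frozen file alongside the project lets later installs reproduce the same versions instead of pulling whatever is newest.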
Using cloud engines may consume paid API quota and gives the configured provider account authority to process the submitted media.
Cloud transcription options require user-provided provider API keys, which can authorize paid account usage.
# Requires ASSEMBLYAI_API_KEY
export ASSEMBLYAI_API_KEY=your_key
Only configure the API keys you intend to use, monitor provider usage, and avoid placing secrets in shared chat transcripts or reusable files.
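A minimal sketch of how a wrapper script might confirm that a key is configured without ever printing it. `require_api_key` is a hypothetical helper name; only the `ASSEMBLYAI_API_KEY` variable name comes from the skill's documentation.

```shell
# Hypothetical helper (not part of the skill): check that the named
# environment variable is exported and non-empty. The value itself is
# never echoed, so it does not end up in logs or chat transcripts.
require_api_key() {
    if [ -z "$(printenv "$1")" ]; then
        echo "$1 is not set; skipping cloud engine" >&2
        return 1
    fi
    return 0
}

# Example use:
#   require_api_key ASSEMBLYAI_API_KEY || exit 1
#
# In bash, a key can be entered for the current session only, without
# echoing it to the terminal or writing it to a file:
#   read -r -s -p "AssemblyAI key: " ASSEMBLYAI_API_KEY && export ASSEMBLYAI_API_KEY
```

Setting the key per session, rather than in a shared dotfile or script, keeps it out of reusable files.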
Private audio or video content could leave the local machine if the user chooses a cloud transcription workflow.
The optional Deepgram workflow sends the media file to an external transcription provider.
curl -X POST "https://api.deepgram.com/v1/listen?model=nova-2" ... --data-binary @video.mp4
Use the default local Whisper workflow for sensitive media, and use cloud engines only when the provider, cost, and data-handling terms are acceptable.
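One possible way to make local processing the default is to require an explicit opt-in before any cloud engine is selected. This is a sketch under stated assumptions: `choose_engine` and the `ALLOW_CLOUD_UPLOAD` variable are hypothetical names invented for illustration and are not part of the skill.

```shell
# Hypothetical helper (not part of the skill): print the engine to use.
# Any engine other than "local" is honored only when the user has set
# ALLOW_CLOUD_UPLOAD=yes (an assumed opt-in variable); otherwise the
# function falls back to local processing.
choose_engine() {
    requested=${1:-local}
    case "$requested" in
        local)
            echo local
            ;;
        *)
            if [ "${ALLOW_CLOUD_UPLOAD:-}" = "yes" ]; then
                echo "$requested"
            else
                echo "cloud upload not approved; falling back to local" >&2
                echo local
            fi
            ;;
    esac
}

# Example use:
#   engine=$(choose_engine deepgram)
#   [ "$engine" = "local" ] && echo "media stays on this machine"
```

With this shape, sending media off the machine takes a deliberate step, while sensitive files follow the local path by default.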
