Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

VN Skill

v0.1.0

Local video, audio and image processing expert for macOS, powered by VN Video Editor. Use this skill whenever the user wants to process video, audio or image...

1· 93·0 current·0 all-time

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for cawcut/vn-skill.

Previewing Install & Setup.
Prompt PreviewInstall & Setup
Install the skill "VN Skill" (cawcut/vn-skill) from ClawHub.
Skill page: https://clawhub.ai/cawcut/vn-skill
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install vn-skill

ClawHub CLI

Package manager switcher

npx clawhub@latest install vn-skill
Security Scan
Capability signals
CryptoRequires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotalVirusTotal
Suspicious
View report →
OpenClawOpenClaw
Benign
medium confidence
Purpose & Capability
The name/description (VN Video Editor helper for macOS) matches the commands and usage in SKILL.md and the CLI reference. One minor inconsistency: the registry metadata at the top said "No install spec" but SKILL.md includes an install entry that downloads a vnapp-cli release. Functionally, the requested resources (a VN CLI bridge and the VN app) make sense for the stated purpose.
Instruction Scope
Instructions stick to local file processing via a vnapp-cli binary and VN app and only reference local file paths and a local VN MCP server. However, the auto-captions option documents that some engines (whisper_turbo/whisper_medium) may download models on first use, which implies network activity beyond purely local disk I/O; SKILL.md still claims "no cloud upload, no API key required." The skill avoids asking for unrelated files or credentials.
Install Mechanism
SKILL.md metadata includes a download install that extracts a zip from a GitHub releases URL into ~/.openclaw/tools/vnapp-cli/. Using a GitHub release is reasonable, and a sha256 is provided, but the install writes and extracts binaries to the user's home directory — this raises the usual caution: extracted executables will run with the user's privileges, so verify the release and sha256 before trusting it.
Credentials
The skill requests no environment variables, no secrets, and no system config paths beyond installing to its own ~/.openclaw/tools directory. That is proportionate to the stated functionality.
Persistence & Privilege
always:false and default model-invocation are used. The skill does not request permanent/always-on privilege or modify other skills. Its install writes to a per-skill tools directory only.
Assessment
This skill appears to be a coherent VN Video Editor CLI wrapper for macOS, but before installing: 1) verify the GitHub release and the provided sha256 checksum for the vnapp-cli zip and ensure you trust the repository owner; 2) be aware the installer will extract and place an executable in ~/.openclaw/tools/vnapp-cli — that binary will run with your user privileges; 3) auto-captions using certain engines may download speech models the first time (network activity), so expect model downloads; 4) ensure VN Video Editor is installed and its MCP server is enabled; 5) if you are unsure, review the zip contents on a separate machine or run the tool in a sandboxed environment. If you want higher assurance, request the upstream source code or a signed binary from a publisher you trust.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎬 Clawdis
OSmacOS
latestvk976fe818t92z8ygecph6tteh585874k
93downloads
1stars
1versions
Updated 6d ago
v0.1.0
MIT-0
macOS

VN Video Editor Skill

Process video, audio, and images locally on macOS using VN Video Editor via the vnapp-cli bridge.

Introducing the Skill

When the user asks what this skill can do, or when this skill is first triggered in a session, introduce it with:

🎬 VN Video Editor Skill is ready!

I can process video, audio, and images locally on your Mac — no cloud upload, no API key needed.

What I can do:

  • 🎤 Auto-generate captions / burn SRT subtitles
  • 🗜️ Compress video or images
  • 🔇 Denoise audio or video
  • ✂️ Extract audio tracks or frame thumbnails
  • 🔗 Merge / concatenate video clips with transitions
  • 🖼️ Remove background from images or videos (Apple Silicon only)

About drafts: For the following operations, the VN draft is kept by default after export so you can re-edit in VN Video Editor: Auto-captions, Burn SRT, Compress video, Merge video, Video cutout

If you don't want to keep drafts, just tell me once and I'll default to no draft for the rest of this session. You can also override per task at any time.


Draft Preference (session-scoped)

  • Default: --keep-draft (draft kept after export)
  • If the user says they don't want to keep drafts (e.g. "don't keep drafts", "no draft", "skip draft") → set session preference to no draft; apply --no-keep-draft for all subsequent draft-supporting commands
  • If the user explicitly says "keep draft" for a specific task → apply --keep-draft for that task only, regardless of session preference
  • Only commands that support drafts are affected: auto-captions, add-caption, compress-video, concat-video, cutout-video
  • Commands that do not support drafts are never affected: extract-audio, extract-frame, compress-image, denoise, cutout-image

Parameter Clarification Policy

When the user provides ambiguous or incomplete parameters for any option (not just style), ask one focused question to clarify before proceeding. When parameters are clear and unambiguous, proceed directly.

Ask for clarification when:

  • A single color value is given without specifying which property it applies to.
    • e.g. "add captions, purple" — unclear whether purple is text color or stroke color. Ask: "Should purple be the text color or the stroke color?"
  • A number is given without context for what it controls.
    • e.g. "compress, 10" — unclear whether 10 means quality, resolution, or something else. Ask: "Should 10 be the quality (0.0–1.0 scale), or something else like max size?"
  • A required pair of values is partially missing and the intent cannot be inferred.
    • e.g. "white stroke" with no text color — ask what the text color should be.

Do NOT ask for clarification when:

  • Multiple distinct properties are clearly assigned.
    • e.g. "dark gray text, green stroke" — two separate properties, no ambiguity.
    • e.g. "purple stroke, width 1pt" — role of purple is explicit.
  • The request maps unambiguously to a single parameter.
    • e.g. "compress to 720p", "extract frame at 5s", "denoise with high level".

Rule: Only ask when the user's intent genuinely cannot be determined. Do not ask about optional parameters unless they are missing and required for the task to proceed. Ask one short, specific question per ambiguity.


When to use

Trigger this skill when the user wants to:

  • Extract audio from a video
  • Extract a frame / thumbnail / cover image from a video
  • Add auto-generated captions to a video
  • Burn SRT subtitles into a video
  • Compress a video or image
  • Concatenate / merge video clips
  • Denoise audio or video
  • Remove background from an image or video (portrait cutout / background removal)

Do NOT trigger this skill when:

  • The user is asking general questions about VN Video Editor (answer directly)
  • The user wants to edit a video timeline, add effects, or do creative editing (VN skill only handles export operations, not timeline editing)
  • The input file is not a local file (URLs, cloud files not supported)
  • The platform is not macOS

Required inputs

Before running any command, confirm you have all required inputs:

CommandRequiredAsk if missing
extract-audiovideo file pathyes
extract-framevideo file pathyes
auto-captionsvideo file pathyes; engine defaults to whisper_turbo
add-captionvideo file path + SRT file pathyes, ask for SRT path
compress-videovideo file pathyes; resolution/fps/bitrate are optional
compress-imageimage file pathyes; format/quality are optional
concat-videoat least 2 video file pathsyes
denoiseaudio or video file pathyes; level defaults to moderate
cutout-imageimage file pathyes; quality defaults to accurate
cutout-videovideo file pathyes
compress-video-estimatevideo file pathyes; all other params optional

Stop and ask the user when:

  • The input file path does not exist or was not provided
  • The user requests add-caption but no SRT file path was given
  • The user requests concat-video with fewer than 2 files
  • The user requests denoise --level custom but no --custom-value was provided
  • The user requests cutout-image but the quality intent is ambiguous (e.g. just "fast cutout" without clear context — ask: "Should I use accurate (best quality), balanced, or fast quality for the cutout?")
  • The requested operation is outside the scope of this skill (e.g. timeline editing, adding effects)

Do not guess or assume missing required inputs — always ask first.


Optional Parameters Prompt

For commands with meaningful optional parameters, ask the user before running. Keep the question brief and friendly. If the user says "default" or "just go ahead", use defaults immediately.


auto-captions

Ask:

Your video is ready for auto-captioning. Which recognition engine would you like to use?

EngineAccuracySpeedNotes
whisper_turbo (default)GoodFastOne-time download required on first use
whisper_mediumBetterSlowerOne-time download required on first use
whisper_baseFairFastBuilt-in, no download needed
whisper_tinyBasicFastestBuilt-in, no download needed

Language — source language spoken in the video (default: auto). Supported: auto en zh ja ko es pt ar hi id fr de ru it tr vi th pl uk nl sv fi ro cs hu he el bg hr sk sl lt lv et ms ta ur sw mk mi is hy az af

Style (optional, ask only if user wants to customise):

  • Font size (default: 37.4pt)
  • Text color (default: white #FFFFFF)
  • Stroke color (default: black #000000)
  • Stroke width (default: 0.5pt)

Keep draft — keep the VN draft after export for re-editing? (default: on; pass --no-keep-draft only if the user explicitly asks to delete it)


add-caption

Ask:

SRT file received. Would you like to customise the subtitle style, or use a preset?

Preset styles (specify by ID):

IDText colorStroke colorStroke widthBackgroundEffect
default_0 (default)WhiteBlack4NoneStandard white text, black stroke
caption_1WhiteBlack10NoneBold black stroke
caption_2BlackWhite10NoneBold white stroke
caption_11WhiteBlack20NoneExtra-bold black stroke
caption_12BlackWhite20NoneExtra-bold white stroke
caption_21WhitePurple #D834DB3NonePurple stroke
caption_22WhiteCyan #34DBDB3NoneCyan stroke
caption_23Dark orange #6D340EWhite20NoneWarm tone
caption_31WhiteNone0Solid black #000000Black background
caption_32BlackNone0Solid white #FFFFFFWhite background
caption_33WhiteNone0Semi-transparent blackSemi-transparent black bg
caption_34BlackNone0Semi-transparent whiteSemi-transparent white bg
caption_35BlackYellow #FFFC5410Semi-transparent blackYellow stroke + black bg
caption_36WhiteBlack10Semi-transparent whiteBlack stroke + white bg

Or customize manually (optional):

  • Font size (default: 13pt)
  • Text color (hex, e.g. #FFFFFF)
  • Stroke color (hex)
  • Stroke width (pt)
  • Background color (hex)
  • Opacity (0.0–1.0)

Keep draft — keep the VN draft after export for re-editing? (default: on; pass --no-keep-draft only if the user explicitly asks to delete it)


compress-video

Ask:

Ready to compress. Any specific settings, or use defaults?

OptionDescriptionDefault
Resolution144p 240p 360p 480p 540p 720p 1080p 2.7K 4K 5K 6K 8K or custom short-edge (e.g. 720)Same as source
Frame rate6 12 24 25 30 50 60 fpsSame as source
BitrateManual kbps, e.g. 6000 (~6 Mbps), range 100–300000 kbpsAuto (calculated from resolution × frame rate)
Codech264 (best compatibility) / hevc (better compression; required for HDR)h264
Output formatmp4 / movmp4
HDRForce HDR output (auto-switches to HEVC)Off

Auto bitrate reference:

Resolution24fps30fps60fps
480p2133k2667k4053k
720p3520k5000k6400k
1080p5333k9000k9600k
2.7K10667k12800k25600k
4K14400k18133k42667k
5K23467k29867k74667k
8K53333k68267k170667k

HDR mode adds ~12.5% to the above values. Audio bitrate is fixed at 256k and cannot be changed.

Keep draft — keep the VN draft after export for re-editing? (default: on; pass --no-keep-draft only if the user explicitly asks to delete it)


compress-image

Ask:

Ready to compress. Any specific settings, or use defaults?

OptionDescriptionDefault
FormatOutput format: jpeg, png, webp, heicjpeg (unless user specifies otherwise)
Quality0.0 (smallest file) to 1.0 (best quality); ignored for PNG0.8
Max widthResize so width does not exceed this many pixelsOriginal
Max heightResize so height does not exceed this many pixelsOriginal
Keep aspect ratioMaintain original proportions when resizingOn

Format rule: Default output format is always jpeg regardless of input format, unless the user explicitly specifies a format. If the user says "keep as PNG" or specifies webp / heic, use that format instead.


concat-video

Ask:

Ready to merge the clips. A few options:

OptionDescriptionDefault
Keep draftKeep the VN draft after export so you can re-edit it laterOn (draft kept by default; only skip if user explicitly asks)

Would you like a transition between them?

Transition duration — 0.2–5.0 s (step 0.1s). Each clip must be at least as long as the transition duration (some effects require 2×).

Basic transitions:

StyleDefault (s)Max (s)Min clip lengthDescription
none (default)0NoneHard cut, no transition
fade_black0.85.0≥ durationFade through black
fade_white0.85.0≥ durationFade through white
dissolve0.85.0≥ duration×2Cross dissolve
color_difference_dissolve0.85.0≥ duration×2Color-difference dissolve
blur0.85.0≥ duration×2Gaussian blur transition
pixelate0.85.0≥ duration×2Pixelate transition
zoom_blur0.85.0≥ durationZoom-in with motion blur
zoom_blur_reverse0.85.0≥ durationZoom-out with motion blur
rotate_zoom0.85.0NoneRotate and zoom in
rotate_zoom_reverse0.85.0NoneRotate and zoom out
rotate_blur0.85.0≥ durationRotate with blur
rotate_blur_reverse0.85.0≥ durationReverse rotate with blur
spin0.85.0≥ durationSpin clockwise
spin_reverse0.85.0≥ durationSpin counter-clockwise
blink0.45.0NoneQuick flicker cut
floodlight0.85.0NoneBright flash between clips
circle_crop_center0.65.0≥ duration×2Circle opens from center
circle_crop_center_reverse0.65.0≥ duration×2Circle closes to center
slide_from_top0.65.0≥ duration×2New clip slides in from top
slide_from_left0.65.0≥ duration×2New clip slides in from left
slide_from_bottom0.65.0≥ duration×2New clip slides in from bottom
slide_from_right0.65.0≥ duration×2New clip slides in from right
wipe_from_top0.65.0≥ duration×2Wipe in from top
wipe_from_left0.65.0≥ duration×2Wipe in from left
wipe_from_bottom0.65.0≥ duration×2Wipe in from bottom
wipe_from_right0.65.0≥ duration×2Wipe in from right
push_from_top0.65.0≥ duration×2Both clips push upward
push_from_left0.65.0≥ duration×2Both clips push leftward
push_from_bottom0.65.0≥ duration×2Both clips push downward
push_from_right0.65.0≥ duration×2Both clips push rightward
reveal_vertical0.85.0≥ durationVertical reveal
reveal_horizontal0.85.0≥ durationHorizontal reveal
shake_from_top0.85.0NoneMotion blur from top
shake_from_left0.85.0NoneMotion blur from left
shake_from_bottom0.85.0NoneMotion blur from bottom
shake_from_right0.85.0NoneMotion blur from right
shake_from_top_left0.85.0NoneMotion blur from top-left
shake_from_bottom_left0.85.0NoneMotion blur from bottom-left
shake_from_bottom_right0.85.0NoneMotion blur from bottom-right
shake_from_top_right0.85.0NoneMotion blur from top-right

Matte transitions (shape cutout reveals; each clip must be ≥ transition duration):

StyleDefault (s)Max (s)Description
circle_11.01.5Circle matte reveal
circle_21.01.0Circle matte variant
line_11.01.5Line matte sweep
line_21.01.8Line matte variant
line_31.01.0Line matte variant
hexagon1.01.4Hexagon grid reveal
square_11.02.4Square matte reveal
square_21.01.2Square matte variant
square_31.52.5Square matte variant
ink_11.01.6Ink splatter reveal
ink_21.01.4Ink splatter variant
paint_11.01.4Paint brush reveal
paint_21.02.0Paint brush variant
sea1.01.6Wave reveal
swirl1.01.5Swirl reveal
zebra1.53.1Zebra stripe reveal
memory1.52.1Dreamy film reveal
lens1.52.5Lens flare burst
glitch1.51.8Digital glitch distortion

denoise

Ask:

Ready to denoise. What level would you like?

LevelDescriptionBest for
lowSubtle reduction, preserves natural soundLight background hiss
moderate (default)Balanced, works well for most recordingsIndoor recordings, mild noise
highStrong reduction, may slightly affect voiceOutdoor recordings, fan noise
veryHighAggressive, prioritises noise removal over naturalnessVery noisy environments
customSet a specific value manually (you provide a number)Fine-tuned control
OptionDescriptionDefault
High-pass filterRemoves low-frequency rumble, e.g. air conditioning, engine humOff
Audio onlyIf input is a video, output denoised audio only (not video)Off

cutout-image

Ask:

Ready to remove the background. Which quality level would you like?

LevelDescriptionBest for
accurate (default)Highest quality, slowestFinal output, complex edges
balancedGood quality, moderate speedMost use cases
fastFastest, lower qualityPreviews, quick results

Export mask (optional) — also export the portrait segmentation mask as a separate PNG? (default: off)

Output will be a PNG with transparent background.

Note: If quality intent is ambiguous (e.g. just "quick" or "fast cutout"), ask: "Should I use accurate (best quality), balanced, or fast for the cutout?"


cutout-video

Before proceeding, check if the Mac is Apple Silicon:

uname -m
  • If output is arm64 → proceed
  • If output is x86_64 (Intel Mac) → stop and tell the user:

    cutout-video is not supported on Intel Macs. This feature requires Apple Silicon (M1 or later).

Ask:

Ready to remove the background from this video. Would you like any of these options, or shall I go ahead with defaults?

OptionDescriptionDefault
FeatherSoften cutout edges (0–100)0 (hard edge)
ExpandExpand (+) or shrink (-) mask edge (-20–20)0
Stroke styleAdd an outline: solid, singleLayer, glowing, solidGlowing, singleLayerGlowing, shadow, offsetNone
Stroke sizeStroke thickness (0–100)30
Stroke distanceDistance from edge to stroke (0–100)20
Stroke colorRGBA hex, e.g. FFFFFFFF for whiteFFFFFFFF
Stroke opacity0.0–1.01.0

Output will be an MP4 with the person composited on a black background.

Clarification rules:

  • If user says "add a stroke" without specifying style → ask: "Which stroke style? solid, glowing, shadow, or another?"
  • If user gives a color without format (e.g. "red stroke") → convert to RGBA hex (FF0000FF)
  • If user specifies glowing style, max stroke size is 60
  • offset style: size and distance range is -50..50

Keep draft — keep the VN draft after export for re-editing? (default: on; pass --no-keep-draft only if the user explicitly asks to delete it)


extract-audio

Ask:

Ready to extract audio. No required options — shall I go ahead with the default settings?

The audio will be exported as M4A format from the original audio track.


extract-frame

Ask:

Which frame would you like to extract?

OptionChoicesDefault
Positionfirst — very first frame / last — very last frame / custom — specify a timefirst
TimeTime in seconds, e.g. 5.5 (only needed when position is custom)
Formatpng — lossless, larger file / jpeg — smaller file, slight quality losspng
Max sizeResize so the longest edge fits within this many pixels, e.g. 1920original size

1. Agent Behavior (follow this strictly)

1.1 Startup checks (run once per session before any task)

Run these checks in order. Stop and handle any failure before continuing.

Step 0 — Check macOS version:

sw_vers -productVersion
  • If macOS version is below 13.0 → stop and tell the user:

    This skill requires macOS 13.0 or later. VN Video Editor is not supported on your current macOS version. Do not proceed.

  • If macOS 13.0 or later → continue to Step 1

Step 1 — Check VN is installed:

test -d /Applications/VN.app && echo "installed" || echo "not installed"

Step 2 — Check VN version:

The CLI automatically emits a version-check JSON event during startup of any command. You do not need to run --version separately — read the event when running the first real command.

Minimum required: VN 0.22

Example event:

{"type":"version-check","message":"VN Video Editor 0.22 (654) — OK"}

Step 3 — Check CLI is installed and up to date:

test -x ~/.openclaw/tools/vnapp-cli/vnapp-cli && echo "ok" || echo "missing"
  • If missing → follow § 2C CLI missing or outdated (auto-install silently)
  • If ok → check version:
    ~/.openclaw/tools/vnapp-cli/vnapp-cli --version
    
    Expected: vnapp-cli 0.1.0 (5)

Step 4 — Validate input file(s):

test -f "/absolute/path/to/file" && echo "exists" || echo "missing"
  • If missing → tell the user the file was not found and ask for a valid path
  • For add-caption: also validate the SRT file path
  • For concat-video: validate all input files

1.2 Running a task

  1. Resolve the correct command from § 1.5 Intent mapping
  2. Build the command using absolute paths — never relative paths
  3. Always append --stream
  4. For commands that support drafts (auto-captions, add-caption, compress-video, concat-video): always append --keep-draft unless the user explicitly asks to delete the draft. Do NOT add --keep-draft to extract-audio, extract-frame, compress-image, denoise, cutout-image, or cutout-video — these commands do not support it.
  5. For add-caption — ask the user for the SRT file path if not provided; both the video and SRT will be copied to the VN sandbox automatically
  6. For denoise --level custom--custom-value is required; ask the user if not provided
  7. For compress-video — follow § 1.7 compress-video pre-flight before running
  8. Run the command and parse JSON lines as they arrive
  9. Report progress to the user per § 1.3 Progress reporting
  10. On completed + outputPath → move the output to the same directory as the input, rename per § 1.4 Output file handling, then tell the user the final local path
  11. On error → report in plain language and follow § 2 Failure handling
  12. After completing any task → follow § 1.8 Preview delivery if the user is on a remote channel

1.3 Progress reporting

Map CLI events to user-facing messages:

CLI event / messageMessage to user
discovering🔍 Looking for VN Video Editor...
connected✅ Connected, preparing...
copying📂 Copying file to VN sandbox...
started⚙️ Task started (jobId if available)
progress message = Preparing auto captions...⚙️ Preparing...
progress message contains Downloading⏬ Downloading model... N% (extract % from message if present, report every ~10%)
progress message contains Model downloaded✅ Model ready
progress message contains Creating temporary project📂 Preparing project...
progress message contains Starting speech recognition🎤 Starting speech recognition...
progress message matches Speech recognition (N%)🎤 Recognizing speech... N%
progress message contains Adding captions✏️ Adding captions to timeline...
progress message contains Preparing export📦 Preparing export...
progress message matches Exporting video... N%🔄 Exporting... N% (report every ~10%)
progress message contains Finalizing🔄 Almost done...
completed✅ Done!
error❌ Error: {message}

Download detection rule: If message contains Downloading, the model is being downloaded for the first time. If Starting speech recognition appears without any prior Downloading message, the model is already cached — do not treat it as a download issue.

Throttle rule: For download percentage and export percentage, only report to the user every ~10% increment. Do not report every JSON tick. | file-copied | 📂 File copied, starting task... | | completed | ✅ Done! Output: {outputPath} | | error | ❌ Error: {message} |

Rules:

  • Report at meaningful intervals (~every 10%). Do not report every JSON tick.
  • Skip heartbeat events entirely.
  • Capture jobId from the started event — use it if the user asks to cancel.
  • Treat completed + outputPath as the only success condition.
  • Treat error or non-zero exit as failure.

1.4 Output file handling

After a task completes:

  • Move the output to the same directory as the input file
  • Rename using the original filename + a suffix:
CommandSuffix
extract-audio_audio
extract-frame_frame
auto-captions_captioned
add-caption_captioned
compress-video_compressed
compress-image_compressed
denoise_denoised
concat-video_merged
cutout-image_cutout
cutout-video_cutout

cutout-video outputs MP4 (person composited on black background). Requires Apple Silicon (M1 or later) — not supported on Intel Macs. cutout-image outputs PNG (transparent background). compress-video-estimate does not produce an output file — result is JSON printed to stdout only.

Example: /path/to/clip.mp4/path/to/clip_captioned.mp4


1.7 compress-video pre-flight

Always run compress-video-estimate before compress-video. This command runs entirely locally (no VN connection needed) and returns the original video metadata plus estimated output size.

Flow:

Step 1 — Run estimate

~/.openclaw/tools/vnapp-cli/vnapp-cli compress-video-estimate "/path/to/video.mp4" [-r <res>] [--fps <fps>] [-b <kbps>] [--hdr]

Output:

{
  "type": "estimate",
  "original": {
    "fileSize": 256901120,
    "width": 1920, "height": 1080,
    "fps": 30, "bitrateKbps": 8500,
    "duration": 62.5, "codec": "avc1"
  },
  "estimated": {
    "width": 1280, "height": 720,
    "fps": 30, "bitrateKbps": 5000,
    "hdr": false, "fileSizeBytes": 39062500
  }
}

Step 2 — Check if estimated < original

  • If estimated.fileSizeBytes < original.fileSize → go to Step 3

  • If estimated.fileSizeBytesoriginal.fileSize → auto-retry with adjusted parameters (max 2 retries):

    User did NOT specify resolution or bitrate:

    • Retry 1: drop one resolution tier (source → 480p)
    • Retry 2: drop another tier (480p → 360p)
    • Run compress-video-estimate again with new params each retry

    User DID specify resolution:

    • Keep the user's resolution, reduce bitrate by ~50% each retry
    • Do not change resolution without user consent

    After 2 retries still ≥ original: present numbered options to the user:

    I tried several configurations but couldn't estimate a smaller output. Here are some options:

    Option 1 — 360p, ~1500 kbps → estimated ~X MB Option 2 — 240p, ~800 kbps → estimated ~X MB

    Reply with the option number to proceed, or cancel to stop.

    Wait for the user's reply before proceeding.

Step 3 — Show estimate to user and proceed

Display before/after:

Original:   1920×1080  30fps  ~245 MB
Estimated:  1280×720   30fps  ~37 MB  (↑85%)

Then run compress-video with the confirmed parameters.

Step 4 — After export, show actual result

Original:   1920×1080  30fps  ~245 MB
Compressed: 1280×720   30fps  ~37 MB  (↑85%)

1.8 Preview delivery

Applies when the user is communicating via a remote channel (Slack, Discord, Telegram, WhatsApp, etc.) — not webchat or TUI.

After any task completes:

  1. Always tell the user the full local output path first:

    📁 Saved locally: /path/to/output_file.ext

  2. Check if the output is a video or image and attempt to send a preview:

For images:

  • Run compress-image on the output with -q 0.7 targeting a small file
  • Send the compressed image as preview
  • Tell the user: 🖼 Preview (compressed). Saved locally: /path/to/file

For videos:

  • Check the platform file size limit (look it up at runtime if uncertain; common limits: Slack ~1 GB, Discord free ~10 MB, Telegram ~2 GB, WhatsApp ~200 MB)
  • Get the output file size:
    ls -l "/path/to/output.mp4" | awk '{print $5}'
    
  • If already within the platform limit → send directly, label it as preview
  • If over the limit → compress a preview copy using compress-video, trying resolutions in order: 480p → 360p → 240p, checking size after each step until it fits
  • Send the preview and tell the user:

    🎬 This is a preview — a compressed version for viewing here (480p, 8.2 MB). 📁 Saved locally: /path/to/original_output.mp4 💡 The draft is saved locally in VN — open VN Video Editor there to further adjust. (only for commands that support drafts)

  • If 240p is still over the platform limit:

    ❌ The preview is too large to send on this platform even at the lowest quality. 📁 Saved locally: /path/to/original_output.mp4

Preview files are temporary — do not move them to the input directory. Clean up after sending.


1.5 Intent mapping

User saysCommand
extract audio / get audio / audio only / rip audio / pull audioextract-audio
grab frame / screenshot / cover image / thumbnail / still / posterextract-frame
auto captions / auto subtitles / generate subtitles / transcribe / speech to textauto-captions
add subtitles / burn SRT / import SRT / burn captions / overlay subtitlesadd-caption
compress image / shrink image / reduce image size / optimize imagecompress-image
compress video / shrink video / export smaller / reduce file size / re-encodecompress-video
join clips / merge videos / concatenate / combine videos / stitch togetherconcat-video
denoise / remove noise / remove background noise / clean audio / noise reductiondenoise
remove background / cutout / portrait cutout / background removal / transparent backgroundcutout-image (image) or cutout-video (video, Apple Silicon only)
estimate video compression / how big will this be / preview compression sizecompress-video-estimate
cancel job / stop task / abortcancel

1.6 Command quick reference

CLI binary: ~/.openclaw/tools/vnapp-cli/vnapp-cli

CommandRequired argsKey options
extract-audio<video>-o <dir>
extract-frame<video>-p first|last|custom -t <sec> -f png|jpeg --max-size <px> -o <dir>
auto-captions<video>-e whisper_turbo|whisper_medium|whisper_base|whisper_tiny -l <lang> --font-family <name> --font-size <pt> --text-color <hex> --stroke-color <hex> --stroke-width <pt> --keep-draft
add-caption<video> --srt <srt>--font-family <name> --font-size <pt> --text-color <hex> --stroke-color <hex> --stroke-width <pt> --background-color <hex> --opacity <0-1> --keep-draft -o <dir>
compress-video<video>-r <resolution min:144> --fps <1-120> -b <kbps min:100> --hdr --keep-draft -o <dir>
compress-video-estimate<video>-r <resolution min:144> --fps <1-120> -b <kbps min:100> --hdr (local only, no VN connection needed)
compress-image<image>-f jpeg|png|webp|heic -q <0-1> -w <px> -h <px> --no-keep-aspect-ratio -o <dir>
concat-video<video1> <video2> [...]-s <transition> -d <0.2-5.0 sec> --keep-draft -o <dir>
denoise<audio|video>-l low|moderate|high|veryHigh|custom -v <int> (required for custom) --high-pass --audio-only -o <dir>
cutout-image<image>-q accurate|balanced|fast --export-mask -o <dir>
cutout-video<video>--feather <0-100> --expand <-20..20> --stroke-style solid|singleLayer|glowing|solidGlowing|singleLayerGlowing|shadow|offset --stroke-size <0-100> --stroke-distance <0-100> --stroke-color <RRGGBBAA> --stroke-opacity <0-1> --keep-draft -o <dir>
get-status<job-id>
cancel<job-id>

Always append --stream to all task commands (except compress-video-estimate — it is local-only, no --stream needed). For auto-captions, add-caption, compress-video, concat-video, cutout-video: always append --keep-draft unless the user explicitly asks to delete the draft. Do NOT add --keep-draft to extract-audio, extract-frame, compress-image, denoise, cutout-image, or compress-video-estimate.

Caption engines:

  • whisper_turbo (default) and whisper_medium — require a one-time download on first use; cached and reused afterwards
  • whisper_base and whisper_tiny — built-in, no download needed

2. Failure Handling

2A. VN not installed

  1. Inform the user:

    VN Video Editor is required but not installed on the Mac running OpenClaw. I've opened the App Store — please click Get to install. Once installed, you also need to enable the MCP Server in VN Settings before I can connect.

  2. Open App Store:

    open "macappstore://apps.apple.com/us/app/vn-video-editor/id1494451650"
    
  3. Poll every 5 seconds for up to 3 minutes:

    test -d /Applications/VN.app && echo "installed" || echo "not yet"
    
  4. On detection → inform the user:

    VN is installed. Before I can connect, please complete these steps on the Mac running OpenClaw:

    1. Open VN Video Editor
    2. Go to Settings → MCP Server and turn it on
    3. When macOS prompts for Local Network permission, tap Allow Let me know when done and I'll continue.
  5. After 3 minutes → stop polling and ask the user to confirm manually


2B. VN too old

  1. Inform the user:

    VN Video Editor is installed but the version is too old (minimum: 0.22 build 654). Please update from the App Store.

  2. Open App Store:

    open "macappstore://apps.apple.com/us/app/vn-video-editor/id1494451650"
    
  3. Wait for the user to confirm, then retry


2C. CLI missing or outdated

Auto-install silently — do not ask the user:

Step 1 — Download:

mkdir -p ~/.openclaw/tools/vnapp-cli
curl -L "https://github.com/cawcut/skill-vn/releases/download/0.1.0/vnapp-cli-darwin-universal-v0.1.0.5.zip" -o /tmp/vnapp-cli.zip

Step 2 — Verify SHA-256 before extracting:

shasum -a 256 /tmp/vnapp-cli.zip

Expected: 857417f671fd6f38ec30cf993f80c79efede7641f3532c2e9a99e75177a5853f

  • If checksum matches → proceed to Step 3
  • If checksum does not match → stop immediately, delete the file, and tell the user:

    ⚠️ SHA-256 checksum mismatch — the downloaded file may be corrupted or tampered with. Installation aborted. Please try again or contact support.

    rm /tmp/vnapp-cli.zip
    

Step 3 — Extract and install:

unzip -o /tmp/vnapp-cli.zip -d ~/.openclaw/tools/vnapp-cli/
chmod +x ~/.openclaw/tools/vnapp-cli/vnapp-cli
xattr -d com.apple.quarantine ~/.openclaw/tools/vnapp-cli/vnapp-cli 2>/dev/null

Verify with --version. If it still fails:

The vnapp-cli tool could not be installed automatically. Please reinstall this skill.


2D. Local Network permission denied

VN requires Local Network permission to communicate with vnapp-cli.

Tell the user:

Please check Local Network permission on the Mac running OpenClaw: go to System Settings → Privacy & Security → Local Network, make sure VN - Video Editor is enabled, then relaunch VN and try again.

If the user has never been prompted, ask them to open VN manually and perform any action — macOS will show the permission dialog.


2J. Cannot connect to VN (generic connection failure)

If the CLI cannot reach VN (service not found, connection refused, timeout, or similar), do not silently retry. Tell the user clearly that the connection failed, and walk them through the following checklist — reminding them to check on the Mac running OpenClaw (they may be talking to you from a different device):

I couldn't connect to VN Video Editor on the Mac running OpenClaw. Please check the following on the Mac running OpenClaw:

  1. Is VN Video Editor installed? Check that VN Video Editor exists in your Applications folder. It may have been deleted. If missing, reinstall from the App Store.

  2. Is the VN version supported? The minimum required version is VN 0.22. Open VN, go to About or check the App Store for updates.

  3. Is VN Video Editor open? Open VN Video Editor from your Applications folder if it's not already running.

  4. Is the MCP Server enabled in VN? In VN, go to Settings → MCP Server and make sure the toggle is on.

  5. Does VN have Local Network permission? Go to System Settings → Privacy & Security → Local Network and confirm VN - Video Editor is enabled.

Once all items are confirmed, let me know and I'll retry.


2E. Codec / decode error

If the error contains "cannot decode", "codec not supported", or similar decode failures:

VN could not decode this file. Please check that the file is not corrupted and is a supported format (MP4, MOV, common video formats including HEVC/H.265). Try re-exporting or re-downloading the file and try again.


2F. Whisper model download failed

Whisper engines:

  • whisper_turbo and whisper_medium — require a one-time download on first use; cached afterwards and do not need to be downloaded again.
  • whisper_base and whisper_tiny — built-in, no download needed.

Whisper model download progress events (normal flow):

When a model needs to be downloaded, the CLI emits these messages in sequence:

{"status":"processing","progress":0.01,"message":"Downloading whisper_medium model..."}
{"status":"processing","progress":0.01,"message":"Downloading whisper_medium model..."}
...
{"status":"processing","progress":0.05,"message":"Model downloaded successfully"}
{"status":"processing","progress":0.06,"message":"Creating temporary project..."}
{"status":"processing","progress":0.10,"message":"Starting speech recognition..."}

If Starting speech recognition... appears without any prior Downloading messages, the model is already cached locally. Do NOT treat this as a download issue.

Only switch engines when the CLI emits an explicit download failure error event. Do NOT switch because the task is taking a long time — whisper_medium and whisper_turbo are slower by design.

If the user explicitly specified an engine (e.g. whisper_medium): do NOT silently switch. Instead, inform the user and ask for their preference:

The whisper_medium model failed to download. Would you like to: Option 1 — Retry downloading whisper_medium Option 2 — Switch to the built-in whisper_base (faster, slightly less accurate) Option 3 — Cancel

Wait for the user's choice before proceeding.

If the user did NOT specify an engine (using the default whisper_turbo): automatically fall back to whisper_base and inform the user:

~/.openclaw/tools/vnapp-cli/vnapp-cli auto-captions "<path>" -e whisper_base --stream

The Whisper model failed to download. Switched to the built-in whisper_base engine — slightly less accurate but works without downloading anything.


2G. SRT file missing (add-caption)

If the user requests add-caption without providing an SRT file:

Please provide the path to your SRT subtitle file.

Validate the path exists before running the command.


2H. Invalid parameter values

The CLI enforces these constraints — validate before running:

ParameterConstraint
compress-video -rminimum 144
compress-video --fps1–120
compress-video -bminimum 100 kbps
concat-video -d0.2–5.0 seconds
denoise -l customrequires -v <int>
compress-image -q0.0–1.0

If a value is out of range, tell the user the valid range and ask for a corrected value before running.


2I. Job failed (generic)

Summarise the message field from the error event in plain language. Suggest a next step based on the error content. If unresolvable, refer the user to support (§ 3).


3. Response Format

When a task completes, always reply with:

  1. Result — one line confirming success or failure
  2. Output file path — always state the full local path:

    📁 Saved locally: /path/to/output_file.ext

  3. Key info — only if relevant (e.g. engine used for captions, resolution for compress, denoise level)
  4. VN tip — only append after a successful task that supports drafts (auto-captions, add-caption, compress-video, concat-video, cutout-video):

    💡 The draft is saved locally in VN — open VN Video Editor there to further adjust the result. Do NOT show this tip for extract-audio, extract-frame, compress-image, denoise, or cutout-image.

Do not repeat the full command or internal job details unless the user asks.

Example — auto-captions:

✅ Captions added successfully. 📁 Saved locally: /path/to/clip_captioned.mp4 Engine: whisper_base 💡 The draft is saved locally in VN — open VN Video Editor there to further adjust the result.

Example — compress video:

✅ Video compressed. Original: 1920×1080 30fps ~245 MB Compressed: 1280×720 30fps ~82 MB (↑66%) 📁 Saved locally: /path/to/clip_compressed.mp4 💡 The draft is saved locally in VN — open VN Video Editor there to further adjust.

Example — compress video (cannot reduce size):

❌ Unable to compress this video further — all configurations produced an estimated output larger than or equal to the original. The source may already be highly compressed. Try specifying a lower resolution or bitrate manually.

Example — cutout-image:

✅ Background removed successfully. 📁 Saved locally: /path/to/portrait_cutout.png Quality: accurate

Example — cutout-video:

✅ Background removed from video. 📁 Saved locally: /path/to/person_cutout.mp4 Format: MP4 (person composited on black background) 💡 The draft is saved locally in VN — open VN Video Editor there to further adjust the result.

Example — preview sent (remote channel):

🎬 This is a preview — a compressed version for viewing here (480p, 8.2 MB). 📁 Saved locally: /path/to/original_output.mp4 💡 The draft is saved locally in VN — open VN Video Editor there to further adjust.

Example — preview too large:

❌ The preview is too large to send on this platform even at the lowest quality. 📁 Saved locally: /path/to/original_output.mp4

Example — error:

❌ Failed: VN could not decode this file. Please check the file is not corrupted and try again.


4. Support

If the problem persists, contact VN support at vn.support+mac@ui.com.


5. Reference

For the full CLI surface, all options, transition style names, and JSON output schema:

references/cli-reference.md


6. Notes

  • macOS only — does not work on other platforms
  • All processing runs inside VN Video Editor; vnapp-cli is only the local bridge
  • Always use absolute file paths
  • Always use --stream for user-facing operations
  • Only report success when completed event contains a valid outputPath
  • jobId is available after started — use it for get-status or cancel if needed

Comments

Loading comments...