OpenClaw AI Video Editor

Agentic AI video editor — autonomous agent that plans and executes natural-language video edits end-to-end: viral clips, auto captions, vertical 9:16 reframe, chroma key, silence removal, motion tracking, B-roll, voiceover, music generation, and MP4 / multi-platform export through a single agentic tool.

Audits

Pass

Install

openclaw plugins install clawhub:openclaw-ai-video-editor

OpenClaw AI Video Editor

An autonomous agentic video editor. Send a sentence. Get a finished video. No timelines, no keyframes, no plugin chains.

ClawHub npm

Beta. The agent can make mistakes. Preview every output before publishing or sharing. For high-stakes or irreversible edits, pass requirePlanApproval: true so the agent stops after planning and waits for your approval before anything runs.


Beyond one-liners: goals and briefs

autonomous_edit accepts anything from a five-word command to a five-hundred-word creative brief. Three flavors:

1. Direct commands

Single edit, single result:

"Generate captions and remove silences"
"Reframe to 9:16 for TikTok"
"Color grade like a Netflix doc"

2. Open-ended goals (watch and propose)

Hand the agent a goal instead of a command and it inspects the asset, then proposes a plan:

"Watch this and tell me how to make it more engaging for TikTok"
"Look at the first 30 seconds and suggest 3 ways to hook the viewer"
"Review this footage and propose edits to tighten the pacing"

Pair these with requirePlanApproval: true to keep the agent in propose-only mode — it stops after planning, returns the full plan, and waits for your approval before executing anything.

3. Full creative briefs

Multi-step orchestrations executed end-to-end. The agent decomposes the brief into a DAG of canonical actions, plans the order, executes through safety gates, verifies each step, and exports every requested format. Use SSE to watch each step land in real time.

Transform this video into a viral social media documentary. Start by
trimming the first 3 seconds and last 5 seconds to remove dead air.
Apply a cinematic color grade with warm tones and high contrast.
Generate auto-captions with bold yellow text and black outline,
positioned at the bottom center. Add a dynamic zoom-in effect on the
main subject during the first 10 seconds. Insert 3 contextual b-roll
clips at moments where the speaker mentions visual concepts. Apply
face tracking to keep the subject centered when converting to vertical
9:16 format for TikTok and Instagram Reels. Add an upbeat background
music track that matches the energy, with auto-ducking when the
speaker talks. Remove all silence gaps longer than 0.5 seconds to
tighten pacing. Apply a subtle vignette effect around the edges. Add
my logo watermark in the top right corner with 80% opacity. Generate
a custom thumbnail with the most expressive frame and bold text
overlay saying MUST WATCH. Create an animated subscribe button that
appears at 5 seconds and pulses. Add sound effects for emphasis at
key moments — use whoosh sounds for transitions. Apply noise reduction
to clean up the audio. Split the video at 30 seconds to insert a
2-second transition effect. Add text overlays for key statistics
mentioned in the video with animated counter effects. Apply motion
tracking to blur any faces in the background for privacy. Export the
final video in 1080p for YouTube and also create optimized versions
for TikTok, Instagram Reels, and YouTube Shorts with platform-specific
aspect ratios and durations.

That single call composes ~20 canonical actions in one planned run. No timeline UI, no per-step API calls, no glue code on your side.


What you can do with it

You askIt does
"Turn this into 5 viral clips with captions and vertical reframe"Picks the best moments, cuts, captions, reframes, and exports
"Cut a 60-second highlight from this 2-hour podcast"Identifies narrative peaks, trims, packages with captions
"Make this TikTok-ready"Vertical 9:16 reframe + captions + silence removal + emphasis kit
"Export for TikTok, Reels, Shorts, YouTube, and Instagram"One pass, all aspect ratios in one bundle
"Replace the green screen with a beach, keep the speaker centered"Chroma key + background composite + motion tracking in one call
"Remove the background from this video — no green screen"AI background removal via alpha matte (works on any footage)
"Blur the background, keep me in focus"AI background blur (subject-aware)
"Remove the logo from this frame"Object removal (segmentation + inpainting)
"Remove all silences and filler words, add background music"Cleans the audio track, adds ducked music under speech
"Auto-zoom on whoever's talking"Active-speaker detection + dynamic zoom-follow framing
"Generate captions and highlight every time they say 'launch'"Auto-captions + keyword emphasis with scaling / glow / pulse
"Find every clip where Alex appears"Cross-asset face identity search
"Dub this video in Spanish"AI dubbing — translate transcript + synthesize voiceover in target language
"Translate the captions to French"Caption translation (30+ Latin-script languages, timing preserved)
"Stabilize this shaky handheld footage"Video stabilization
"Reverse this clip"Playback reversal
"Add a title card before the intro"Auto-generated title slate
"Bleep out the swearing"Profanity detection + auto-mute
"Crossfade between these two clips"Native transitions (fade, slide, wipe, iris, zoom, whip pan, cross dissolve)
"Build a slideshow from these photos with music"AI slideshow generation
"Replace the background with a beach — no green screen"Semantic background replacement
"Generate a thumbnail for this video"AI thumbnail extraction at the best frame
"Add narration in a cloned voice over the intro"Voice cloning + TTS overlay + auto-ducking
"Cut to the beat of this track"Provide BPM or beat times — cuts align automatically
"Generate B-roll over the product mention"AI B-roll generation + placement at the right timestamp
"Blur all faces except the speaker"Face detection + selective blur
"Add my logo as a watermark in the corner"Branding overlay from gallery or AI-generated
"Color grade this like a Netflix doc"Color grading with cinematic preset
"Slow-mo the climax, freeze on the reveal"Retime with ramp curves + freeze-frame
"Generate chapter titles from the on-screen text"OCR-driven chapter generation
"Reframe to vertical but don't crop off the lower-third captions"Caption-safe 9:16 reframe (preserves on-screen text)
"What text is on screen at 0:42?"On-screen text extraction (OCR)
"Sync my voiceover to the video and duck the music"Audio sync + auto-ducking

That's the headline. Below is the full surface.

Beta caveat: outputs sometimes need a second pass. Preview before publishing; for anything you can't undo, approve the plan first using the plan approval flow.


Made for

  • Creators & influencers — turn long videos into TikToks, Reels, Shorts; auto-captions; viral-clip generation; AI thumbnails
  • Podcasters — video-podcast highlights, audiograms, multi-cam podcast editing, silence and filler-word removal
  • Marketers & agencies — ad creation at scale, social-first repurposing, brand-overlay automation, multi-platform exports
  • Educators & course creators — tutorial editing, auto-chapters, OCR'd slide titles, auto-captions, word-level emphasis
  • YouTubers — long-form → Shorts pipeline, thumbnail generation, intro / outro automation, chapter markers
  • Sales & SaaS — product demos, walkthrough highlights, voiceover narration with cloned voices
  • Event teams — webinar / conference highlights, multi-cam stitching, speaker spotlight reels
  • Faith & community — sermon clips for social, message highlights, multi-platform packaging

Capability surface

Read & inspect

  • Inspect the timeline structure, layer properties, and scene state
  • Search the asset gallery by type, duration, name
  • Run computer-vision analysis on frames (object, face, scene)
  • Extract on-screen text — OCR of chyrons, lower-thirds, slide titles, captions burned into source
  • Search the transcript by keyword, semantic meaning, or timestamp window
  • Pull video intelligence — narrative peaks, speaker diarization, sentiment, pacing
  • Cross-asset face identity search — find every clip a specific person appears in
  • Poll async job status, introspect tool / schema metadata

Structural editing

  • Insert / update / replace / delete layers — video, audio, text, image, shape, group, adjustment, transition, lottie, waveform
  • Trim, split, retime — slow-motion (0.5×), fast-forward (2×), freeze-frame, ramp curves, reverse playback
  • Video stabilization — shake removal for handheld footage
  • Title cards — auto-generated title slates from the brief or transcript
  • Transitions — fade, slide, wipe, iris, zoom, blur fade, whip pan, zoom blur, cross dissolve
  • Markers — chapters, ad slots, censorship cues, highlight points
  • Reposition, sequence, snap to transcript word boundaries
  • Heal timeline gaps, normalize audio, reconcile durations (pre-export safety pass)
  • Multi-step undo / redo

Visual editing

  • Color grading — brightness, contrast, saturation, hue, lift / gamma / gain, RGB curves
  • Color preset library — cinematic, vintage, monochrome, teal / orange, sunset, moonlight, warm, cool, desaturated, high-contrast
  • Blend modes — 17 standard (multiply, screen, overlay, soft / hard light, color dodge / burn, difference, hue, saturation, luminosity, etc.)
  • Procedural VFX shaders — smoke, dust, steam, mist, fog, fire, explosion, lightning, lava, snow, shimmer, glitch, scanlines, CRT, VHS, grain, glassmorphism, bokeh, telomere / corrosion, portal
  • Video noise reduction — separate from audio denoise
  • Chroma key — green / blue screen with similarity, smoothness, spill suppression
  • AI background removal — alpha matte (works on any video, no green screen required)
  • Background replacement — swap with any image / video, no green screen required
  • AI background blur — subject-aware depth blur
  • Object removal — AI segmentation + temporally consistent video inpainting
  • Masking — luma, alpha, depth, chroma (with inverted variants)
  • Geometric clip shapes — rectangle, circle, ellipse, semi-circle / dome, triangle, pentagon, hexagon, octagon, star, line, polygon, custom path
  • Crop — absolute or edge-based; 3D rotation + perspective
  • Glow, shadow, inner shadow, gradient fills, text gradients
  • Vertical reframe (9:16) and vertical-reframe montage — caption-safe (avoids cropping through on-screen text and chyrons)
  • Split screen — top/bottom, left/right, picture-in-picture, grid
  • Branding overlays — logo / watermark from gallery or AI-generated
  • Motion / face tracking with dynamic zoom-follow framing

Captions & text

  • Auto-generate captions from the transcript
  • Translate captions to 30+ Latin-script languages (Spanish, French, German, Portuguese, Italian, Dutch, Polish, Turkish, etc.) — timing and word boundaries preserved; non-Latin scripts (CJK, Arabic, Devanagari) coming soon
  • Style captions with built-in templates or an AI director that picks / generates a custom template at runtime
  • Curved text paths — circle, wave, custom SVG
  • Per-word animations — typewriter, slide (up / down / left / right), fade, dissolve, scale, zoom, rotate, bounce, flip, swing, elastic, blur, glitch, wave, crumble, implode, scatter, pixelate (each with matching exit)
  • Lottie animation playback control

Audio

  • Clean audio — remove silences, breaths, filler words; word-level mute / cut
  • Profanity cleanup — auto-detect and mute / bleep flagged words
  • Auto-ducking on speech detection (sidechain music vs voice)
  • Mix / normalize / denoise / EQ (bass-boost, vocal-clarity, warm, bright presets)
  • Audio crossfade between clips
  • Sync external master audio to video (offset, mute camera audio)
  • Beat-synced cuts — provide beat_times or bpm, the editor aligns cuts
  • Add SFX
  • Generate music (mood / genre / BPM)
  • Generate voiceover (TTS or cloned voice)
  • AI dubbing — auto-read transcript, translate to 30+ Latin-script languages, synthesize voiceover; speaking rate auto-calibrated to match original video duration
  • Render waveform visualizers (bars, wave, circular)

Async generation

  • AI video / B-roll — duration + aspect ratio
  • AI images — single or batch at timestamps
  • AI music — prompt + duration + mood + genre + BPM
  • AI voiceover — TTS or cloned voice library
  • AI slideshow — image set + audio → animated slideshow
  • Auto-thumbnail extraction
  • Face blur — all faces or background-only
  • Image edit — generative instruction-based

High-level presets

Each preset bundles many underlying edits into a single call:

PresetWhat it does
Viral presetVertical reframe + captions + silence removal + motion tracking + word emphasis
Cinematic directorDynamic zooms + cinematic color grade + mood-based camera moves
Emphasis systemKeyword detection + coordinated scaling / glow / pulse with captions
Pacing optimizerFiller-word + silence + low-energy segment removal for retention

Export

CapabilityOutput
MP4 exportSingle MP4 (resolution / codec / quality tier)
Viral clip batchAuto-segmented short-form clips packaged as ZIP
Multi-platform packTikTok + Reels + Shorts + YouTube + Instagram aspect ratios in one pass

How you use it

Every edit goes through one tool: autonomous_edit. Pass a natural-language description of what you want and the agent plans, executes, verifies, and exports — no tool list to memorize, no structured params to learn.

curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "autonomous_edit",
    "params": {
      "prompt": "Make this a TikTok-ready viral clip: vertical reframe, add bold captions, remove silences, motion-track the speaker, and export."
    },
    "project_id": "my-project"
  }'

Anything in the capability surface above is reachable from a prompt. Mix and match: chroma key + background replace + caption + emphasis in a single sentence works.


Quick start

1. Get an API key

Sign up at https://studio.livecor.ai/ and generate an OpenClaw API key from the account UI.

2. Install the plugin

openclaw plugins install clawhub:openclaw-ai-video-editor

Or via npm:

npm install openclaw-ai-video-editor

3. Configure

export ADSCENE_API_URL="https://api.livecore.ai"
export ADSCENE_API_KEY="your-openclaw-api-key"

Or in OpenClaw skill config:

{
  "skills": {
    "entries": {
      "openclaw_ai_video_editor": {
        "enabled": true,
        "env": {
          "ADSCENE_API_URL": "https://api.livecore.ai",
          "ADSCENE_API_KEY": "your-openclaw-api-key"
        }
      }
    }
  }
}

4. Run an autonomous edit

curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "autonomous_edit",
    "params": {
      "prompt": "Generate 5 viral clips, 15-30 seconds each, focused on the most engaging moments. Add bold captions, vertical reframe, remove silences."
    },
    "project_id": "my-project"
  }'

API surface

Base URL: {ADSCENE_API_URL} (production: https://api.livecore.ai)

EndpointPurpose
POST /api/v1/misc/openclaw/v1/executeRun autonomous_edit or a deterministic tool
GET /api/v1/misc/openclaw/v1/jobs/{jobId}Poll async render or generation jobs
GET /api/v1/misc/openclaw/v1/toolsList available tool names
GET /api/v1/misc/openclaw/v1/healthHealth check

Auth header:

Authorization: Bearer {ADSCENE_API_KEY}

Do not set ADSCENE_API_URL to https://studio.livecor.ai or the in-product editor route /api/v1/misc/editor/. Studio is the user-facing app; OpenClaw requests go to the API-key route on https://api.livecore.ai.

Request body

{
  "tool": "autonomous_edit",
  "params": { "prompt": "<natural-language edit>" },
  "project_id": "optional",
  "scene": { }
}

JSON response

{
  "type": "success" | "partial_success",
  "tool": "<tool-name>",
  "success": true,
  "status": "completed" | "failed" | "awaiting_approval",
  "scene": { },
  "reply": "Human-readable summary of what changed",
  "videoUrl": "https://.../output.mp4",
  "jobId": "task_...",
  "viral_clips": [ ],
  "zip_url": "https://.../clips.zip",
  "activeTasks": [ ],
  "pendingAsyncJobs": [ ],
  "verificationPassed": true,
  "verificationIssues": [],
  "processingTime": 12.3,
  "workingMemory": { }
}

SSE streaming

Set Accept: text/event-stream or ?stream=true. Notable event types:

EventMeaning
heartbeat15s keepalive
statusPhase transitions
thinking, tool_call, tool_resultPer-step reasoning visibility
background_job_completedAsync job done
success / partial_successTerminal payload
errorTerminal failure

Async job lifecycle

Generation actions return immediately with a jobId. Poll:

curl -sS "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/jobs/$JOB_ID" \
  -H "Authorization: Bearer $ADSCENE_API_KEY"

Auto-export

After any mutating tool call, if the scene actually changed and an export isn't already queued, the route auto-fires one as a second run. Read-only and conversational autonomous_edit calls do not trigger auto-export.

Plan approval flow

Pass requirePlanApproval: true to make the agent stop after planning. It returns status: "awaiting_approval" + populated workingMemory. Resume by sending the same workingMemory with an approval prompt (yes, approve, go, do it, confirm).


How this differs from legacy editors

Legacy editors (Premiere, DaVinci, CapCut, Descript)OpenClaw AI Video Editor
InterfaceDrag, drop, keyframe by handOne sentence; the agent does the rest
Auto-analysis on uploadManual scene detection, manual subtitlesFaces, speakers, shots, on-screen text, mattes all detected automatically
"Make this viral"You build the workflowSingle preset — vertical + captions + silences + tracking, done
Cross-asset searchFilename search"Find every clip where Alex appears" — works across the whole library
Background replaceKey out manually, find background, compositeOne call
Beat-synced cutsMark beats by handProvide beat_times or bpm; cuts align automatically
Editorial reasoningYou decide what's the climaxAgent surfaces narrative peaks and reaction moments
VerificationYou eyeball the resultVerifier runs after execution; auto-repairs on failure
Multi-platform exportOne render per aspect ratioTikTok + Reels + Shorts + YouTube + Instagram in one pass
ExtensibilityPlugins call external binariesTyped tools called by name

This isn't an editor with AI features bolted on. It's an autonomous agent that finishes the edit for you.


Safety & limits

  • Currently in beta — outputs can be wrong. Preview every result before sharing or publishing. For irreversible workflows, pass requirePlanApproval: true to inspect and approve the plan before any edit runs.
  • API-key authentication on every request
  • Destructive actions (mass deletes, clears) require explicit confirmation params
  • Output is verified after execution; auto-repairs on failure
  • Concurrent identical requests are deduplicated server-side
  • Rate-limited per API key. Read-only ~1–3s, structural edits ~3–10s, async generation 30s–5min per artifact, viral-clip / multi-platform exports several minutes

Supported media

Formats
Video inMP4, MOV, WebM (HTTP/HTTPS URLs, YouTube URLs, gallery IDs)
Image inJPG, PNG, WebP
Audio inMP3, WAV, M4A, AAC (or extracted from video)
OutputMP4 (export), ZIP (viral clips / multi-platform bundles)
Max video lengthUp to 3 hours per asset (plan-dependent). Synchronous edits and async generation both supported.
Recommended resolution1080p or 4K; canvas is configurable per project

End-to-end example: viral-clip generation

# 1) Kick off the viral-clip pipeline
curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "autonomous_edit",
    "params": {
      "prompt": "Generate 5 viral clips, 15-30 seconds each, focused on the most engaging moments. Add bold captions, vertical reframe, remove silences."
    },
    "project_id": "my-project"
  }' | tee /tmp/result.json | jq -r '.jobId // .activeTasks[0].intent.job_id'

# 2) Poll job status until done
JOB_ID=$(jq -r '.jobId // .activeTasks[0].intent.job_id' /tmp/result.json)
while true; do
  STATUS=$(curl -sS "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/jobs/$JOB_ID" \
    -H "Authorization: Bearer $ADSCENE_API_KEY" | jq -r '.status')
  echo "Status: $STATUS"
  [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ] && break
  sleep 5
done

# 3) Fetch the final artifact URLs
curl -sS "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/jobs/$JOB_ID" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" | jq '.result'

Links

Support

Setup help, integration questions, or issue reports — brajendrak00068@gmail.com

License

MIT