Install

openclaw skills install adscene-video-editor

Agentic AI video editor. One free-form `autonomous_edit` tool decomposes natural-language intents into a planned sequence of canonical edits; a deterministic allowlist (`scene_update`, `caption_compose`, `clean_audio`, `export_video`, etc.) bypasses planning for structured calls. Supports captions, viral kits, chroma key, motion tracking, B-roll, voiceover, music, multi-platform exports, and more.

A full agentic video editing surface: send a prompt, get a planned, verified, executed edit — or call a known canonical action directly with structured params. Both modes flow through the same brain + safety gates as the in-product editor.
POST {ADSCENE_API_URL}/api/v1/misc/openclaw/v1/execute
Auth: Authorization: Bearer {ADSCENE_API_KEY}
Accepts either single-shot JSON (default) or SSE (Accept: text/event-stream or ?stream=true).
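The examples below assume the endpoint and key are exported as environment variables (the values here are placeholders):

# Placeholder values; substitute your own deployment URL and API key
export ADSCENE_API_URL="https://api.example.com"
export ADSCENE_API_KEY="your-api-key"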
Request body:
{
"tool": "autonomous_edit" | "<allowlisted-tool>",
"params": { ... },
"project_id": "optional-project-id",
"scene": { /* optional client scene; server-side committed scene wins if newer */ }
}
autonomous_edit — free-form prompt (primary)

Pass a natural-language description in params.prompt. The brain classifies the intent, decomposes it into atomic steps, plans a DAG of canonical actions, executes through the safety gates, and verifies the result. Use this when the caller doesn't already know which low-level action(s) are needed.
curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
-H "Authorization: Bearer $ADSCENE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"tool": "autonomous_edit",
"params": {
"prompt": "Make this a TikTok-ready viral clip: vertical reframe, add bold captions, remove silences, and apply motion tracking to the speaker."
},
"project_id": "my-project"
}'
Behind a single autonomous_edit call the agent can compose any of:

- Read / inspect
- Structural editing
- Visual editing
- Captions & text
- Audio (… beat_times or bpm)
- Async generation
High-level kits (each is a single canonical action that orchestrates many underlying edits):

- APPLY_VIRAL_KIT — vertical reframe + captions + silence removal + motion tracking + emphasis
- APPLY_CINEMATIC_DIRECTOR — energy analysis + dynamic zooms + cinematic color grade + mood-based camera moves
- APPLY_EMPHASIS_SYSTEM — keyword detection + scaling / glow / pulse coordinated with captions
- OPTIMIZE_PACING — filler-word + silence + low-energy segment removal for retention

Export:

- EXPORT_VIDEO — render to MP4 (resolution / codec / quality tier)
- GENERATE_VIRAL_CLIPS — auto-segment short-form clips packaged as ZIP
- GENERATE_MULTI_PLATFORM — TikTok + Reels + Shorts + YouTube + Instagram aspect ratios in one pass

When the caller already knows the action and has structured params, use one of these tool names. The request is dispatched directly to the canonical action (skipping intent decomposition / planning entirely):
| Tool | Canonical action | Use when |
|---|---|---|
| read_scene | READ_SCENE | Inspect the current timeline |
| read_media | QUERY_ASSETS | List gallery assets |
| read_visual | READ_VISUAL | Run CV analysis on frames |
| query_transcript | QUERY_TRANSCRIPT | Search transcript by text or timestamp |
| scene_update | UPDATE_LAYER | Mutate a known layer's properties |
| scene_insert | CREATE_LAYER | Add a video / audio / text / image / shape layer |
| scene_timing | SCENE_TIMING | Trim, retime, reposition a layer |
| scene_mask | APPLY_MASK | Apply a chroma / luma / alpha / depth mask |
| chroma_key | APPLY_MASK | Convenience for green-screen / blue-screen keying |
| split_screen | SPLIT_SCREEN | Grid layout: top-bottom, left-right, PIP |
| caption_compose | GENERATE_CAPTIONS | Generate captions from transcript |
| media_treat | COLOR_GRADE | Apply color correction |
| scene_track | TRACK_MOTION | Face / object tracking with zoom-follow |
| clean_audio | CLEAN_AUDIO | Silence / breath / filler removal |
| audio_mix, audio_mixing | AUDIO_MIXING | Ducking, normalize, denoise, EQ |
| voiceover_add | GENERATE_VOICEOVER | Generate voiceover (text + voice_id) |
| music_generate | GENERATE_MUSIC | Generate background music |
| export_video | EXPORT_VIDEO | Render to MP4 |
Any unknown tool name returns 400 UNKNOWN_TOOL with a hint pointing at autonomous_edit.
Example — direct layer update, no planning round-trip:
curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
-H "Authorization: Bearer $ADSCENE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"tool": "scene_update",
"params": { "layer_id": "layer_123", "opacity": 0.5, "rotation": 15 }
}'
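Another direct call, this time queuing an export. The exact parameter names export_video accepts (resolution, codec, quality tier) aren't spelled out above, so treat the params below as illustrative placeholders:

curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "export_video",
    "params": { "resolution": "1080x1920", "quality": "high" },
    "project_id": "my-project"
  }'
# "resolution" and "quality" are assumed names; check the canonical EXPORT_VIDEO params.
# Exports return a jobId for polling rather than an immediate videoUrl (see job polling below).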
After any mutating tool call (i.e., not read_*/query_* and not export_video itself), if the scene was actually changed and the agent did not already queue an EXPORT_VIDEO, the route fires one automatically as a second run. The final response carries videoUrl (when ready) or jobId (for polling). Read-only and conversational autonomous_edit calls do NOT trigger auto-export.
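After a mutating call, a minimal way to pick up the artifact from the shell (assuming jq and a response saved to /tmp/result.json) is to prefer videoUrl and fall back to the jobId:

# Prefer the finished artifact; otherwise keep the job id for polling.
VIDEO_URL=$(jq -r '.videoUrl // empty' /tmp/result.json)
JOB_ID=$(jq -r '.jobId // empty' /tmp/result.json)
if [ -n "$VIDEO_URL" ]; then
  echo "Ready: $VIDEO_URL"
else
  echo "Still rendering, poll job: $JOB_ID"
fi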
Pass any of these inside params (or at the top level of the body) to drive advanced features:
- prompt — required for autonomous_edit; ignored for deterministic tools (structured params win)
- workingMemory — durable working-memory snapshot. Re-send to resume after awaiting_approval
- requirePlanApproval — true makes the agent stop after planning and emit awaiting_approval; resume with the same workingMemory + an approval prompt ("yes", "approve", "do it", …)
- attachedImages — array of base64 screenshots / reference images
- flaggedIssues — array of strings describing specific problems the user wants fixed
- captionTemplatePreset, captionTemplateMode — style preset routing for caption generation
- core_only (also accepted via ?core_only=true) — return a minimal scene shape (rendering-only, no debug fields)
- assets — additional asset descriptors to make available to the agent

Success response (HTTP 200):

{
"type": "success" | "partial_success",
"tool": "<tool-name>",
"success": true,
"status": "completed" | "failed" | "awaiting_approval",
"scene": { /* updated scene */ },
"reply": "Human-readable summary of what changed",
"videoUrl": "https://.../output.mp4",
"jobId": "task_...",
"viral_clips": [ /* clip metadata if generated */ ],
"zip_url": "https://.../clips.zip",
"activeTasks": [ /* queued background jobs */ ],
"pendingAsyncJobs": [ /* in-flight job status */ ],
"workflowStepsDetailed": [ /* every executed step */ ],
"workflowSummary": { "title": "...", "summary": "..." },
"verificationPassed": true,
"verificationIssues": [],
"committedToProjectScene": true,
"processingTime": 12.3,
"message": "Same as reply",
"workingMemory": { /* return this in the next call to resume approval-paused runs */ }
}
Failure response (HTTP 4xx/5xx):
{ "success": false, "error": "...", "code": "UNKNOWN_TOOL" | "MISSING_TOOL" | "EXECUTION_ERROR" }
Streaming mode (Accept: text/event-stream or ?stream=true)

The same event stream the in-product editor uses.
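To consume the stream from the shell, the same execute request works with the SSE Accept header; a minimal sketch (-N disables curl's output buffering):

curl -sS -N -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "autonomous_edit",
    "params": { "prompt": "Remove silences and add bold captions" },
    "project_id": "my-project"
  }'
# Streams heartbeat / status / thinking / tool_call / ... events, ending with success, partial_success, or error.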
Notable event types:

- heartbeat — every 15s, keeps the connection alive
- status — phase transitions (request_received, runtime_start, …)
- mode_select — { mode: "qa" | "action" }
- thinking, tool_call, tool_result — per-step reasoning visibility
- background_job_completed — async job finished (B-roll, viral clips, …)
- workflow_completed — main brain loop done, verification may continue
- success / partial_success — final terminal payload (same shape as JSON above)
- error — terminal failure

Generation actions (generate_*, EXPORT_VIDEO) return immediately with a jobId in activeTasks / pendingAsyncJobs. Poll status with:
GET {ADSCENE_API_URL}/api/v1/misc/openclaw/v1/jobs/{jobId}
Authorization: Bearer {ADSCENE_API_KEY}
Response:
{
"success": true,
"jobId": "task_xxx",
"status": "queued" | "processing" | "completed" | "failed",
"progress": 0.74,
"message": "Rendering frame 142 of 192",
"result": { /* artifact URLs / clip metadata on completion */ },
"error": null,
"createdAt": "...",
"updatedAt": "..."
}
To pull async-generated content into the timeline once jobs settle, the agent uses APPLY_PENDING internally — autonomous_edit callers don't need to manage this, but direct callers can issue an autonomous_edit prompt like "apply any pending generated content" to harvest.
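For direct callers, that harvest prompt is just another autonomous_edit call:

curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "autonomous_edit",
    "params": { "prompt": "apply any pending generated content" },
    "project_id": "my-project"
  }'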
If you pass requirePlanApproval: true, the agent stops after planning and the response carries status: "awaiting_approval" + a populated workingMemory. To proceed, call again with:
{
"tool": "autonomous_edit",
"params": {
"prompt": "yes",
"workingMemory": { /* the workingMemory from the previous response */ }
}
}
Accepted approval phrases: yes, y, approve, approved, go, proceed, go ahead, do it, confirm.
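A rough end-to-end sketch of the approval round-trip (assumes jq; the prompt text is illustrative, and workingMemory is passed back verbatim):

# 1) Plan only: the agent pauses with status "awaiting_approval"
curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
  -H "Authorization: Bearer $ADSCENE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "autonomous_edit",
    "params": {
      "prompt": "Re-grade the whole timeline with a warm cinematic look",
      "requirePlanApproval": true
    },
    "project_id": "my-project"
  }' > /tmp/plan.json

# 2) Approve: resume with the returned workingMemory plus an approval phrase
jq '{tool: "autonomous_edit", project_id: "my-project",
     params: {prompt: "approve", workingMemory: .workingMemory}}' /tmp/plan.json \
  | curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
      -H "Authorization: Bearer $ADSCENE_API_KEY" \
      -H "Content-Type: application/json" \
      -d @-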
Every run flows through three deterministic gates (ActionPermissionGate, ArchitectureControlPlane, EditorSafetyPolicy). Destructive actions (CLEAR, mass deletes) require explicit confirmation params. Verification runs after execution and may trigger up to 2 repair loops; failures surface in verificationPassed: false + verificationIssues[]. Concurrent identical requests for the same (user, project, prompt, scene fingerprint) are deduplicated server-side.
Rate-limited per API key. Processing times vary: read-only ~1–3s, structural edits ~3–10s, async generation 30s–5min per artifact, viral-clip / multi-platform exports several minutes.
# 1) Kick off the viral-clip pipeline (auto-export follow-up queues rendering)
curl -sS -X POST "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/execute" \
-H "Authorization: Bearer $ADSCENE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"tool": "autonomous_edit",
"params": {
"prompt": "Generate 5 viral clips, 15-30 seconds each, focused on the most engaging moments. Add bold captions, vertical reframe, remove silences."
},
"project_id": "my-project"
}' | tee /tmp/result.json | jq -r '.jobId // .activeTasks[0].intent.job_id'
# 2) Poll job status until done
JOB_ID=$(jq -r '.jobId // .activeTasks[0].intent.job_id' /tmp/result.json)
while true; do
STATUS=$(curl -sS "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/jobs/$JOB_ID" \
-H "Authorization: Bearer $ADSCENE_API_KEY" | jq -r '.status')
echo "Status: $STATUS"
[ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ] && break
sleep 5
done
# 3) Fetch the final artifact URL(s)
curl -sS "$ADSCENE_API_URL/api/v1/misc/openclaw/v1/jobs/$JOB_ID" \
-H "Authorization: Bearer $ADSCENE_API_KEY" | jq '.result'