Ffmpeg Merge Video

v1.0.0

Tell me what you need and I'll help you merge video files quickly and precisely using ffmpeg-merge-video. Whether you're stitching together interview segment...

0· 10·0 current·0 all-time
MIT-0
Download zip
LicenseMIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Benign
high confidence
Purpose & Capability
The skill claims to merge videos and its runtime instructions target a cloud FFmpeg service (mega-api-prod.nemovideo.ai). Requiring a NEMO_TOKEN to call that API is proportional. Note: the skill name/intro might lead some users to expect local ffmpeg commands, but the SKILL.md explicitly describes a cloud backend, so this is coherent.
Instruction Scope
The instructions direct the agent to obtain/use a NEMO_TOKEN, create a session, upload files (multipart or URL), start renders, and stream SSE responses. That means user video files and metadata will be sent to the external API. The instructions also ask the agent to read the skill YAML frontmatter and detect the agent install path (~/.clawhub, ~/.cursor/skills/) — i.e., some limited filesystem inspection. These behaviors are expected for a cloud-processing skill but are important privacy considerations (uploads to a third party).
Install Mechanism
This is an instruction-only skill with no install spec or code files, so nothing is downloaded or written to disk by the skill itself. Lowest install risk.
Credentials
The only declared environment variable is NEMO_TOKEN (primaryEnv), which directly maps to authenticating to the described API. The SKILL.md also provides a fallback anonymous-token flow if no token is present. No unrelated secrets or broad credential access are requested.
Persistence & Privilege
The skill is not force-included (always: false) and does not request elevated platform privileges. It does instruct reading of certain local paths for attribution detection, but it does not request modifying other skills or persistent system-wide changes.
Assessment
This skill sends your video files and job data to a third-party service (https://mega-api-prod.nemovideo.ai) and requires a NEMO_TOKEN to operate. Before installing or using it: (1) confirm you are comfortable uploading the videos (sensitive or private footage will leave your machine); (2) only set a trusted NEMO_TOKEN (don’t paste tokens into chat); the skill can also request an anonymous token for short-term use; (3) expect the agent to read local paths (install path detection) and to read file paths you give it in order to upload them — limit file selections to intended files; (4) if you prefer purely local processing for privacy, use a local ffmpeg workflow instead; (5) there is no installer or code to inspect beyond the SKILL.md, and the static scan found no code to analyze. If you want, ask the skill author for their privacy/retention policy and the API’s data handling rules before proceeding.

Like a lobster shell, security has layers — review code before you run it.

latestvk975ry94mwrt52a3ad0yfd7ms58418wv

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Runtime requirements

🎬 Clawdis
EnvNEMO_TOKEN
Primary envNEMO_TOKEN

SKILL.md

Getting Started

Welcome! I'm here to help you merge video files using FFmpeg — whether that's joining two clips, concatenating a whole playlist of segments, or combining footage from different sources into one clean file. Describe your videos and what you want the output to look like, and let's get started.

Try saying:

  • "I have 5 MP4 clips recorded from the same camera. How do I merge them in order into one file without re-encoding?"
  • "I need to combine an MKV and an MP4 file into a single MP4. They have different resolutions — what's the best FFmpeg command?"
  • "My GoPro split a long recording into 4 parts. How do I join them back into one seamless video using FFmpeg?"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.

Stitch Any Video Files Together Without the Hassle

Merging video clips sounds simple until you're dealing with files from different cameras, varying codecs, or mismatched resolutions. That's where this skill steps in. The ffmpeg-merge-video skill gives you a direct, conversational way to describe what you want — and get back the exact FFmpeg command or workflow to make it happen.

Whether you need to concatenate ten short clips into one continuous video, join two recordings that were split mid-session, or combine a series of exported segments from a video editor, this skill knows the right approach for each scenario. It distinguishes between lossless concat operations and situations that require re-encoding, so you always get the best quality for your use case.

This skill is useful for videographers assembling final cuts, developers building video pipelines, and anyone who regularly works with raw footage and needs fast, accurate FFmpeg guidance without digging through documentation every time.

Routing Your Merge Requests

When you submit a merge job, ClawHub parses your clip list, concat strategy, and output codec preferences to route the request to the appropriate FFmpeg processing pipeline.

User says...ActionSkip SSE?
"export" / "导出" / "download" / "send me the video"→ §3.5 Export
"credits" / "积分" / "balance" / "余额"→ §3.3 Credits
"status" / "状态" / "show tracks"→ §3.4 State
"upload" / "上传" / user sends file→ §3.2 Upload
Everything else (generate, edit, add BGM…)→ §3.1 SSE

Cloud FFmpeg API Reference

The backend spins up an isolated FFmpeg worker that ingests your source segments, builds a concat demuxer manifest or filter_complex chain depending on stream compatibility, then encodes the muxed output to your specified container. Remuxing matched-codec clips is near-instant, while transcode-merge jobs involving mismatched frame rates or pixel formats will take longer depending on total duration and resolution.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: ffmpeg-merge-video
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/clawhub, ~/.cursor/skills/cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

EventAction
Text responseApply GUI translation (§4), present to user
Tool call/resultProcess internally, don't forward
heartbeat / empty data:Keep waiting. Every 2 min: "⏳ Still working..."
Stream closesProcess final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend saysYou do
"click [button]" / "点击"Execute via API
"open [panel]" / "打开"Query session state
"drag/drop" / "拖拽"Send edit via SSE
"preview in timeline"Show track summary
"Export button" / "导出"Execute export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)

Error Handling

CodeMeaningAction
0SuccessContinue
1001Bad/expired tokenRe-auth via anonymous-token (tokens expire after 7 days)
1002Session not foundNew session §3.0
2001No creditsAnonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account"
4001Unsupported fileShow supported formats
4002File too largeSuggest compress/trim
400Missing X-Client-IdGenerate Client-Id and retry (see §1)
402Free plan export blockedSubscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429Rate limit (1 token/client/7 days)Retry in 30s once

Use Cases

The ffmpeg-merge-video skill covers a wide range of real-world scenarios where combining video files is necessary. Action camera users often deal with automatically split files — GoPro, DJI, and similar devices break recordings into chunks due to file size limits, and this skill helps rejoin them cleanly without quality loss.

Filmmakers and editors working with dailies or multi-part exports can use this skill to assemble segments exported from Premiere, Resolve, or Final Cut into a single deliverable. Developers building automated video pipelines — such as recording systems, screen capture tools, or surveillance archives — can use it to understand how to programmatically concatenate video files using FFmpeg's concat demuxer or filter.

Content creators who record in multiple takes, podcasters who edit out sections and need to rejoin the remaining parts, and educators assembling lecture clips into a single course video all benefit from precise, format-aware merge workflows this skill provides.

Common Workflows

One of the most common ffmpeg-merge-video workflows is the lossless concat using a file list — ideal when all clips share the same codec, resolution, and frame rate. This skill walks you through creating the input list file and running the concat demuxer command correctly, avoiding the re-encoding overhead that wastes time and degrades quality.

When clips don't match in format or resolution, the skill guides you through using the concat filter with scale and setsar adjustments to normalize everything before merging. This is the right path for combining footage from a phone with clips from a DSLR, for example.

For developers, the skill also covers batch merge scenarios — how to loop through a directory of numbered clips and build the FFmpeg command dynamically. Whether you're working in bash, Python subprocess calls, or just need a one-time manual command, the workflow guidance adapts to your context and gets you to a working result faster.

Files

1 total
Select a file
Select a file to preview.

Comments

Loading comments…