Music To A

v1.0.0

Turn video clips into music-synced videos with this skill. Works with MP4, MOV, AVI, and WebM files up to 500MB. Content creators use it for adding background mus...

by peandrover adam@peand-rover

Security Scan

VirusTotal

Benign

OpenClaw

Suspicious

medium confidence

ℹ

Purpose & Capability

The skill's stated purpose (adding/syncing music to videos) matches the API endpoints, required token (NEMO_TOKEN), and described workflows in SKILL.md. However, SKILL.md metadata includes a config path (~/.config/nemovideo/) while the registry metadata lists no required config paths — this mismatch should be clarified.

Instruction Scope

Instructions ask the agent to auto-connect to the remote backend the first time the skill is opened and to generate/store an anonymous token if NEMO_TOKEN is not present — this triggers outbound network activity without an explicit user action. The doc also requires auto-detecting X-Skill-Platform from the agent's install path (reading system/install paths), which is outside the core video-processing necessity and could expose system path information.

✓

Install Mechanism

There is no install spec and no code files; this is instruction-only, so nothing is pulled from external URLs or written to disk by an installer. This minimizes supply-chain risk.

ℹ

Credentials

Only one credential is required (NEMO_TOKEN), which is appropriate for a cloud-rendering API. The SKILL.md metadata references a config path (~/.config/nemovideo/) not declared elsewhere — if the agent will read or write that directory, it should be declared and justified. The skill also instructs not to display raw token values, implying the token will be handled/stored; clarity on where/how tokens/session IDs are persisted is needed.

✓

Persistence & Privilege

The skill does not request always:true or other elevated privileges. Autonomous invocation is allowed (platform default). The only persistence implied is storing a session_id and possibly the anonymous token; the documentation does not say it modifies other skills or global agent settings.

What to consider before installing

This skill generally behaves like a cloud video-rendering integration (it uses NEMO_TOKEN to call nemo-video endpoints and upload clips). Before installing:

1) Verify the skill's source (no homepage or known owner is listed).
2) Ask the publisher to explain the metadata mismatch (the SKILL.md lists ~/.config/nemovideo/ but the registry metadata does not).
3) Confirm privacy and retention: uploaded videos will be sent to mega-api-prod.nemovideo.ai, so make sure you are comfortable with that.
4) Clarify what the skill does with the token and session_id (where they are stored and for how long).
5) Consider whether you want the skill to auto-connect on first use and generate anonymous tokens automatically; if not, require explicit user consent before any network calls.
6) If you need stronger assurances, request a homepage, a published privacy policy/terms, or host the skill from a known publisher.

If you proceed, avoid uploading sensitive footage or credentials that you wouldn't want sent to an external service.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎵 Clawdis

Env: NEMO_TOKEN

Primary env: NEMO_TOKEN

latest vk9756s470erc5x9t3c7e0yrq3s85292h

36 downloads

0 stars

1 version

Updated 1d ago

MIT-0

Getting Started

Ready when you are. Drop your video clips here or describe what you want to make.

Try saying:

"turn a 60-second travel montage clip into a 1080p MP4"
"add background music to a travel video and sync it to the cuts"
"automatically add background music to a video for content creators"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check whether NEMO_TOKEN is set in the environment. If it is, skip to step 2.

1. Obtain a free token: Generate a random UUID as the client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response field data.token is your NEMO_TOKEN (100 free credits, valid for 7 days).
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
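The two setup calls can be sketched with Python's standard library. This is a minimal sketch, not the skill's own implementation: the helper names (`needs_anonymous_token`, `build_token_request`, `build_session_request`) are hypothetical, the default language `"en"` is a placeholder for the detected language, and the requests are only constructed here, not sent. The attribution headers the skill requires on every request are left out for brevity.

```python
import json
import os
import uuid
import urllib.request

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def needs_anonymous_token(env=os.environ):
    """True when NEMO_TOKEN is missing or empty, so step 1 has to run."""
    return not env.get("NEMO_TOKEN")


def build_token_request(client_id=None):
    """Build the anonymous-token POST; X-Client-Id defaults to a random UUID."""
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        BASE_URL + "/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token, language="en"):
    """Build the create-session POST with the documented JSON body."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        BASE_URL + "/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )
```

Passing either request to `urllib.request.urlopen` would perform the call; the token would then be read from `data.token` in the parsed JSON response, and the session_id from the create-session response.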

Music to a Video — Add Music and Export Video

Send me your video clips and describe the result you want. The AI music sync runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload a 60-second travel montage clip, type "add background music to a travel video and sync it to the cuts", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.

Worth noting: shorter clips sync music more accurately to on-screen moments.

Matching Input to Actions

User prompts that mention adding music to a video, aspect ratio, text overlays, or audio tracks are routed to the corresponding action by keyword and intent classification.

| User says... | Action | Skip SSE? |
| --- | --- | --- |
| "export" / "导出" / "download" / "send me the video" | Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | Credits | ✅ |
| "status" / "状态" / "show tracks" | State | ✅ |
| "upload" / "上传" / user sends file | Upload | ✅ |
| Everything else (generate, edit, add BGM…) | Chat (SSE) | ❌ |
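The keyword half of this routing can be sketched as a small lookup. This is an illustrative sketch only: the `route` function and the action names are hypothetical, matching is a naive case-insensitive substring check, and the intent-classification half is not modeled.

```python
# Keyword sets copied from the routing table above. Substring matching
# is an assumption of this sketch, not the skill's documented behavior.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_attachment=False):
    """Return (action, uses_sse) for a user message."""
    if has_attachment:
        # A file sent by the user always goes to the upload endpoint.
        return ("upload", False)
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, False)
    # Everything else (generate, edit, add BGM...) goes through SSE chat.
    return ("sse", True)
```

For example, `route("Please export it")` yields `("export", False)`, while `route("add lo-fi BGM")` falls through to `("sse", True)`.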

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites the video layers, applies platform-specific compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

Session — POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
Chat (SSE) — POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
Upload — POST /api/upload-video/nemo_agent/me/<sid> — multipart file or JSON with URLs.
Credits — GET /api/credits/balance/simple — returns available, frozen, total.
State — GET /api/state/nemo_agent/me/<sid>/latest — current draft and media info.
Export — POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
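As a rough illustration of the Chat and Export entries, the sketch below builds the SSE request body and polls a render until it completes. Several details are assumptions rather than documented behavior: the top-level placement of `session_id` in the body, the `status` and `url` field names in the poll response, and the cap of 30 polls. The caller-supplied `fetch_status` function stands in for the actual GET request.

```python
import time


def build_sse_payload(session_id, text):
    """Body for POST /run_sse; the message goes in new_message.parts[0].text."""
    return {"session_id": session_id, "new_message": {"parts": [{"text": text}]}}


def poll_render(fetch_status, render_id, interval_s=30, max_polls=30, sleep=time.sleep):
    """Poll until the render reports completion, then return its download URL.

    fetch_status(render_id) is a caller-supplied function that performs
    GET /api/render/proxy/lambda/<id> and returns the parsed JSON.
    The "status" and "url" keys are assumptions of this sketch.
    """
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result.get("url")
        if attempt < max_polls - 1:
            sleep(interval_s)  # the document specifies polling every 30s
    raise TimeoutError(f"render {render_id} not completed after {max_polls} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real waiting.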

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Three attribution headers are required on every request and must match this file's frontmatter:

| Header | Value |
| --- | --- |
| `X-Skill-Source` | `music-to-a` |
| `X-Skill-Version` | frontmatter `version` |
| `X-Skill-Platform` | auto-detect: `clawhub` / `cursor` / `unknown` from install path |

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request — omitting them triggers a 402 on export.
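One way to assemble these headers is sketched below. The function names are hypothetical, and the platform detection (a case-insensitive substring check on the install path) is an assumption, since the document does not say how the path is inspected.

```python
def detect_platform(install_path):
    """Guess X-Skill-Platform from the install path (substring check is assumed)."""
    lowered = install_path.lower()
    for name in ("clawhub", "cursor"):
        if name in lowered:
            return name
    return "unknown"


def build_headers(token, version, install_path):
    """Authorization plus the three attribution headers sent on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "music-to-a",
        "X-Skill-Version": version,  # should match the frontmatter version
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the dictionary in one place makes it harder to forget a header on an individual call.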

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
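A text summary like this could be derived from the draft JSON using the field mapping. The sketch below assumes a nesting the document does not spell out: a top-level `t` list of tracks, each with a `tt` type code and an `sg` list of segments carrying `d` in milliseconds. It reports only the track type, segment count, and total duration, because the layout of `m` and of segment start times is not documented.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return one text line per track: type, segment count, total seconds."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # d is the segment duration in milliseconds
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s")
    return lines
```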

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"Export" or "导出" → run the export workflow

SSE Event Handling

| Event | Action |
| --- | --- |
| Text response | Apply GUI translation (see Translating GUI Instructions), present to user |
| Tool call/result | Process internally, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
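The heartbeat handling and the silent-edit fallback might look roughly like the sketch below. It works on raw SSE lines and treats every non-heartbeat `data:` payload as text. That is a simplification: telling text responses apart from tool calls requires the backend's event schema, which this document does not describe. Both function names are hypothetical.

```python
def classify_sse_line(line):
    """Classify one raw SSE line as ('heartbeat' | 'data' | 'other', payload)."""
    line = line.rstrip("\r\n")
    if line.startswith("event:"):
        name = line[len("event:"):].strip()
        return ("heartbeat", None) if name == "heartbeat" else ("other", None)
    if line.startswith("data:"):
        payload = line[len("data:"):].strip()
        if payload in ("", "heartbeat"):
            return ("heartbeat", None)
        return ("data", payload)
    return ("other", None)


def consume_stream(lines):
    """Collect data payloads and flag whether a state poll is needed afterwards."""
    payloads = []
    for line in lines:
        kind, payload = classify_sse_line(line)
        if kind == "data":
            payloads.append(payload)
    # Nothing received: the edit may still have been applied, so verify via state.
    return {"payloads": payloads, "needs_state_poll": not payloads}
```

When `needs_state_poll` is true, the caller would query the State endpoint and summarize the changes from the returned draft.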

Error Handling

| Code | Meaning | Action |
| --- | --- | --- |
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-authenticate via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | Create a new session (see First-Time Connection) |
| 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from the create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compressing or trimming |
| 400 | Missing X-Client-Id | Generate a Client-Id and retry (see First-Time Connection) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry once after 30s |
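The table translates naturally into a dispatch map. In this sketch the action names are hypothetical labels for the behaviors listed above, and only the 429 case is treated as an automatic retry, limited to one attempt.

```python
# Action labels are illustrative names for the behaviors in the error table.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauth_anonymous_token",
    1002: "create_new_session",
    2001: "show_credit_options",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code):
    """Look up the handling for a response code; unknown codes are reported."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def should_retry_rate_limit(code, retries_so_far):
    """A 429 is retried a single time (after 30s); further 429s are surfaced."""
    return code == 429 and retries_so_far == 0
```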

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "add background music to a travel video and sync it to the cuts" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.
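A client-side check before uploading could catch both issues early. The sketch below is a hypothetical helper, not part of the skill: it reads "500MB" as 500 × 1024 × 1024 bytes, which is an assumption about the unit, and it only warns about other extensions because the backend's format list is broader than the four recommended here.

```python
import os

MAX_BYTES = 500 * 1024 * 1024  # "500MB"; the exact byte count is assumed
RECOMMENDED_EXTENSIONS = {".mp4", ".mov", ".avi", ".webm"}


def check_upload(filename, size_bytes):
    """Return a list of warnings; an empty list means the file looks fine."""
    warnings = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in RECOMMENDED_EXTENSIONS:
        warnings.append(f"{extension or 'no extension'} is not a recommended format")
    if size_bytes > MAX_BYTES:
        warnings.append("file exceeds 500MB; compress or trim it first")
    return warnings
```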

Export as MP4 for widest compatibility.

Common Workflows

Quick edit: Upload → "add background music to a travel video and sync it to the cuts" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
