Yt Music

v1.0.0

Get music video MP4 ready to post, without touching a single slider. Upload your video or audio files (MP3, MP4, WAV, JPG, up to 500MB), say something like "...

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for linmillsd7/yt-music.

Install the skill "Yt Music" (linmillsd7/yt-music) from ClawHub.
Skill page: https://clawhub.ai/linmillsd7/yt-music
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: NEMO_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Canonical install target

openclaw skills install linmillsd7/yt-music

ClawHub CLI

npx clawhub@latest install yt-music

Security Scan

VirusTotal: Benign
OpenClaw: Benign (medium confidence)

Purpose & Capability
The skill claims to create music videos via a cloud render API and only requests a service token (NEMO_TOKEN) and standard API interactions. Requesting a token and session IDs is appropriate for this purpose.
Instruction Scope
Runtime instructions direct the agent to obtain/refresh an anonymous token, create a session, upload files, stream SSE responses, poll render status, and return download URLs — all expected. The skill also instructs the agent to detect install path (to set X-Skill-Platform) and to read its own YAML frontmatter for attribution headers; these are reasonable but imply the agent will inspect certain local paths. The instructions avoid printing tokens/raw JSON, which is good. Reviewers should note that uploads (up to 500MB) and session tokens are sent to an external service, so user data and media will leave the machine.
Install Mechanism
There is no install spec and no code files; the skill is instruction-only, so nothing is written to disk by an installer. This is the lowest-risk install profile.
Credentials
Only a single credential (NEMO_TOKEN) is declared as required, which matches the described API usage. The SKILL.md also describes obtaining an anonymous token via the API if no token is present; it's ambiguous whether that token should be persisted to disk or just kept in-memory. The frontmatter also mentions a config path (~/.config/nemovideo/) which could imply local credential/config storage — the registry metadata earlier claimed no required config paths, so this is an inconsistency to clarify.
Persistence & Privilege
The skill is user-invocable and not always-enabled. It will create and hold a session_id and use the NEMO_TOKEN for API calls; this is normal for cloud service integrations. It does not request elevated or system-wide privileges.
Assessment
This skill appears to do what it says: it uploads media to a cloud rendering API and returns a rendered MP4. Before installing or using it:
(1) Confirm you trust the service domain (mega-api-prod.nemovideo.ai), because your uploaded media and any tokens will be sent there.
(2) Use a throwaway or short-lived token for testing rather than a long-lived secret.
(3) Note a minor inconsistency: the skill's YAML mentions a config path (~/.config/nemovideo/) but the registry metadata does not. Clarify whether tokens or files are stored locally.
(4) Understand that large media files (up to 500MB) will leave your machine; check the privacy and licensing implications.
(5) If you need higher assurance, ask the publisher for the source or a code-backed skill (not instruction-only), and verify how and where tokens and session IDs are stored and revoked.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎵 Clawdis
Env: NEMO_TOKEN
Primary env: NEMO_TOKEN
latest: vk971szsrzn6xn9txz08tfy3v2x856g3y
162 downloads
0 stars
1 version
Updated 6d ago
v1.0.0
MIT-0

Getting Started

Got video or audio files to work with? Send them over and tell me what you need, and I'll take care of the AI music video creation.

Try saying:

  • "turn a 3-minute audio track and an album art image into a 1080p MP4 for YouTube"
  • "create a looping music video with visualizer and track info overlay"
  • "create YouTube music videos from audio tracks for musicians and content creators"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If the NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as a client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
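
The setup steps above can be sketched roughly as follows. This is a minimal standard-library sketch, not the skill's actual implementation: the helper names (`resolve_token`, `build_token_request`, `extract_token`, `build_session_request`) are hypothetical, the `Content-Type` header is an assumption, and the requests are only constructed here, not sent.

```python
import json
import os
import urllib.request
import uuid
from typing import Mapping, Optional

API_BASE = "https://mega-api-prod.nemovideo.ai"


def resolve_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return NEMO_TOKEN if it is set; None means the free-token flow is needed."""
    return env.get("NEMO_TOKEN") or None


def build_token_request(client_id: str) -> urllib.request.Request:
    """Build the anonymous-token POST; X-Client-Id carries the generated UUID."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def extract_token(response_body: dict) -> str:
    """Read data.token from the decoded JSON response."""
    return response_body["data"]["token"]


def build_session_request(token: str) -> urllib.request.Request:
    """Build the session-creation POST with Bearer auth."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption: JSON body
        },
    )


# Hypothetical usage: a fresh UUID serves as the client identifier.
token_request = build_token_request(str(uuid.uuid4()))
```

Passing either request to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes it easier to avoid printing tokens or raw JSON.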

YT Music — Create Music Videos for YouTube

This tool takes your video or audio files and runs AI music video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 3-minute YouTube audio track with album art image and want to create a looping music video with visualizer and track info overlay — the backend processes it in about 30-60 seconds and hands you a 1080p MP4.

Tip: pairing audio with a static image or loop still meets YouTube upload requirements.

Matching Input to Actions

User prompts referencing yt music, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says... | Action | Skip SSE?
"export" / "导出" / "download" / "send me the video" | §3.5 Export | Yes
"credits" / "积分" / "balance" / "余额" | §3.3 Credits | Yes
"status" / "状态" / "show tracks" | §3.4 State | Yes
"upload" / "上传" / user sends file | §3.2 Upload | Yes
Everything else (generate, edit, add BGM…) | §3.1 SSE | No
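
As an illustration of the routing above, here is a minimal keyword-only sketch. The text mentions intent classification as well, which this sketch does not attempt; the function name `route` and the action labels are hypothetical.

```python
# Keyword groups copied from the routing table above.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_attachment: bool = False) -> str:
    """Return a hypothetical action label; anything unmatched goes to SSE."""
    if has_attachment:
        # "user sends file" routes to upload regardless of the text.
        return "upload"
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"


print(route("Please export my video"))  # export
print(route("add BGM to the intro"))    # sse
```

Plain substring matching can misfire (for example, "roll the credits at the end" would route to the balance check), which is presumably why the original pairs keywords with intent classification.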

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: yt-music
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, otherwise unknown)
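
The header set can be sketched as below. This assumes the install-path hint means that a path under `~/.clawhub/` maps to `clawhub` and a path under `~/.cursor/skills/` maps to `cursor`; the function names are hypothetical, and the version is passed in rather than parsed from the frontmatter.

```python
def detect_platform(install_path: str) -> str:
    """Guess X-Skill-Platform from where the skill is installed (assumed mapping)."""
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def attribution_headers(token: str, version: str, install_path: str) -> dict:
    """Build the four headers every request must include."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "yt-music",
        "X-Skill-Version": version,  # read from the frontmatter by the caller
        "X-Skill-Platform": detect_platform(install_path),
    }


headers = attribution_headers("example-token", "1.0.0", "~/.clawhub/skills/yt-music")
print(headers["X-Skill-Platform"])  # clawhub
```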

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
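
A small sketch of the message body described above. The helper name `build_sse_message` and the constants are hypothetical; the `Content-Type` header is an assumption, and only the payload shape comes from the text.

```python
import json

SSE_HEADERS = {
    "Accept": "text/event-stream",
    "Content-Type": "application/json",  # assumption: JSON request body
}
SSE_TIMEOUT_SECONDS = 15 * 60  # "Max timeout: 15 minutes"


def build_sse_message(session_id: str, text: str) -> bytes:
    """Encode the /run_sse request body for one user message."""
    payload = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }
    # ensure_ascii=False keeps non-ASCII prompts readable in the encoded body.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


body = build_sse_message("sid-123", "add a visualizer and track info overlay")
```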

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
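
The export-and-poll loop might look like the sketch below. Several details are assumptions: `<ts>` is taken to be a Unix timestamp in seconds, `status` and `output.url` are assumed to sit at the top level of the poll response, and the `max_polls` cap is hypothetical. Failure statuses are not documented, so none are handled. The HTTP call is injected as `fetch_status` so the loop itself stays testable.

```python
import time
from typing import Callable, Optional


def build_export_body(session_id: str, draft: dict, now: Optional[float] = None) -> dict:
    """Build the render request body; the id format render_<ts> is from the text."""
    ts = int(now if now is not None else time.time())  # assumption: Unix seconds
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_until_complete(
    fetch_status: Callable[[], dict],
    interval: float = 30.0,
    max_polls: int = 30,  # hypothetical safety cap
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status every `interval` seconds until status is 'completed'."""
    for attempt in range(max_polls):
        job = fetch_status()
        if job.get("status") == "completed":
            return job["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"render not completed after {max_polls} polls")
```

In real use, `fetch_status` would wrap a GET to /api/render/proxy/lambda/<id> and return the decoded JSON.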

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
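
A client-side pre-check against the supported formats could be sketched like this. It is an optional convenience, not part of the documented API: the function name `check_upload` is hypothetical, and reading the 500MB limit as 500 × 1024 × 1024 bytes is an assumption.

```python
import os
from typing import Optional

SUPPORTED_EXTENSIONS = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}
MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # assumption: "500MB" read as 500 MiB


def check_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return None if the file looks acceptable, else a short reason."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return f"unsupported file type: .{extension or '?'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "file too large: compress or trim it below 500MB"
    return None


print(check_upload("track.flac", 1024))  # unsupported file type: .flac
```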

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — export blocked on the free plan; this is a subscription-tier restriction, not a credit shortage
  • 429 — rate limited; wait 30s and retry once
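
One way to act on these codes is a lookup table plus a single retry for rate limiting, sketched below. The action labels and function names are hypothetical; only the codes and the "wait 30s and retry once" rule come from the list above.

```python
import time
from typing import Callable

ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_and_retry_once",
}


def action_for(code: int) -> str:
    """Map a response code to a hypothetical action label."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(
    send: Callable[[], int],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run send(); on 429, wait 30 seconds and try exactly once more."""
    code = send()
    if code == 429:
        sleep(30)
        code = send()
    return code
```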

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says | You do
"click [button]" / "点击" | Execute via API
"open [panel]" / "打开" | Query session state
"drag/drop" / "拖拽" | Send edit via SSE
"preview in timeline" | Show track summary
"Export button" / "导出" | Execute export workflow

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll the session state endpoint (GET /api/state/nemo_agent/me/<sid>/latest) to confirm the timeline changed, then tell the user what was updated.
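
The stream handling described above can be sketched with a small parser for standard server-sent-event framing. The backend's event payload format is not documented here, so this sketch treats every non-empty `data:` payload as text; a real client would also need to separate tool calls from user-facing text. Both function names are hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield one payload per SSE event; '' stands for a heartbeat or empty data."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            yield ""  # SSE comment line, commonly used as a heartbeat
        elif line.startswith("data:"):
            value = line[5:]
            # Per SSE framing, one leading space after the colon is dropped.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            yield "\n".join(data_lines)  # a blank line ends the event
            data_lines = []


def needs_state_poll(payloads: Iterable[str]) -> bool:
    """True when the stream closed without any non-empty payload."""
    return not any(payload.strip() for payload in payloads)


events = list(iter_sse_data([": ping\n", "data: Added a title track\n", "\n"]))
print(needs_state_poll(events))  # False
```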

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
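
A summary like the one above could be produced from the draft roughly as follows. The nesting is an assumption: the sketch supposes `t` is a list of tracks, each with a `tt` type and an `sg` list of segments carrying `d` in milliseconds. Names and offsets, which the example summary shows, are left out because their location in the draft is not documented.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft: dict) -> str:
    """Summarize tracks by type and total segment duration (assumed layout)."""
    tracks = draft.get("t", [])
    parts = []
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Other")
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        parts.append(f"{index}. {label} ({total_ms / 1000:g}s)")
    return f"Timeline ({len(tracks)} tracks): " + " ".join(parts)


example = {
    "t": [
        {"tt": 0, "sg": [{"d": 10000}]},
        {"tt": 1, "sg": [{"d": 4000}, {"d": 6000}]},
        {"tt": 7, "sg": [{"d": 3000}]},
    ]
}
print(summarize_draft(example))
# Timeline (3 tracks): 1. Video (10s) 2. Audio (10s) 3. Text (3s)
```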

Common Workflows

Quick edit: Upload → "create a looping music video with visualizer and track info overlay" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "create a looping music video with visualizer and track info overlay" — concrete instructions get better results.

Max file size is 500MB. Stick to MP3, MP4, WAV, JPG for the smoothest experience.

Export as MP4 with H.264 codec for best YouTube upload compatibility.
