Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Unified Video Lyrics Free

v1.0.0

Get lyrics-overlaid videos ready to post, without touching a single slider. Upload your video with audio (MP4, MOV, AVI, WebM, up to 500MB), say something li...

by peandrover adam@peand-rover
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The skill's name/description (create lyrics-overlaid videos) aligns with the network API calls and the single required credential (NEMO_TOKEN). The endpoints and actions described (upload, render, export, credits) are consistent with a cloud video-processing service.
Instruction Scope
The SKILL.md instructs the agent to automatically obtain an anonymous token if NEMO_TOKEN is not set, create sessions, store session_id, and perform uploads and exports to an external API. It also tells the agent not to display raw API responses or token values to the user — a directive that could hide sensitive data flows. The instructions reference local paths (uploading files using multipart '@ /path' and install-path detection) which could cause uploads of user files if triggered; the skill does not specify secure storage locations or explicit consent for token creation/storage.
Install Mechanism
There is no install spec and no code files — this is instruction-only, which reduces direct file-system/write risk. No external binaries or downloads are requested.
Credentials
The registry metadata lists only NEMO_TOKEN (reasonable for a remote service). However, the SKILL.md frontmatter declares a config path (~/.config/nemovideo/) and the instructions examine install paths (~/.clawhub/, ~/.cursor/skills/) to derive headers. Those filesystem checks were not reflected in the registry 'required config paths' field — an incoherence. The skill also asks the agent to generate and store tokens/sessions without specifying where or how (which can lead to unexpected credential persistence).
Persistence & Privilege
always:false and no requests to modify other skills — good. The skill can be invoked autonomously (default), which combined with its ability to create tokens, upload files, and call external endpoints increases blast radius but is not itself an unusual privilege for an integration skill.
Scan Findings in Context
[regex-scan:no-findings] expected: The scanner found no code files to analyze; this is expected because the skill is instruction-only (SKILL.md). Absence of findings is not evidence of safety — the runtime instructions are the primary surface to review.
What to consider before installing
  • The skill will call an external API (mega-api-prod.nemovideo.ai), upload video files, and use a bearer token (NEMO_TOKEN). Only provide this token if you trust that domain and its privacy/security practices.
  • The SKILL.md instructs automatic anonymous token acquisition and storing of session tokens; ask where tokens/session IDs will be stored, and whether they will persist on disk or be accessible to other apps. If you prefer, set NEMO_TOKEN yourself rather than letting the skill create one.
  • The instructions explicitly tell the agent not to show raw API responses or token values — consider this a red flag for hidden data flows and request that the skill surface important information (e.g., when it uploads or when export URLs are ready).
  • There is an inconsistency: the skill's frontmatter references a local config path (~/.config/nemovideo/) and checks install paths (~/.clawhub/, ~/.cursor/skills/) but those were not declared in the registry metadata. Confirm whether the skill will read/write these paths and why.
  • Be cautious about sensitive videos or private content: files will be uploaded to an external service. If the content is sensitive, avoid using this skill or verify the provider's data retention policy.
  • If you proceed, monitor network activity during first runs and consider limiting the skill's permissions (e.g., do not supply a long-lived token or supply it only temporarily). Ask the skill author to clarify storage, retention, and visibility of tokens and uploaded files.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎵 Clawdis
Env: NEMO_TOKEN
Primary env: NEMO_TOKEN
latest: vk9748tffe3nfe4qzdmzr346nph855kp4
29 downloads
0 stars
1 version
Updated 19h ago
v1.0.0
MIT-0

Getting Started

Ready when you are. Drop your video with audio here or describe what you want to make.

Try saying:

  • "turn a 3-minute music video recording into a 1080p MP4"
  • "sync lyrics to the music video and display them as on-screen text"
  • "adding synchronized lyrics to music videos for free for musicians and content creators"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN — 100 free credits, valid 7 days.
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
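
The two setup steps above can be sketched as plain request builders. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the assumption that the token sits at data.token in the parsed JSON follows the description in step 1. Nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id=None):
    """Step 1: POST with an X-Client-Id header (a random UUID by default)."""
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        API_BASE + "/api/auth/anonymous-token",
        data=b"",
        headers={"X-Client-Id": client_id},
        method="POST",
    )


def extract_token(payload):
    """Assumes the parsed JSON response carries the token at data.token."""
    return payload["data"]["token"]


def build_session_request(token, language="en"):
    """Step 2: POST the session body with a bearer token."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        API_BASE + "/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Keeping the builders separate from the network call makes it easy to log what will be sent (minus the token value) before anything leaves the machine.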

Unified Video Lyrics Free — Sync and overlay video lyrics

Send me your video with audio and describe the result you want. The AI lyrics sync runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload a 3-minute music video recording, type "sync lyrics to the music video and display them as on-screen text", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.

Worth noting: shorter song clips sync lyrics faster and more accurately.

Matching Input to Actions

User prompts referencing unified video lyrics free, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says... | Action | Skip SSE?
"export" / "导出" / "download" / "send me the video" | → §3.5 Export |
"credits" / "积分" / "balance" / "余额" | → §3.3 Credits |
"status" / "状态" / "show tracks" | → §3.4 State |
"upload" / "上传" / user sends file | → §3.2 Upload |
Everything else (generate, edit, add BGM…) | → §3.1 SSE |
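
The routing table can be approximated with a simple keyword lookup. This is a rough sketch only: it uses case-insensitive substring matching, which is an assumption and much cruder than the intent classification mentioned above. The action labels and the route function are hypothetical names.

```python
# Keyword groups taken from the routing table; action labels are illustrative.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_file=False):
    """Return the action label for a user message."""
    if has_file:
        # A file attachment always goes to the upload action.
        return "upload"
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    # Everything else (generate, edit, add BGM...) goes to the SSE endpoint.
    return "sse"
```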

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers described below (X-Skill-Source, X-Skill-Version, X-Skill-Platform). If any header is missing, exports return 402.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is unified-video-lyrics-free, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).
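
One way to assemble these headers is sketched below. The helper names are hypothetical, and the path check (looking for a .clawhub component, or a .cursor component followed by skills) is an assumption about how the install path would be matched.

```python
from pathlib import PurePosixPath


def detect_platform(install_path):
    """Map an install path to the X-Skill-Platform value."""
    parts = PurePosixPath(install_path).parts
    if ".clawhub" in parts:
        return "clawhub"
    # Look for ".cursor" immediately followed by "skills".
    for current, following in zip(parts, parts[1:]):
        if current == ".cursor" and following == "skills":
            return "cursor"
    return "unknown"


def attribution_headers(token, version, install_path):
    """Build the auth header plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "unified-video-lyrics-free",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Note that this reads nothing from disk; the caller supplies the install path, so it stays explicit which filesystem locations are inspected.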

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
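
The polling step of the export workflow might look like the sketch below. It assumes the parsed status response exposes status and output.url at the top level, and it takes the HTTP call as an injected function (fetch_status, a hypothetical name) so the loop itself stays testable. The attempt cap is an added safety assumption, not something the API specifies.

```python
import time


def poll_render(fetch_status, render_id, interval_s=30.0, max_attempts=30,
                sleep=time.sleep):
    """Poll a render job until it completes and return its download URL.

    fetch_status(render_id) should perform
    GET /api/render/proxy/lambda/<id> and return the parsed JSON as a dict.
    """
    for attempt in range(max_attempts):
        job = fetch_status(render_id)
        if job.get("status") == "completed":
            return job["output"]["url"]
        # Wait between polls, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"render {render_id} did not complete in time")
```

With the defaults this gives up after roughly 15 minutes; failure statuses are not handled here because the source does not list them.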

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — export blocked on the free plan; this is a subscription-tier restriction, not a credit shortage
  • 429 — rate limited; wait 30s and retry once
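
The list above maps naturally onto a lookup table. In this sketch the action names are hypothetical labels chosen for illustration; only the codes and their meanings come from the list.

```python
# Codes from the list above; the action labels are illustrative names.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "show_registration_or_topup",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def next_action(code):
    """Return the recovery action for a code, or a fallback for unknown codes."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```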

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says | You do
"click [button]" / "点击" | Execute via API
"open [panel]" / "打开" | Query session state
"drag/drop" / "拖拽" | Send edit via SSE
"preview in timeline" | Show track summary
"Export button" / "导出" | Execute export workflow

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.
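
A line-level classifier for the stream could look like the sketch below. It relies on standard SSE framing (data: fields, comment lines starting with a colon, blank lines ending an event); treating comment lines as heartbeats is an assumption, and both function names are hypothetical.

```python
def classify_sse_line(line):
    """Classify one raw SSE line as data, heartbeat, boundary, or other."""
    line = line.rstrip("\r\n")
    if line == "":
        # A blank line terminates the current event.
        return ("boundary", None)
    if line.startswith(":"):
        # SSE comment lines are commonly used as keep-alives.
        return ("heartbeat", None)
    if line.startswith("data:"):
        payload = line[len("data:"):].strip()
        # An empty data: line means the backend is still working.
        return ("data", payload) if payload else ("heartbeat", None)
    return ("other", line)


def due_for_notice(last_notice_s, now_s, every_s=120.0):
    """True when it is time to show another 'Still working...' message."""
    return now_s - last_notice_s >= every_s
```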

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
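
A track summary like the one above could be derived from the draft with a small helper. This sketch assumes a nesting that the field mapping does not spell out: t is a list of tracks, each track carries tt and a list sg of segments, and each segment carries d in milliseconds. It also sums segment durations, which assumes segments do not overlap. The function name is hypothetical.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return summary lines for a draft, one per track."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        # Sum segment durations (ms) and report seconds.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return lines
```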

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "sync lyrics to the music video and display them as on-screen text" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.

Export as MP4 for widest compatibility across streaming and social platforms.

Common Workflows

Quick edit: Upload → "sync lyrics to the music video and display them as on-screen text" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
