Text To Video Create Ai

v1.0.0

Get AI-generated videos ready to post, without touching a single slider. Upload your text prompts (TXT, DOCX, PDF, SRT, up to 200MB), say something like "tur...

Security Scan
VirusTotal: Benign
OpenClaw: Benign (medium confidence)
Purpose & Capability
Name/description (text-to-video for marketers) align with required env var NEMO_TOKEN, the listed endpoints, upload/export APIs, and expected artifacts (MP4, audio, images). The declared config path (~/.config/nemovideo/) and token are coherent with connecting to a hosted rendering backend.
Instruction Scope
SKILL.md instructs the agent to obtain or use a NEMO_TOKEN, create sessions, upload user files (up to 200MB), stream SSE events, poll for render status, and return download URLs — all expected. It also asks the agent to detect install path (~/.clawhub/, ~/.cursor/skills/) to populate attribution headers; this requires checking the local filesystem (minor privacy surface). Instructions explicitly hide raw API responses/token values from the user (internal handling). No instructions request unrelated system files, other credentials, or broad data collection beyond user uploads.
Install Mechanism
No install spec or code is present (instruction-only skill). No downloads or archive extraction are requested; nothing will be written to disk by an installer step beyond normal runtime session storage described in the doc.
Credentials
Only one environment credential is required (NEMO_TOKEN), which matches the documented API Authorization header. No unrelated secrets or multiple external credentials are requested.
Persistence & Privilege
Skill is not always-enabled and does not request elevated platform privileges. It does instruct storing the session_id/token for subsequent calls (normal for a remote session-based API). It does ask the agent to detect install paths for attribution headers, which is a limited filesystem probe but not a broad privilege escalation.
Assessment
This skill appears to do what it says: it will upload your text and media to an external nemovideo.ai backend, obtain and store short-lived anonymous tokens if none are provided, and return download links for rendered videos. Before installing, consider: 1) you are sending content (and potentially sensitive text/media) to an external, unvetted service — review any privacy/terms you can find for that service; 2) the skill probes a couple of local paths to build attribution headers (minor privacy leak about your environment); 3) if you have sensitive data, do not upload it here; and 4) prefer supplying your own NEMO_TOKEN only if you trust the service. Because the skill is instruction-only and the source/homepage are unknown, proceed only if you trust this third-party backend.


Runtime requirements

🎬 Clawdis
Env: NEMO_TOKEN
Primary env: NEMO_TOKEN
Latest: vk975t3wgew777nb7m0evgq1v4d85bv2s
Downloads: 34
Stars: 0
Versions: 1
Updated 21h ago
License: MIT-0

Getting Started

Got text prompts to work with? Send them over and tell me what you need. I'll take care of the AI video creation.

Try saying:

  • "turn a 100-word product description into a 1080p MP4"
  • "turn this script into a 30-second promotional video with visuals and music"
  • "generate a marketing video from a written script or text prompt"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN — 100 free credits, valid 7 days.
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
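
The two setup steps above can be sketched with the Python standard library. This is a minimal sketch, not the skill's actual implementation: the function names are hypothetical, the token request is assumed to need no body, and only the URLs, headers, JSON body, and the `data.token` field come from the steps above. The shape of the session response is not documented here, so the sketch stops at building that request.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id: str) -> urllib.request.Request:
    """Step 1: POST with X-Client-Id set to a random UUID (no body assumed)."""
    return urllib.request.Request(
        f"{BASE_URL}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str, language: str) -> urllib.request.Request:
    """Step 2: create a session with the bearer token and a JSON body."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def fetch_anonymous_token(timeout_s: float = 30.0) -> str:
    """Perform step 1 over the network and return data.token."""
    request = build_token_request(str(uuid.uuid4()))
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)["data"]["token"]


def resolve_token(env, fetch_token=fetch_anonymous_token) -> str:
    """Use NEMO_TOKEN when it is set; otherwise obtain a free token."""
    return env.get("NEMO_TOKEN") or fetch_token()
```

Calling `resolve_token(os.environ)` would follow the authentication check described above. The token is returned to the caller and never printed, in keeping with the rule about not displaying token values.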

Text to Video Create AI — Generate Videos from Text

This tool takes your text prompts and runs AI video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 100-word product description and want to turn it into a 30-second promotional video with visuals and music: the backend processes it in about 1-2 minutes and returns a 1080p MP4.

Tip: shorter, clearer prompts produce more accurate video results.

Matching Input to Actions

Prompts that mention text-to-video creation, aspect ratio, text overlays, or audio tracks are routed to the corresponding action by keyword and intent classification.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | |
| "status" / "状态" / "show tracks" | → §3.4 State | |
| "upload" / "上传" / user sends file | → §3.2 Upload | |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | |
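
The keyword half of this routing can be sketched as a small lookup. This is a minimal sketch under stated assumptions: the action names and the `route` function are hypothetical, and plain substring matching stands in for the intent classification, which the table does not describe.

```python
# Keyword groups copied from the routing table; action names are hypothetical.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> str:
    """Return the action for a user message; anything unmatched goes to SSE."""
    if has_file:
        # "user sends file" routes to upload regardless of the text.
        return "upload"
    text = message.casefold()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"
```

Substring matching is deliberately naive: a message such as "don't download yet" would still route to export, which is where a real intent classifier would need to step in.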

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

  1. Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
  2. Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
  3. Upload: POST /api/upload-video/nemo_agent/me/<sid> with a multipart file or JSON with URLs.
  4. Credits: GET /api/credits/balance/simple. Returns available, frozen, total.
  5. State: GET /api/state/nemo_agent/me/<sid>/latest. Returns the current draft and media info.
  6. Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
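
Two of the endpoints above lend themselves to a short sketch: the chat body and the export polling loop. This is a minimal sketch under stated assumptions. The `status` field name is a guess at the poll response shape, `fetch_status` is a hypothetical callable that performs the GET and returns parsed JSON, and the chat endpoint may require fields beyond the two shown.

```python
import time
from typing import Callable


def build_chat_payload(session_id: str, text: str) -> dict:
    """Body for POST /run_sse: the message sits at new_message.parts[0].text."""
    return {
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }


def poll_render(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the render reports completion, waiting interval_s between tries."""
    for attempt in range(max_polls):
        reply = fetch_status()
        if reply.get("status") == "completed":  # field name is an assumption
            return reply
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"render not completed after {max_polls} polls")
```

Passing `sleep` and `fetch_status` as parameters keeps the loop testable without a network connection or real 30-second waits.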

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: text-to-video-create-ai
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
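
The header set can be sketched as two small helpers. This is a minimal sketch: the function names are hypothetical, the version is passed in rather than parsed from the frontmatter, and platform detection is done by pure path comparison, so nothing is read from disk.

```python
from pathlib import Path


def detect_platform(skill_path: Path, home: Path) -> str:
    """Map the skill's install location to a platform name."""
    markers = (
        (home / ".clawhub", "clawhub"),
        (home / ".cursor" / "skills", "cursor"),
    )
    for root, name in markers:
        if root == skill_path or root in skill_path.parents:
            return name
    return "unknown"


def build_headers(token: str, version: str, platform: str) -> dict:
    """Authorization plus the three attribution headers required on every call."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "text-to-video-create-ai",
        "X-Skill-Version": version,
        "X-Skill-Platform": platform,
    }
```

A caller would typically compute the platform once, for example with `detect_platform(Path(__file__), Path.home())`, and reuse the resulting headers for every request in the session.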

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
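
The draft field mapping can be turned into a track summary along these lines. This is a minimal sketch, and the nesting is an assumption: it supposes the draft holds a list of tracks under `t`, each with a type `tt` and a list of segments `sg`, and that each segment carries its duration `d` in milliseconds. The real draft may be shaped differently.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> str:
    """Build a short, human-readable timeline summary from a draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Summing assumes segments play back to back within a track.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"  {index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return "\n".join(lines)
```

A summary like this is one way to answer "show tracks" without exposing the raw draft JSON to the user.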

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
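
The stream handling described above can be sketched as a line classifier plus a collector. This is a minimal sketch under stated assumptions: the function names are hypothetical, SSE comment lines (starting with ":") are treated as heartbeats, and the payload is returned as raw text because the event format that separates text events from tool calls is not specified here.

```python
import time
from typing import Callable, Iterable, List, Tuple


def classify_sse_line(line: str) -> Tuple[str, str]:
    """Label one raw SSE line as 'data', 'heartbeat', or 'other'."""
    stripped = line.rstrip("\r\n")
    if stripped.startswith(":"):
        return ("heartbeat", "")
    if stripped.startswith("data:"):
        payload = stripped[len("data:"):].strip()
        return ("data", payload) if payload else ("heartbeat", "")
    return ("other", stripped)


def collect_stream(
    lines: Iterable[str],
    clock: Callable[[], float] = time.monotonic,
    notify: Callable[[str], None] = print,
    every_s: float = 120.0,
) -> List[str]:
    """Gather data payloads, sending a progress note at most every 2 minutes."""
    last_notice = clock()
    payloads: List[str] = []
    for line in lines:
        kind, payload = classify_sse_line(line)
        if kind == "data":
            payloads.append(payload)
        elif kind == "heartbeat" and clock() - last_notice >= every_s:
            notify("⏳ Still working...")
            last_notice = clock()
    return payloads
```

An empty return value corresponds to the case where the stream closes without any text, which is the cue to poll the state endpoint and report what changed.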

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
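
The table maps naturally onto a lookup keyed by code. This is a minimal sketch: the action names and both functions are hypothetical, and the registration URL is a caller-supplied parameter because this document does not give one.

```python
# Codes from the error table; action names are hypothetical labels.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_new_session",
    2001: "report_no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_plan_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Look up the handling for a backend code; unknown codes are surfaced."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def no_credits_message(
    is_anonymous: bool, registration_url: str = "", bind_id: str = ""
) -> str:
    """Wording for code 2001, which differs for anonymous and registered users."""
    if is_anonymous:
        return f"You're out of credits. Register here: {registration_url}?bind={bind_id}"
    return "Top up credits in your account"
```

Keeping 402 and 2001 as separate entries reflects the distinction in the table: 402 is a subscription tier issue, not a credit shortage.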

Common Workflows

Quick edit: Upload → "turn this script into a 30-second promotional video with visuals and music" → Download MP4. Takes 1-2 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn this script into a 30-second promotional video with visuals and music" — concrete instructions get better results.

Max file size is 200MB. Stick to TXT, DOCX, PDF, SRT for the smoothest experience.
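
A pre-upload check along these lines could catch oversized or unsupported files before any request is sent. This is a minimal sketch with two assumptions: "200MB" is read as 200 × 1024 × 1024 bytes, and both the text formats named here and the media formats listed earlier are treated as acceptable. The function name and return values are hypothetical.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # assumes MB means 2**20 bytes

TEXT_FORMATS = {".txt", ".docx", ".pdf", ".srt"}
MEDIA_FORMATS = {
    ".mp4", ".mov", ".avi", ".webm", ".mkv",
    ".jpg", ".png", ".gif", ".webp",
    ".mp3", ".wav", ".m4a", ".aac",
}


def check_upload(filename: str, size_bytes: int) -> str:
    """Return 'ok', 'unsupported', or 'too_large' for a candidate upload."""
    suffix = Path(filename).suffix.lower()
    if suffix not in TEXT_FORMATS | MEDIA_FORMATS:
        return "unsupported"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "too_large"
    return "ok"
```

The two failure values line up with backend codes 4001 and 4002, so the same user-facing messages can be reused whether the problem is caught locally or reported by the server.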

Export as MP4 for widest compatibility.
