Text To Video Best App

v1.0.0

Turn a 150-word product description into 1080p ready-to-share videos just by typing what you need. Whether it's generating videos from written scripts or tex...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for whitejohnk-26/text-to-video-best-app.

Install the skill "Text To Video Best App" (whitejohnk-26/text-to-video-best-app) from ClawHub.
Skill page: https://clawhub.ai/whitejohnk-26/text-to-video-best-app
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: NEMO_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install text-to-video-best-app

ClawHub CLI

npx clawhub@latest install text-to-video-best-app

Security Scan

VirusTotal: Benign
OpenClaw: Benign (medium confidence)

Purpose & Capability
Name/description (text-to-video) match the actions described in SKILL.md (session creation, uploads, render/export endpoints). The single required credential (NEMO_TOKEN) is appropriate for authenticating to the stated backend.
Instruction Scope
Runtime instructions include creating an anonymous token if none is present, creating/storing a session_id, uploading user files (multipart or URL), and polling render status. These are expected for a cloud render workflow, but the skill will perform network requests and may upload user-supplied local files when the user asks to upload — ensure you only upload files you intend to share. The skill also derives X-Skill-Platform from install paths, which implies it may check filesystem paths to determine platform attribution.
Install Mechanism
Instruction-only skill with no install spec and no code files. No downloads or archive extraction are requested, so there is low installation risk.
Credentials
Only NEMO_TOKEN is declared as required, which is proportional. However, the SKILL.md frontmatter also lists a config path (~/.config/nemovideo/) while the registry metadata reported none — a minor metadata mismatch worth clarifying. The skill's behavior to auto-request an anonymous token if NEMO_TOKEN is absent is consistent with its purpose.
Persistence & Privilege
always is false and the skill does not request persistent system-level privileges. It instructs storing a session_id for use during the session, which is normal. Autonomous invocation is allowed (platform default) but is not combined with broad credential access here.
Assessment
This skill appears to be what it says: a cloud text→video frontend that uses a single service token. Before installing: (1) Confirm you trust the domain (mega-api-prod.nemovideo.ai) since the skill will upload text and any files you provide; (2) If you prefer, supply your own NEMO_TOKEN rather than letting the skill obtain an anonymous token; (3) Be mindful when uploading local files — only upload files you intend to send to the service; (4) Clarify the config path mention (~/.config/nemovideo/) if you’re concerned about the skill reading or writing config files; (5) Because the agent can call this skill autonomously, consider whether you want it enabled for unattended runs. If you want a higher-assurance review, provide the service domain's privacy/terms or network endpoints for verification.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎬 Clawdis
Env: NEMO_TOKEN
Primary env: NEMO_TOKEN
Latest: vk974j9ksnf0ztapa3ms71cj3ms84y9gf
Downloads: 72
Stars: 0
Versions: 1
Updated 1w ago
v1.0.0
License: MIT-0

Getting Started

Send me your text prompts and I'll handle the AI video creation. Or just describe what you're after.

Try saying:

  • "convert a 150-word product description into a 1080p MP4"
  • "turn this script into a 30-second video with visuals and background music"
  • "generate a video from a written script or text prompt for marketers, content creators, and educators"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN — 100 free credits, valid 7 days.
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
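
The two setup steps above can be sketched as request builders. This is a minimal sketch, assuming Python's standard `urllib`; the helper names `build_token_request` and `build_session_request` are hypothetical, and only the URLs, headers, and body fields come from the steps above. Nothing is sent until a request is passed to `urllib.request.urlopen`.

```python
import json
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id=None):
    """Step 1: build the anonymous-token request (hypothetical helper)."""
    # A random UUID serves as the client identifier when none is supplied.
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        API_BASE + "/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token, language):
    """Step 2: build the session-creation request (hypothetical helper)."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        API_BASE + "/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )
```

A caller would read `data.token` from the first response and the `session_id` from the second. The exact nesting of `session_id` in the response is not documented here, so inspect the returned JSON before relying on a path.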

Keep setup communication brief. Don't display raw API responses or token values to the user.

Text to Video Best App — Convert Text Into Shareable Videos

This tool takes your text prompts and runs AI video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 150-word product description and want to turn this script into a 30-second video with visuals and background music — the backend processes it in about 1-2 minutes and hands you a 1080p MP4.

Tip: scripts under 100 words render faster and produce more focused videos.

Matching Input to Actions

Prompts that mention text-to-video generation, aspect ratio, text overlays, or audio tracks are routed to the corresponding action by keyword and intent classification.

User says... → Action
"export" / "导出" / "download" / "send me the video" → §3.5 Export
"credits" / "积分" / "balance" / "余额" → §3.3 Credits
"status" / "状态" / "show tracks" → §3.4 State
"upload" / "上传" / user sends file → §3.2 Upload
Everything else (generate, edit, add BGM…) → §3.1 SSE
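
The keyword half of this routing can be sketched as a lookup. This is a minimal sketch, assuming plain substring matching; the `route` function and the action names it returns are hypothetical, and the intent classification mentioned above is not modelled.

```python
# Keyword groups copied from the routing table; action names are hypothetical.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_file=False):
    """Return the action for a user message; fall back to the SSE path."""
    if has_file:
        # A message that carries a file is treated as an upload.
        return "upload"
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"
```

Rows are checked top to bottom, so a message that matches several groups takes the first one. Substring matching is deliberately crude and will misfire on phrases that merely contain a keyword.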

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites the video layers, applies platform-specific compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries the render job IDs, so closing the tab before the render completes orphans the job.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is text-to-video-best-app, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
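
The header rules above can be sketched as follows. This is a minimal sketch: `detect_platform` and `build_headers` are hypothetical helpers, and the path check looks only for the `.clawhub/` and `.cursor/skills/` directory names rather than verifying that they sit under the home directory.

```python
SKILL_SOURCE = "text-to-video-best-app"


def detect_platform(install_path):
    """Map an install path to a platform label (hypothetical helper)."""
    # Normalise separators and add a trailing slash so directory names match.
    path = install_path.replace("\\", "/") + "/"
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Build the authorization and attribution headers sent with every request."""
    return {
        "Authorization": "Bearer " + token,
        "X-Skill-Source": SKILL_SOURCE,
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

For example, `build_headers(token, "1.0.0", "~/.clawhub/skills/text-to-video-best-app")` yields an `X-Skill-Platform` of `clawhub`.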

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
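
A request for the SSE endpoint might be assembled like this. This is a minimal sketch using Python's standard `urllib`; `build_sse_request` is a hypothetical helper, and the `Content-Type: application/json` header is an assumption, since only `Accept` is specified above.

```python
import json
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"
SSE_TIMEOUT_SECONDS = 15 * 60  # the stated maximum timeout of 15 minutes


def build_sse_request(session_id, message, auth_headers):
    """Build the POST /run_sse request for one user message (hypothetical helper)."""
    payload = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }
    headers = dict(auth_headers)  # Authorization plus the X-Skill-* headers
    headers["Accept"] = "text/event-stream"
    headers["Content-Type"] = "application/json"  # assumption, not stated above
    return urllib.request.Request(
        API_BASE + "/run_sse",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers=headers,
    )
```

The request would then be opened with `urllib.request.urlopen(request, timeout=SSE_TIMEOUT_SECONDS)` and the response read line by line.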

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
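
The export-then-poll flow can be sketched with the network call injected, so the loop itself stays testable. This is a minimal sketch: `build_export_body` and `poll_render` are hypothetical helpers, the `max_polls` cap is an assumption, and the response is assumed to expose `status` and `output.url` at the top level of the JSON.

```python
import time


def build_export_body(session_id, draft, render_id):
    """Body for POST /api/render/proxy/lambda (hypothetical helper)."""
    return {
        "id": render_id,
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(fetch_status, render_id, interval=30, max_polls=30, sleep=time.sleep):
    """Poll until the render completes and return the download URL.

    fetch_status(render_id) must return the parsed JSON from
    GET /api/render/proxy/lambda/<id>. max_polls is an assumed safety cap.
    """
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError("render " + render_id + " did not complete in time")
```

The render id follows the `render_<ts>` pattern; whether `<ts>` is in seconds or milliseconds is not stated, so the caller picks one and keeps it consistent between the POST and the polling GET.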

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

Event → Action
Text response → Apply GUI translation (§4), present to user
Tool call/result → Process internally, don't forward
heartbeat / empty data: → Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes → Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
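
The stream handling described above can be sketched as a function that collects user-facing text and signals when there was none. This is a minimal sketch: `collect_text` is a hypothetical helper, and the event shape (`content.parts[].text`, mirroring the request's `parts`) is an assumption, since the response schema is not documented here.

```python
import json


def collect_text(lines):
    """Return the text found in an SSE stream, or None if it carried no text."""
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty data: line, keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # non-JSON payloads such as a bare heartbeat marker
        if not isinstance(event, dict):
            continue
        content = event.get("content") or {}
        for part in content.get("parts", []):
            # Parts without text (tool calls, tool results) are not forwarded.
            text = part.get("text")
            if text:
                chunks.append(text)
    return "".join(chunks) or None
```

When `collect_text` returns `None`, the caller would fetch the session state to confirm the edit was applied and then summarize the change.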

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says → You do
"click [button]" / "点击" → Execute via API
"open [panel]" / "打开" → Query session state
"drag/drop" / "拖拽" → Send edit via SSE
"preview in timeline" → Show track summary
"Export button" / "导出" → Execute export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):

  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
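
A track summary can be derived from the draft's short field names. This is a minimal sketch: `summarize_draft` is a hypothetical helper, and the nesting (a top-level `t` list of tracks, each with `tt` and an `sg` list whose segments carry `d` in milliseconds) is an assumption based on the field mapping, not a documented schema.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return summary lines for a draft, assuming the nesting described above."""
    tracks = draft.get("t", [])
    lines = ["Timeline (" + str(len(tracks)) + " tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Durations are in milliseconds; sum them per track.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```

Segment names, volume levels, and start offsets would come from the `m` metadata, whose layout is not described here, so this sketch reports only type, segment count, and total duration.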

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once
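
The codes above map naturally onto a dispatch table. This is a minimal sketch: the action names, `next_action`, and `call_with_rate_limit_retry` are hypothetical, and `call` stands in for whatever function performs the request and returns its code.

```python
import time

# Codes from the list above; the action names are hypothetical labels.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_smaller_file",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_and_retry_once",
}


def next_action(code):
    """Look up what to do for a response code; unknown codes get reported."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(call, sleep=time.sleep):
    """Run call(); on 429, wait 30 seconds and retry exactly once."""
    code = call()
    if code == 429:
        sleep(30)
        code = call()
    return code
```

Retrying only once keeps a persistent rate limit from turning into a loop; a second 429 is returned to the caller unchanged.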

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn this script into a 30-second video with visuals and background music" — concrete instructions get better results.

Max file size is 200MB. Stick to TXT, DOCX, PDF, SRT for the smoothest experience.

Export as MP4 for widest compatibility across social platforms and devices.

Common Workflows

Quick edit: Upload → "turn this script into a 30-second video with visuals and background music" → Download MP4. Takes 1-2 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
