Image To Video Mp4

v1.0.0

Get animated MP4 video ready to post, without touching a single slider. Upload your still images (JPG, PNG, WEBP, HEIC, up to 200MB), say something like "tur...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for susan4731-wilfordf/image-to-video-mp4.

Install the skill "Image To Video Mp4" (susan4731-wilfordf/image-to-video-mp4) from ClawHub.
Skill page: https://clawhub.ai/susan4731-wilfordf/image-to-video-mp4
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: NEMO_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install image-to-video-mp4

ClawHub CLI

npx clawhub@latest install image-to-video-mp4

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The skill's name/description (convert images to MP4 via cloud render) aligns with the declared NEMO_TOKEN credential and the API endpoints in SKILL.md. The included metadata config path (~/.config/nemovideo/) and install-path detection are plausible (reading local client config or inferring platform) though not strictly required for basic operation—this is a minor mismatch but explainable.
Instruction Scope
Most instructions stay on task: authenticate (env token or anonymous-token), create a session, upload images, stream SSE for progress, and run export. Two small scope ambiguities: (1) the docs show multipart uploads using a filesystem path ("-F 'files=@/path'") which could imply reading local disk — in practice the agent should use files explicitly uploaded in-chat; (2) the skill asks to 'save' session_id and to read YAML frontmatter and install path for attribution, which means the agent will read certain local/config locations. These are not obviously malicious but are broader than strictly converting images and should be noted.
Install Mechanism
There is no install spec and no code files — instruction-only. This is low-risk from an installation/execution standpoint because nothing is downloaded or written to disk by an installer.
Credentials
Only NEMO_TOKEN is required as an API credential, which is proportional for a cloud-rendering service. The metadata also declares access to ~/.config/nemovideo/, which could contain additional account data; the SKILL.md's behavior (using tokens and sessions) justifies needing a token but not necessarily reading arbitrary config files—clarify whether reading that config is necessary before granting access.
Persistence & Privilege
The skill does not request 'always: true' and has no install persistence. It can be invoked autonomously (the platform default) to call external APIs using the provided token; this is expected behavior for a remote-rendering skill.
Assessment
This skill appears to do what it says: it uploads the images you give it to nemovideo's cloud API and returns an MP4. Before installing or using it:

  1. Confirm you trust the domain https://mega-api-prod.nemovideo.ai and the service's privacy policy, because your images (and any embedded metadata) will be transmitted off-device.
  2. Prefer an anonymous or limited-scope token over a long-lived account token; the anonymous tokens the skill describes expire in 7 days.
  3. Be cautious about allowing access to local config (~/.config/nemovideo/) or arbitrary file paths; only provide the images you intend to upload.
  4. If you need stronger guarantees, ask the skill author to clarify why it needs to read local config and install paths, and whether session tokens are persisted or only kept in memory.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🖼️ Clawdis
Env: NEMO_TOKEN
Primary env: NEMO_TOKEN
Latest: vk97cjnxpggrktwse4796f9by5x85989w
Downloads: 88
Stars: 0
Versions: 1
Updated: 1w ago
Version: v1.0.0
License: MIT-0

Getting Started

Share your still images and I'll get started on AI video creation. Or just tell me what you're thinking.

Try saying:

  • "convert my still images"
  • "export 1080p MP4"
  • "turn my photos into a smooth MP4 video with transitions"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
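The setup steps above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the function names are hypothetical, the Content-Type header is an assumption, and the requests are only prepared here so the caller controls when anything is sent.

```python
import json
import os
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Prepare the anonymous-token POST with the X-Client-Id header."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str) -> urllib.request.Request:
    """Prepare the create-session POST with Bearer auth and the documented body."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def resolve_token(send=urllib.request.urlopen) -> str:
    """Use NEMO_TOKEN if it is set; otherwise request a free anonymous token."""
    token = os.environ.get("NEMO_TOKEN")
    if token:
        return token
    request = build_anonymous_token_request(str(uuid.uuid4()))
    with send(request) as response:
        payload = json.load(response)
    # The source states the token lives at data.token in the response.
    return payload["data"]["token"]
```

The token is returned to the caller and never printed, in line with the instruction not to show tokens or raw JSON to the user.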

Image to Video MP4 — Convert Photos into MP4 Videos

Drop your still images in the chat and tell me what you need. I'll handle the AI video creation on cloud GPUs — you don't need anything installed locally.

Here's a typical use: you send three product photos in JPG format, ask me to "turn my photos into a smooth MP4 video with transitions", and about 30-60 seconds later you have an MP4 file ready to download. The whole thing runs at 1080p by default.

One thing worth knowing — using fewer than 10 images keeps processing fast and transitions smooth.

Matching Input to Actions

User prompts referencing image to video mp4, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | |
| "status" / "状态" / "show tracks" | → §3.4 State | |
| "upload" / "上传" / user sends file | → §3.2 Upload | |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | |
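A rough sketch of the keyword half of this routing is shown below. It assumes plain substring matching and that an attached file always wins; the real classifier also uses intent, which this example does not attempt. The names `ROUTES` and `route` are hypothetical.

```python
# Keyword groups in the same order as the routing rules above.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_attachment: bool = False) -> str:
    """Return the action name for a user message; fall back to SSE."""
    # Assumption: a message that arrives with a file is always an upload.
    if has_attachment:
        return "upload"
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    # Generate, edit, add BGM and anything else goes through the SSE endpoint.
    return "sse"
```

Substring matching is deliberately naive here; a production router would need to handle messages that mention several keywords at once.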

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request — omitting them triggers a 402 on export.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: image-to-video-mp4
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
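One way to assemble these headers is sketched below. It assumes the install-path rule means "a path under ~/.clawhub/ maps to clawhub, a path under ~/.cursor/skills/ maps to cursor", and that the frontmatter is a simple `key: value` block between `---` lines. The helper names and the "unknown" version fallback are hypothetical.

```python
import re


def frontmatter_version(skill_md: str) -> str:
    """Read the version field from a leading YAML frontmatter block."""
    match = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if not match:
        return "unknown"
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "version":
            return value.strip().strip("\"'")
    return "unknown"


def detect_platform(install_path: str) -> str:
    """Infer the host platform from where the skill file is installed."""
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def request_headers(token: str, skill_md: str, install_path: str) -> dict:
    """Build the auth and attribution headers sent on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "image-to-video-mp4",
        "X-Skill-Version": frontmatter_version(skill_md),
        "X-Skill-Platform": detect_platform(install_path),
    }
```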

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
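As an illustration, the SSE message request could be prepared like this. The body and the Accept header follow the description above; the function names and the Content-Type header are assumptions, and the request is only built, not sent.

```python
import json
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"
SSE_TIMEOUT_SECONDS = 15 * 60  # documented maximum of 15 minutes


def build_sse_payload(session_id: str, message: str) -> dict:
    """Return the JSON body for POST /run_sse."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }


def build_sse_request(token: str, session_id: str, message: str) -> urllib.request.Request:
    """Prepare the streaming request; pass SSE_TIMEOUT_SECONDS when opening it."""
    body = json.dumps(build_sse_payload(session_id, message)).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/run_sse",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "text/event-stream",
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
    )
```

The attribution headers required on every request are left out of this sketch for brevity and would need to be merged in before sending.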

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
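The export-and-poll flow might look like the sketch below. Several details are assumptions: `<ts>` is taken to be a millisecond timestamp, the status and output.url fields are read from the top level of the poll response, and the cap of 30 polls is arbitrary. `fetch_status` is a hypothetical callable that performs the GET and returns parsed JSON.

```python
import time
from typing import Callable, Optional


def build_export_body(session_id: str, draft: dict, now: Optional[float] = None) -> dict:
    """Return the JSON body for POST /api/render/proxy/lambda."""
    # Assumption: <ts> in the render id is a millisecond timestamp.
    ts = int((time.time() if now is None else now) * 1000)
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def wait_for_render(
    fetch_status: Callable[[str], dict],
    render_id: str,
    interval: float = 30.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll every `interval` seconds until the job completes; return the URL."""
    for attempt in range(max_polls):
        job = fetch_status(render_id)
        if job.get("status") == "completed":
            return job["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real waiting.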

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
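The error codes above can be turned into a small lookup, sketched here. The action labels and helper names are hypothetical shorthand for the listed actions, and the fallback for unlisted codes is an assumption.

```python
from typing import Optional

# Short labels for the documented action of each code.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauth_anonymous_token",
    1002: "create_new_session",
    2001: "handle_no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Map a response code to an action; unlisted codes are reported as-is."""
    return ERROR_ACTIONS.get(code, "report_unrecognized_code")


def retry_delay(code: int, attempts_so_far: int) -> Optional[float]:
    """Return a wait in seconds if a retry is allowed, else None."""
    # 429 is retried a single time after 30 seconds.
    if code == 429 and attempts_so_far == 0:
        return 30.0
    return None


def no_credits_message(is_anonymous: bool, registration_url: str, bind_id: str) -> str:
    """Build the user-facing text for code 2001."""
    if is_anonymous:
        # registration_url is supplied by the caller; the source does not give it.
        return f"{registration_url}?bind={bind_id}"
    return "Top up credits in your account"
```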

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

  • "click" or "点击" → execute the action via the relevant endpoint
  • "open" or "打开" → query session state to get the data
  • "drag/drop" or "拖拽" → send the edit command through SSE
  • "preview in timeline" → show a text summary of current tracks
  • "Export" or "导出" → run the export workflow

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
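A minimal sketch of this stream handling follows. It relies only on the standard event-stream line format (`data:` fields and `:` comment lines); the shape of each payload is not documented here, so telling text events from tool calls is left to a hypothetical `is_text` predicate supplied by the caller.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

PROGRESS_INTERVAL_SECONDS = 120  # show "Still working..." every 2 minutes


def classify_sse_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("heartbeat", "") or ("data", payload) for each meaningful line."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            # Comment lines are commonly used as keep-alive pings.
            yield ("heartbeat", "")
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                yield ("data", payload)
            else:
                yield ("heartbeat", "")
        # Blank separator lines and other fields are ignored in this sketch.


def closed_without_text(
    events: List[Tuple[str, str]], is_text: Callable[[str], bool]
) -> bool:
    """True when the stream ended with no user-facing text, so state must be polled."""
    return not any(kind == "data" and is_text(payload) for kind, payload in events)


def progress_due(last_notice: float, now: float) -> bool:
    """True when at least two minutes have passed since the last progress notice."""
    return now - last_notice >= PROGRESS_INTERVAL_SECONDS
```

When `closed_without_text` returns True, the caller would fetch the session state and summarize what changed.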

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):

  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
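To show how the short field names might be used, here is a sketch that turns a draft into a text summary. The nesting is an assumption: tracks are taken to be a list under `t`, each with a `tt` type code and a list of segments under `sg`, each segment carrying `d` in milliseconds. The real draft may be shaped differently, and `summarize_draft` is a hypothetical name.

```python
# Track type codes from the field mapping above.
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft: dict) -> str:
    """Render a plain-text timeline summary from a draft dictionary."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        # Sum segment durations (milliseconds) to get the track length.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return "\n".join(lines)
```

Labels such as clip titles or volume levels would presumably come from the `m` metadata field, whose contents are not described here.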

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn my photos into a smooth MP4 video with transitions" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.
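A pre-upload check along these lines can catch problems before any bytes leave the device. This sketch assumes 200MB means 200 × 1024 × 1024 bytes and treats .jpeg as an alias for JPG; `upload_warnings` is a hypothetical helper.

```python
from pathlib import PurePath
from typing import List

# Assumption: "200MB" is interpreted as 200 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024
PREFERRED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".heic"}


def upload_warnings(filename: str, size_bytes: int) -> List[str]:
    """Return a list of human-readable warnings; empty means the file looks fine."""
    warnings = []
    if size_bytes > MAX_UPLOAD_BYTES:
        warnings.append("File exceeds 200MB; compress or trim it before uploading.")
    if PurePath(filename).suffix.lower() not in PREFERRED_EXTENSIONS:
        warnings.append("Format is outside the preferred set (JPG, PNG, WEBP, HEIC).")
    return warnings
```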

Export as MP4 for widest compatibility across all platforms and devices.

Common Workflows

Quick edit: Upload → "turn my photos into a smooth MP4 video with transitions" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
