Photo Generator

v1.0.0

Get AI-generated photos ready to post, without touching a single slider. Upload your text or images (JPG, PNG, WEBP, HEIC, up to 200MB), say something like "...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for dsewell-583h0/photo-generator.

Prompt Preview: Install & Setup
Install the skill "Photo Generator" (dsewell-583h0/photo-generator) from ClawHub.
Skill page: https://clawhub.ai/dsewell-583h0/photo-generator
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: NEMO_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install photo-generator

ClawHub CLI


npx clawhub@latest install photo-generator
Security Scan
VirusTotal
Benign
OpenClaw
Benign
medium confidence
Purpose & Capability
The skill name and description (AI photo/video generation) match the runtime instructions: calls to a nemovideo API, upload endpoints, export workflow, and a single bearer token (NEMO_TOKEN). Requesting an API token is expected for this purpose.
Instruction Scope
Instructions are focused on remote rendering and SSE-based editing flows. They additionally require reading this skill's YAML frontmatter at runtime and detecting the agent's install path (~/.clawhub/, ~/.cursor/skills/) to set an X-Skill-Platform header. Reading those local paths is not strictly needed to perform render jobs and is a scope expansion worth noting because it accesses the user's home filesystem.
Install Mechanism
This is an instruction-only skill with no install spec and no code files — lowest-risk delivery. There is nothing downloaded or written by an installer.
Credentials
The only declared credential is NEMO_TOKEN, which is proportional for calling the remote API. However, the SKILL.md frontmatter references a config path (~/.config/nemovideo/) that implies reading local config/token files, while the registry summary listed no required config paths — an inconsistency that should be resolved. The skill can also obtain an anonymous token itself via the public anonymous-token endpoint.
Persistence & Privilege
The always flag is false, and the skill does not request permanent platform-wide privileges. It does not modify other skills' configs and contains no install-time hooks.
Assessment
This skill looks like a legitimate remote photo/video generation integration that needs a single NEMO_TOKEN to call mega-api-prod.nemovideo.ai. Before installing:

  1. Confirm you trust the nemovideo domain and are comfortable uploading media to a remote service.
  2. Prefer the anonymous token flow if you don't want to supply a long-lived token.
  3. Be aware the skill may read its own frontmatter and probe common install/config paths in your home directory (~/.clawhub/, ~/.cursor/skills/, ~/.config/nemovideo/) to set attribution headers; if you are uncomfortable with that filesystem access, avoid enabling the skill.
  4. Note the small inconsistency: the SKILL.md lists a config path though the registry metadata did not; ask the publisher to clarify which local paths the skill will read and why.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🖼️ Clawdis
Env: NEMO_TOKEN (primary)
Latest: vk9701n8xz73b57wv6cwffpbsf585mq5z
34 downloads · 0 stars · 1 version
Updated 13h ago
v1.0.0
MIT-0

Getting Started

Share your text or images and I'll get started on AI photo generation. Or just tell me what you're thinking.

Try saying:

  • "generate my text or images"
  • "export 1080p MP4"
  • "generate a realistic photo of a"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
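The setup flow above can be sketched as two request builders. This is a minimal Python sketch that only constructs the requests (no network I/O is performed); the helper names anonymous_token_request and session_request are illustrative, not part of the skill:

```python
import json
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"

def anonymous_token_request():
    """Build the free-token request: POST with a fresh X-Client-Id UUID."""
    client_id = str(uuid.uuid4())
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }

def session_request(token, task_name="project"):
    """Build the session-creation request using the bearer token."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": task_name}),
    }
```

Feeding these dicts to any HTTP client reproduces the two calls; the data.token field of the first response becomes the token for the second.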

Photo Generator — Generate Photos from Text Prompts

Send me your text or images and describe the result you want. The AI photo generation runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload a description like 'sunset over mountain lake, cinematic style', type "generate a realistic photo of a product on a white background", and you'll get a 1080p MP4 back in roughly 20-40 seconds. All rendering happens server-side.

Worth noting: more detailed prompts produce more accurate results.

Matching Input to Actions

User prompts referencing photo generator, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
| --- | --- | --- |
| "export" / "导出" / "download" / "send me the video" | §3.5 Export | Yes |
| "credits" / "积分" / "balance" / "余额" | §3.3 Credits | Yes |
| "status" / "状态" / "show tracks" | §3.4 State | Yes |
| "upload" / "上传" / user sends file | §3.2 Upload | Yes |
| Everything else (generate, edit, add BGM…) | §3.1 SSE | No |
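The routing above can be approximated with plain keyword matching. A hedged Python sketch; the route helper and its return labels are illustrative, and a real classifier would also consider intent, not just substrings:

```python
# Keyword sets per section, checked in order; anything unmatched
# falls through to the SSE chat path.
ROUTES = [
    ({"export", "导出", "download", "send me the video"}, "3.5 Export"),
    ({"credits", "积分", "balance", "余额"}, "3.3 Credits"),
    ({"status", "状态", "show tracks"}, "3.4 State"),
    ({"upload", "上传"}, "3.2 Upload"),
]

def route(message, has_file=False):
    """Route a user message to a section by keyword match; default to SSE."""
    if has_file:
        return "3.2 Upload"
    text = message.lower()
    for keywords, section in ROUTES:
        if any(k in text for k in keywords):
            return section
    return "3.1 SSE"
```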

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

  1. Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
  2. Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
  3. Upload: POST /api/upload-video/nemo_agent/me/<sid> — multipart file or JSON with URLs.
  4. Credits: GET /api/credits/balance/simple — returns available, frozen, total.
  5. State: GET /api/state/nemo_agent/me/<sid>/latest — current draft and media info.
  6. Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
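The export polling step (endpoint 6) can be sketched as a generic poll loop. In this sketch, fetch_status is a hypothetical stand-in you would wire to GET /api/render/proxy/lambda/<id>; the 30s interval matches the text above, and the 15-minute cap is an assumption:

```python
import time

def poll_export(render_id, fetch_status, interval=30, timeout=900):
    """Poll a render job until it completes or the timeout expires.

    fetch_status(render_id) must return a dict such as
    {"status": "completed", "download_url": "..."}.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["download_url"]
        time.sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete in {timeout}s")
```

Injecting the fetcher keeps the loop testable without touching the network.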

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Skill attribution — read from this file's YAML frontmatter at runtime:

  • X-Skill-Source: photo-generator
  • X-Skill-Version: from frontmatter version
  • X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
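One way to derive the three attribution headers from the install path, as described above. A sketch only: the substring-based platform detection and the attribution_headers helper are illustrative, not the skill's actual code:

```python
from pathlib import Path

def attribution_headers(skill_file, version):
    """Map the skill's install path to the X-Skill-* attribution headers."""
    path = str(Path(skill_file).expanduser())
    if "/.clawhub/" in path:
        platform = "clawhub"
    elif "/.cursor/skills/" in path:
        platform = "cursor"
    else:
        platform = "unknown"
    return {
        "X-Skill-Source": "photo-generator",
        "X-Skill-Version": version,
        "X-Skill-Platform": platform,
    }
```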

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks):
  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
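Expanding the short-key draft JSON into a summary like the one above can be sketched as below. Assumptions to flag: each segment is taken to start at 0 and to carry its display name under m.name, neither of which the key glossary fully specifies:

```python
# Short keys per the glossary: t=tracks, tt=track type, sg=segments,
# d=duration (ms), m=metadata.
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}

def summarize_draft(draft):
    """Render a short-key draft as a one-line-per-track summary."""
    lines = [f"Timeline ({len(draft['t'])} tracks):"]
    for i, track in enumerate(draft["t"], 1):
        kind = TRACK_TYPES.get(track["tt"], "Unknown")
        for seg in track["sg"]:
            end_s = seg["d"] / 1000          # ms -> seconds
            name = seg.get("m", {}).get("name", "")
            lines.append(f"{i}. {kind}: {name} (0-{end_s:g}s)")
    return "\n".join(lines)
```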

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
| --- | --- |
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

SSE Event Handling

| Event | Action |
| --- | --- |
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
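The event-handling rules above can be sketched as a per-event dispatcher. The event shape ({"type": ..., "text": ..., "data": ...}) is an assumption, since the source does not pin down the SSE payload schema:

```python
def handle_event(event, present, heartbeat):
    """Dispatch one SSE event per the handling table."""
    kind = event.get("type")
    if kind == "text":
        present(event["text"])           # translate GUI wording, then show
        return "presented"
    if kind in ("tool_call", "tool_result"):
        return "internal"                # process internally, don't forward
    if kind == "heartbeat" or not event.get("data"):
        heartbeat()                      # keep waiting; periodic status ping
        return "waiting"
    return "unknown"
```

On stream close with no text seen, the caller would poll session state as described above.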

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once

Common Workflows

Quick edit: Upload → "generate a realistic photo of a product on a white background" → Download MP4. Takes 20-40 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "generate a realistic photo of a product on a white background" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.

Export as PNG to preserve quality when using generated images in other projects.
