Install
openclaw skills install vargai
Generate AI videos, images, speech, and music using varg. Use when creating videos, animations, talking characters, slideshows, product showcases, social content, or single-asset generation. Supports zero-install cloud rendering (just API key + curl) and full local rendering (bun + ffmpeg). Triggers: "create a video", "generate video", "make a slideshow", "talking head", "product video", "generate image", "text to speech", "varg", "vargai", "render video", "lip sync", "captions".
Before generating anything, determine the rendering mode.
Run bash scripts/setup.sh from the skill directory to auto-detect, or check manually:
| bun | ffmpeg | Mode |
|---|---|---|
| No | No | Cloud Render -- read cloud-render.md |
| Yes | No | Cloud Render -- read cloud-render.md |
| Yes | Yes | Local Render (recommended) -- read local-render.md |
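The table above reduces to a single rule: render locally only when both tools are present. A minimal TypeScript sketch of that rule follows. The names `Toolchain`, `chooseRenderMode`, and `referenceFor` are hypothetical helpers, not part of varg. The table has no row for ffmpeg without bun, so treating that case as cloud is an assumption.

```typescript
// Rendering modes named in the table above.
type RenderMode = "cloud" | "local";

// Hypothetical shape describing which tools were detected on the machine.
interface Toolchain {
  hasBun: boolean;
  hasFfmpeg: boolean;
}

// Local rendering needs both bun and ffmpeg; every other combination uses cloud.
// Assumption: ffmpeg without bun is not listed in the table and is treated as cloud.
function chooseRenderMode(tools: Toolchain): RenderMode {
  return tools.hasBun && tools.hasFfmpeg ? "local" : "cloud";
}

// The reference file the table says to read for each mode.
function referenceFor(mode: RenderMode): string {
  return mode === "local" ? "local-render.md" : "cloud-render.md";
}
```

For example, `referenceFor(chooseRenderMode({ hasBun: true, hasFfmpeg: false }))` points at cloud-render.md.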
VARG_API_KEY is required for all modes. Get one at https://varg.ai
Everything you know about varg is likely outdated. Always verify against this skill and its references before writing code.
- Image({...}) creates media, <Clip> composes timeline. Never write <Image prompt="..." />.
- Video({ prompt: { images: [img] } }) takes exactly one image. Multiple images cause errors.
- Use providerOptions: { varg: {...} }, never fal, when going through the gateway (both modes).
# Submit TSX code to the render service
curl -s -X POST https://render.varg.ai/api/render \
-H "Authorization: Bearer $VARG_API_KEY" \
-H "Content-Type: application/json" \
-d '{"code": "const img = Image({ model: fal.imageModel(\"nano-banana-pro\"), prompt: \"a cabin in mountains at sunset\", aspectRatio: \"16:9\" });\nexport default (<Render width={1920} height={1080}><Clip duration={3}>{img}</Clip></Render>);"}'
# Poll for result (repeat until "completed" or "failed")
curl -s https://render.varg.ai/api/render/jobs/JOB_ID \
-H "Authorization: Bearer $VARG_API_KEY"
Full details: cloud-render.md
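Hand-escaping TSX inside a JSON string, as in the curl call above, is error-prone. The sketch below builds the same request with `JSON.stringify`, so quotes and newlines are escaped automatically. The URLs, headers, the `code` field, and the "completed" / "failed" statuses come from the commands above. The helper names (`buildRenderRequest`, `jobStatusUrl`, `isFinished`) are hypothetical, and the shape of the job response is not shown in this document, so the status string is taken as a plain argument.

```typescript
// Hypothetical container for the pieces of the submit request.
interface RenderRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the POST request for the render service. JSON.stringify handles
// the quote and newline escaping that the curl example does by hand.
function buildRenderRequest(tsxCode: string, apiKey: string): RenderRequest {
  return {
    url: "https://render.varg.ai/api/render",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ code: tsxCode }),
  };
}

// URL to poll for a given job, mirroring the second curl command.
function jobStatusUrl(jobId: string): string {
  return `https://render.varg.ai/api/render/jobs/${encodeURIComponent(jobId)}`;
}

// Polling stops once the job reports "completed" or "failed".
function isFinished(status: string): boolean {
  return status === "completed" || status === "failed";
}
```

A caller would send the request with any HTTP client, then poll `jobStatusUrl(...)` with the same Authorization header until `isFinished` returns true.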
/** @jsxImportSource vargai */
import { Render, Clip, Image } from "vargai/react"
import { createVarg } from "@vargai/gateway"
const varg = createVarg({ apiKey: process.env.VARG_API_KEY! })
const img = Image({
model: varg.imageModel("nano-banana-pro"),
prompt: "a cabin in mountains at sunset",
aspectRatio: "16:9"
})
export default (
<Render width={1920} height={1080}>
<Clip duration={3}>{img}</Clip>
</Render>
)
bunx vargai render video.tsx --preview # free preview
bunx vargai render video.tsx --verbose # full render (costs credits)
Full details: local-render.md
For one-off images, videos, speech, or music without building a multi-clip template:
curl -X POST https://api.varg.ai/v1/image \
-H "Authorization: Bearer $VARG_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "nano-banana-pro", "prompt": "a sunset over mountains"}'
Full API reference: gateway-api.md
Video code has two layers: media generation (function calls) and composition (JSX).
// 1. GENERATE media via function calls
const img = Image({ model: ..., prompt: "..." })
const vid = Video({ model: ..., prompt: { text: "...", images: [img] }, duration: 5 })
const voice = Speech({ model: ..., voice: "rachel", children: "Hello!" })
// 2. COMPOSE via JSX tree
export default (
<Render width={1080} height={1920}>
<Music model={...} prompt="upbeat electronic" duration={10} volume={0.3} />
<Clip duration={5}>
{vid}
<Title position="bottom">Welcome</Title>
</Clip>
<Captions src={voice} style="tiktok" withAudio />
</Render>
)
| Component | Type | Purpose |
|---|---|---|
| `Image()` | Function call | Generate still image |
| `Video()` | Function call | Generate video (text-to-video or image-to-video) |
| `Speech()` | Function call | Text-to-speech audio |
| `<Render>` | JSX | Root container -- sets width, height, fps |
| `<Clip>` | JSX | Timeline segment -- duration, transitions |
| `<Music>` | JSX | Background audio (always set duration!) |
| `<Captions>` | JSX | Subtitle track from Speech |
| `<Title>` | JSX | Text overlay |
| `<Overlay>` | JSX | Positioned layer |
| `<Split>` / `<Grid>` | JSX | Layout helpers |
Full props: components.md
| Cloud Render | Local Render |
|---|---|
| No imports needed | import { ... } from "vargai/react" |
| `fal.imageModel("nano-banana-pro")` | `varg.imageModel("nano-banana-pro")` |
| `fal.videoModel("kling-v3")` | `varg.videoModel("kling-v3")` |
| `elevenlabs.speechModel("eleven_v3")` | `varg.speechModel("eleven_v3")` |
| Globals are auto-injected | Must call createVarg() |
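When porting a cloud snippet to a local file, the model calls change mechanically, as the table shows. The sketch below is a naive text rewrite for that one step. `toLocalModels` is a hypothetical helper, not part of varg. It does not add the imports or the `createVarg()` call that local rendering also requires.

```typescript
// Rewrite cloud-style model calls (fal.*, elevenlabs.*) into the
// gateway-style varg.* calls used for local rendering.
// This is a plain text substitution, not a parser: it also rewrites
// matches that appear inside strings or comments.
function toLocalModels(cloudCode: string): string {
  return cloudCode.replace(
    /\b(?:fal|elevenlabs)\.(imageModel|videoModel|speechModel)\(/g,
    "varg.$1("
  );
}
```

Applied to `fal.videoModel("kling-v3")`, this yields `varg.videoModel("kling-v3")`, matching the second row of the table.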
| Scenario | Use | Auth |
|---|---|---|
| New project, simplest setup | varg.*Model() (gateway) | VARG_API_KEY only |
| Existing project with fal/elevenlabs keys | fal.*Model() / elevenlabs.*Model() | Individual keys |
| Cloud render via curl/API | Gateway (only option) | VARG_API_KEY |
| Need $0 billing with own keys | Gateway + BYOK headers | VARG_API_KEY + provider keys |
| Specific provider feature not in gateway | Direct provider | Individual key |
Default recommendation: Use the gateway (varg.*Model() + VARG_API_KEY). It handles routing, caching, billing, and works with a single key.
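The scenario table can be read as a small decision function. The sketch below is one possible encoding. The type and function names are hypothetical. The table does not say which row wins when several apply, so the precedence here (BYOK first, then cloud render, then direct provider, then the default gateway) is an assumption.

```typescript
type ProviderPath = "gateway" | "gateway-byok" | "direct";

// Hypothetical flags, one per scenario row in the table above.
interface ProjectNeeds {
  cloudRender: boolean;                // rendering via curl/API
  wantsZeroBilling: boolean;           // $0 billing with own provider keys
  needsFeatureOutsideGateway: boolean; // provider feature the gateway lacks
  hasProviderKeys: boolean;            // existing project with fal/elevenlabs keys
}

interface ProviderChoice {
  path: ProviderPath;
  auth: string;
}

// Assumed precedence: BYOK, then cloud render (gateway is the only option),
// then direct provider access, then the default gateway recommendation.
function chooseProviderPath(needs: ProjectNeeds): ProviderChoice {
  if (needs.wantsZeroBilling) {
    return { path: "gateway-byok", auth: "VARG_API_KEY + provider keys" };
  }
  if (needs.cloudRender) {
    return { path: "gateway", auth: "VARG_API_KEY" };
  }
  if (needs.needsFeatureOutsideGateway || needs.hasProviderKeys) {
    return { path: "direct", auth: "individual provider keys" };
  }
  return { path: "gateway", auth: "VARG_API_KEY" };
}
```

With every flag set to false, the function returns the gateway path, which matches the default recommendation.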
--preview generates free placeholders to validate structure.
Load these on demand based on what you need:
| Need | Reference | When to load |
|---|---|---|
| Render via API | cloud-render.md | No bun/ffmpeg, or user wants cloud rendering |
| Render locally | local-render.md | bun + ffmpeg available |
| Patterns & workflows | recipes.md | Talking head, character consistency, slideshow, lipsync |
| Model selection | models.md | Choosing models, checking prices, duration constraints |
| Component props | components.md | Need detailed props for any component |
| Better prompts | prompting.md | User wants cinematic / high-quality results |
| REST API | gateway-api.md | Single-asset generation or Render API details |
| Debugging | common-errors.md | Something failed or produced unexpected results |
| Full examples | templates.md | Need complete copy-paste-ready templates |
| BYOK keys | byok.md | Using your own provider API keys for $0 billing |