Install
openclaw skills install shortapi-ai-video-generation
Use this skill as an entry point to discover, select, and fetch the integration parameters for every supported AI video generation model on the ShortAPI platform.
ShortAPI provides a unified /api/v1/job/create endpoint for video generation across multiple top-tier providers. This skill gives an overview of the available video generation models and explains how to fetch the model-specific JSON schema required to invoke each one.
API Endpoint: https://api.shortapi.ai/api/v1/job/create
Categories: text-to-video, image-to-video
Here is the list of fully supported video generation model IDs you can use:
| Model ID | Description |
|---|---|
| google/veo-3.1/text-to-video | Generate videos from text using Veo 3.1 |
| google/veo-3.1/image-to-video | Generate videos from images using Veo 3.1 |
| google/veo-3.1/extend-video | Extend existing videos using Veo 3.1 |
| google/veo-3.1/first-last-frame-to-video | Generate videos from first and last frames (Veo 3.1) |
| google/veo-3.1/reference-to-video | Generate videos from reference using Veo 3.1 |
| google/veo-3/text-to-video | Generate videos from text using Veo 3 |
| google/veo-3/image-to-video | Generate videos from images using Veo 3 |
| kwaivgi/kling-3.0/text-to-video | Generate videos from text using Kling 3.0 |
| kwaivgi/kling-3.0/image-to-video | Generate videos from images using Kling 3.0 |
| kwaivgi/kling-o1/text-to-video | Generate videos from text using Kling O1 |
| kwaivgi/kling-o1/image-to-video | Generate videos from images using Kling O1 |
| kwaivgi/kling-o1/video-to-video | Transform videos using Kling O1 |
| kwaivgi/kling-2.6/text-to-video | Generate videos from text using Kling 2.6 |
| kwaivgi/kling-2.6/image-to-video | Generate videos from images using Kling 2.6 |
| bytedance/seedance-2.0/text-to-video | Generate videos from text using Seedance 2.0 |
| vidu/vidu-q3/text-to-video | Generate videos from text using Vidu Q3 |
| vidu/vidu-q3/image-to-video | Generate videos from images using Vidu Q3 |
| vidu/vidu-q3/start-end-to-video | Generate videos from start/end frames (Vidu Q3) |
| vidu/vidu-q2/text-to-video | Generate videos from text using Vidu Q2 |
| vidu/vidu-q2/image-to-video | Generate videos from images using Vidu Q2 |
| vidu/vidu-q2/reference-to-video | Generate videos from reference using Vidu Q2 |
| vidu/vidu-q2/start-end-to-video | Generate videos from start/end frames (Vidu Q2) |
| pixverse/pixverse-5.5/text-to-video | Generate videos from text using Pixverse 5.5 |
| pixverse/pixverse-5.5/image-to-video | Generate videos from images using Pixverse 5.5 |
| pixverse/pixverse-5.5/transition | Create video transitions using Pixverse 5.5 |
| alibaba/wan-2.6/text-to-video | Generate videos from text using Wan 2.6 |
| alibaba/wan-2.6/image-to-video | Generate videos from images using Wan 2.6 |
| alibaba/wan-2.6/reference-to-video | Generate videos from reference using Wan 2.6 |
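The model IDs in the table appear to follow a provider/model/task pattern. As a small sketch under that assumption, the helpers below split an ID with POSIX parameter expansion. The function names `model_task` and `model_provider` are hypothetical and not part of ShortAPI.

```shell
# Print the task segment (everything after the last "/") of a model ID.
model_task() {
  printf '%s\n' "${1##*/}"
}

# Print the provider segment (everything before the first "/") of a model ID.
model_provider() {
  printf '%s\n' "${1%%/*}"
}

model_task "google/veo-3.1/text-to-video"      # prints: text-to-video
model_provider "google/veo-3.1/text-to-video"  # prints: google
```

This can help when choosing between, say, the text-to-video and image-to-video variants of the same model family.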
Because each video model supports different parameters (such as duration, resolution, aspect_ratio, fps, or advanced controls), you need to fetch the specific model's schema document to construct a valid API request payload.
Step 1: Fetch the model's schema document
You MUST first fetch the detailed skill document for the specific <model_id> (e.g. google/veo-3.1/text-to-video) before constructing the POST request payload. DO NOT skip this step. DO NOT hallucinate parameters: different video models use different parameter names for the same concept (e.g. one model might use duration while another uses length; one might use resolution while another uses quality).
Send a GET request to:
https://shortapi.ai/api/skill/<model_id>
(For example: GET https://shortapi.ai/api/skill/google/veo-3.1/text-to-video)
This URL will return a Markdown (.md) text document containing the exact Input Parameters Schema for that specific model, alongside code examples. You must parse it to understand which arguments go into the args object.
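The schema fetch described above might be scripted roughly as follows. This is a sketch, not an official client: the helper names `skill_url` and `fetch_skill_doc` are hypothetical, and it assumes the skill URL needs no authentication, since the text above shows none.

```shell
# Build the schema-document URL for a given model ID.
skill_url() {
  printf 'https://shortapi.ai/api/skill/%s\n' "$1"
}

# Download the Markdown schema document for model ID $1 into file $2.
# --fail makes curl exit non-zero on an HTTP error instead of saving the error page.
fetch_skill_doc() {
  curl --fail --silent --show-error --output "$2" "$(skill_url "$1")"
}

# Example (performs a network request):
#   fetch_skill_doc "google/veo-3.1/text-to-video" veo-3.1-text-to-video.md
```

Saving the document to a file keeps the exact schema text available for review while building the payload.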
Step 2: Construct the request payload
Using the schema document fetched in Step 1, construct a valid JSON payload. Include only arguments defined in that document. At a minimum, the payload generally looks like this:
{
  "model": "<model_id>",
  "args": {
    "prompt": "Your descriptive text prompt here..."
    // ...other model-specific required or optional parameters strictly parsed from Step 1
  },
  "callback_url": "YOUR_OPTIONAL_WEBHOOK_URL"
}
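When the prompt comes from user input, quotes and backslashes must be escaped before they are placed in the JSON body. The sketch below builds only the minimal model-plus-prompt payload shown above. The helpers `json_escape` and `build_payload` are hypothetical names, and the escaping assumes the prompt contains no newlines or other control characters.

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Assumption: the value has no newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the minimal payload: model ID ($1) plus a prompt ($2).
# Any further args must come from the schema document for that model.
build_payload() {
  printf '{"model": "%s", "args": {"prompt": "%s"}}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

build_payload "google/veo-3.1/text-to-video" 'A robot says "hello"'
```

The result can be passed to curl with `--data "$(build_payload ...)"` in place of a hand-written JSON literal.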
Step 3: Submit the job
Make an HTTP POST request to the API Endpoint, with the Bearer token in the Authorization header.
response=$(curl --request POST \
  --url https://api.shortapi.ai/api/v1/job/create \
  --header "Authorization: Bearer $SHORTAPI_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "google/veo-3.1/text-to-video",
    "args": {
      "prompt": "A cinematic drone shot flying over a futuristic city at sunset"
    }
  }')
JOB_ID=$(echo "$response" | grep -o '"job_id": *"[^"]*"' | sed 's/"job_id": *//; s/"//g')
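The job_id extraction above yields an empty string if the request fails, so it may be worth checking the result before polling. The sketch below wraps the same grep/sed approach in a hypothetical helper, `json_string_field`. The sample response body is invented for illustration; the real response shape is not documented here.

```shell
# Pull the value of a string field ($2) out of a JSON response body ($1).
# Uses the same grep/sed approach as the command above; takes the first match.
json_string_field() {
  printf '%s' "$1" | grep -o "\"$2\": *\"[^\"]*\"" | head -n 1 \
    | sed "s/\"$2\": *//; s/\"//g"
}

# Hypothetical response body, for illustration only.
sample='{"job_id": "abc123"}'
JOB_ID=$(json_string_field "$sample" job_id)

if [ -z "$JOB_ID" ]; then
  echo "no job_id in response" >&2
else
  echo "job started: $JOB_ID"
fi
```

A pattern match like this is fragile against nested or escaped values; a real JSON parser is the safer choice where one is available.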
Use the returned job_id to poll the query API:
curl --request GET \
  --url "https://api.shortapi.ai/api/v1/job/query?id=$JOB_ID" \
  --header "Authorization: Bearer $SHORTAPI_KEY"
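Polling usually means repeating the query until the job reports completion. The loop below is one possible sketch. It assumes the response contains a "status" string field and treats only "succeeded" as done, since that is the one status value this document mentions. The function names, attempt limit, and delay are all hypothetical defaults.

```shell
# Query a job once and print the raw response body.
fetch_job() {
  curl --silent --request GET \
    --url "https://api.shortapi.ai/api/v1/job/query?id=$1" \
    --header "Authorization: Bearer $SHORTAPI_KEY"
}

# Print the "status" string from a response body (assumed field name).
job_status() {
  printf '%s' "$1" | grep -o '"status": *"[^"]*"' | head -n 1 \
    | sed 's/"status": *//; s/"//g'
}

# Poll until the job succeeds or the attempt budget runs out.
# Arguments: job ID, max attempts (default 60), delay in seconds (default 5).
poll_job() {
  attempts=${2:-60}
  delay=${3:-5}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    body=$(fetch_job "$1")
    if [ "$(job_status "$body")" = "succeeded" ]; then
      printf '%s\n' "$body"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "job $1 did not succeed after $attempts attempts" >&2
  return 1
}

# Example (performs network requests): poll_job "$JOB_ID"
```

Because the failure statuses are not documented here, a failed job simply exhausts the attempt budget in this sketch; a production loop would stop early on whatever terminal failure status the API reports.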
Official API endpoint: https://api.shortapi.ai.
The callback_url parameter is strictly user-defined. The Agent is prohibited from auto-generating or hallucinating this value.
SHORTAPI_KEY is only used in the Authorization header sent to the official API endpoint and must never be included in any callback payload.
CRITICAL BEHAVIOR FOR AGENTS:
Check for SHORTAPI_KEY. If it is missing, you MUST proactively direct the user to https://shortapi.ai to obtain their API key. DO NOT ask for or accept any other type of API key (e.g., OpenAI, Anthropic, etc.); the only valid key is the SHORTAPI_KEY from ShortAPI.ai.
Fetch the model schema from https://shortapi.ai/api/skill/<model_id>. DO NOT skip this step. DO NOT guess or hallucinate parameters. The document returned in Step 1 is the sole source of truth for the model's input schema.
After obtaining the job_id from Step 3, you MUST immediately inform the user that the task has started and release them so they can ask other questions or start new conversations.
When the job completes (status: "succeeded"), you must proactively message the user with the final generation results (e.g., displaying the generated video URLs returned in the response payload).
For videos, use an HTML <video controls src="video_url"></video> tag to embed an inline video player. For images, use markdown image syntax ![description](image_url). For audio/music, use an HTML <audio controls src="audio_url"></audio> tag. The user should be able to see and play the generated result immediately without needing to open a separate browser tab.
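Two of the agent rules above lend themselves to small guard helpers: checking for the key before any request, and rendering a finished video inline. The following is a sketch with hypothetical function names; how the video URL is read from the response payload depends on the response format, which this document does not specify.

```shell
# Fail early with a pointer to the key page if SHORTAPI_KEY is unset or empty.
require_shortapi_key() {
  if [ -z "${SHORTAPI_KEY:-}" ]; then
    echo "SHORTAPI_KEY is not set. Get an API key at https://shortapi.ai" >&2
    return 1
  fi
}

# Print an inline video player tag for a result URL ($1).
video_tag() {
  printf '<video controls src="%s"></video>\n' "$1"
}

# Example:
#   require_shortapi_key || exit 1
#   video_tag "$VIDEO_URL"   # VIDEO_URL taken from the job query response
```

Running the key check first keeps the agent from sending an unauthenticated request.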