Install
openclaw skills install hidream-model-genGenerate images and videos using Vivago AI (智小象) platform. Supports text-to-image, image-to-image, image-to-video, and keyframe-to-video generation. Use when the user wants to create AI-generated images or videos, transform existing images, or perform image style transfer through the Vivago AI API.
openclaw skills install hidream-model-genIntegration with Vivago AI (智小象) platform for AI-powered image and video generation.
txt2img): Generate images from text descriptionsimg2img): Transform existing images based on prompts, including style transfer, image editing, and multi-image fusiontxt2vid): Generate videos from text descriptionsimg2vid): Generate videos from static imageskeyframe_to_video): Generate transition videos from start and end keyframestemplate_to_video): 181 pre-defined video effectsscripts/
├── vivago_client.py # Main API client
├── template_manager.py # Template management
├── config_loader.py # Configuration loading
├── enums.py # Type enums (TaskStatus, AspectRatio, etc.)
├── exceptions.py # Structured exceptions
└── config/ # Modular configuration files
Before using this skill, you need to obtain a Vivago.ai API Token:
Security Note: The Token is your credential for accessing the API. Please keep it secure and do not share it with others.
Security Note: For secure deployments and AI Agents, the system requires the token to be passed strictly via the HIDREAM_AUTHORIZATION environment variable.
Export it securely in your current session:
export HIDREAM_AUTHORIZATION="your_vivago_api_token"
Note:
STORAGE_AKandSTORAGE_SKare deprecated and removed. The image upload uses secure pre-signed URLs provided by the Vivago API.
Important: By default, all generated resources (JSON results, downloaded images, and videos) will be output to the assets/ directory within the current working folder. Ensure this directory exists or the system has permission to create it.
pip install -r requirements.txt
from scripts import create_client, VivagoClient
from scripts.enums import AspectRatio, PortName, TaskStatus
from scripts.exceptions import TaskFailedError, TaskTimeoutError
# Create client
client = create_client()
# Text to image
results = client.text_to_image(
prompt="a beautiful sunset over mountains",
port=PortName.KLING_IMAGE, # or PortName.NANO_BANANA
wh_ratio=AspectRatio.RATIO_16_9,
batch_size=2
)
# Image to video (using local image)
results = client.image_to_video(
prompt="camera slowly zooming out",
image_uuid=client.upload_image("/path/to/image.jpg"),
port=PortName.V3PRO,
wh_ratio=AspectRatio.RATIO_16_9,
duration=5
)
# Keyframe to video (using start and end images)
results = client.keyframe_to_video(
prompt="smooth transition from start to end",
start_image_uuid=client.upload_image("/path/to/start.jpg"),
end_image_uuid=client.upload_image("/path/to/end.jpg"),
port=PortName.V3PRO,
wh_ratio=AspectRatio.RATIO_16_9,
duration=5
)
# Video Templates - use pre-defined effects
results = client.template_to_video(
image_uuid=client.upload_image("/path/to/image.jpg"),
template="ghibli", # See available templates below
wh_ratio=AspectRatio.RATIO_9_16
)
from scripts.exceptions import (
TaskFailedError,
TaskRejectedError,
TaskTimeoutError,
InvalidPortError
)
try:
results = client.image_to_video(...)
except TaskFailedError as e:
print(f"Task failed: {e.task_id}")
except TaskRejectedError as e:
print(f"Content rejected: {e.reason}")
except TaskTimeoutError as e:
print(f"Timeout after {e.timeout_seconds}s")
except InvalidPortError as e:
print(f"Invalid port: {e.port}, available: {e.available}")
For AI Agents: The easiest way to use this skill is through the provided CLI scripts. They automatically handle API communication, polling, and result parsing. By default, they use HiDream's native models.
Text to Image:
python3 scripts/txt2img.py \
--prompt "a futuristic city" \
--wh-ratio 16:9 \
--batch-size 2 \
--output ./assets/results.json
Note: This defaults to the hidream-txt2img model.
Text to Video:
python3 scripts/txt2vid.py \
--prompt "a cybernetic dragon flying over a futuristic city" \
--wh-ratio 16:9 \
--duration 5 \
--output ./assets/video_results.json
Note: This defaults to the v3Pro model.
Image to Video:
python3 scripts/img2video.py \
--prompt "slow motion falling leaves" \
--image ./assets/source_image.jpg \
--duration 5 \
--output ./assets/video.json
from scripts.enums import (
TaskStatus, # PENDING, COMPLETED, PROCESSING, FAILED, REJECTED
AspectRatio, # RATIO_1_1, RATIO_4_3, RATIO_16_9, etc.
PortCategory, # TEXT_TO_IMAGE, IMAGE_TO_VIDEO, etc.
PortName # KLING_IMAGE, V3PRO, NANO_BANANA, etc.
)
| Feature | Available Versions | Default |
|---|---|---|
| Text to Image | v3L (HiDream), kling-image-o1 | v3L (via port hidream-txt2img) |
| Image to Video | v3Pro, v3L, kling-video-o1 | v3Pro |
| Keyframe to Video | v3Pro, v3L | v3Pro |
Note for AI Agents: By default, all CLI tools (
txt2img.py,txt2vid.py) are pre-configured to use HiDream's native models (hidream-txt2imgfor images,v3Profor videos). You don't need to specify the model unless explicitly requested by the user.
1:1 - Square4:3 - Standard3:4 - Portrait16:9 - Widescreen9:16 - Mobile/Verticalfrom scripts.enums import TaskStatus
TaskStatus.PENDING # 0 - Pending
TaskStatus.COMPLETED # 1 - Completed
TaskStatus.PROCESSING # 2 - Processing
TaskStatus.FAILED # 3 - Failed
TaskStatus.REJECTED # 4 - Rejected (content review)
vivago-ai-skill/
├── scripts/
│ ├── __init__.py # Package exports
│ ├── vivago_client.py # Core API client
│ ├── template_manager.py # Template management
│ ├── config_loader.py # Configuration loader
│ ├── enums.py # Type enums
│ ├── exceptions.py # Exception classes
│ ├── logging_config.py # Logging configuration
│ └── config/ # Modular config files
│ ├── base.json
│ ├── text_to_image.json
│ ├── image_to_video.json
│ └── ...
├── tests/
│ ├── conftest.py # Pytest configuration
│ ├── archive/ # Archived tests
│ └── ...
├── docs/ # Documentation
├── .github/workflows/ # CI configuration
├── requirements.txt
├── README.md
└── SKILL.md # This file
When sending generated content through Feishu (飞书) channel:
| Content Type | Send Method | Example |
|---|---|---|
| Images | ✅ Direct file upload | Attach image file directly |
| Videos | ❌ Must send as link | https://media.vivago.ai/{video_uuid} |
⚠️ Critical: Videos CANNOT be sent as file attachments in Feishu. Always construct and send the direct media URL:
https://media.vivago.ai/b1268f08-ac32-4b83-863f-a419797d768e.mp4
Why: Feishu does not support playable video attachments. Sending video files directly will result in delivery failure or unplayable content.
Images can be downloaded using the correct URL format:
https://storage.vivago.ai/image/{image_name}.jpg
Example:
from scripts import create_client
import requests
client = create_client()
# Generate image
results = client.text_to_image(prompt="a cute cat")
image_name = results[0].get('image', '')
# Download image
image_url = f"https://storage.vivago.ai/image/{image_name}.jpg"
response = requests.get(image_url)
with open("output.jpg", "wb") as f:
f.write(response.content)
Sending via Feishu:
# Download and send through Feishu
image_data = requests.get(image_url).content
# Then send image_data as file attachment via Feishu API
The client handles common errors:
VivagoError (base)
├── VivagoAPIError
├── MissingCredentialError
├── InvalidPortError
├── ImageUploadError
├── TemplateNotFoundError
└── TaskError
├── TaskFailedError
├── TaskRejectedError
└── TaskTimeoutError
The following 181 video templates are available via template_to_video():
| Category | Count | Example Templates |
|---|---|---|
| Style Transfer | 20+ | ghibli, 1930s-2000s vintage styles |
| Harry Potter | 4 | magic_reveal_ravenclaw, gryffindor, hufflepuff, slytherin |
| Wings/Fantasy | 10+ | angel_wings, phoenix_wings, crystal_wings, fire_wings |
| Superheroes | 5+ | iron_man, cat_woman, ghost_rider |
| Dance | 10+ | apt, dadada, dance, limbo_dance |
| Effects | 15+ | ash_out, metallic_liquid, flash_flood |
| Thanksgiving | 10+ | turkey_chasing, autumn_feast, gratitude_photo |
| Comics/Cartoon | 8+ | gta_star, anime_figure, bring_comics_to_life |
| Products | 8+ | glasses_display, music_box, food_product_display |
| Scenes | 20+ | romantic_kiss, graduation, starship_chef |
| Template ID | Description |
|---|---|
ghibli / ghibli2 | Studio Ghibli animation style |
magic_reveal_ravenclaw | Harry Potter Ravenclaw transformation |
magic_reveal_gryffindor | Harry Potter Gryffindor transformation |
magic_reveal_hufflepuff | Harry Potter Hufflepuff transformation |
magic_reveal_slytherin | Harry Potter Slytherin transformation |
iron_man | Iron Man armor assembly |
angel_wings / phoenix_wings / crystal_wings / fire_wings | Wing transformations |
cat_woman | Cat Woman style |
ghost_rider | Ghost Rider flaming skull |
joker | Joker villain style |
mermaid | Mermaid underwater scene |
snow_white | Snow White princess |
barbie | Barbie princess transformation |
me_in_hand | Miniature figure in hand |
music_box | Rotating figure on music box |
anime_figure | Transform into anime figure |
gta_star | GTA game style transformation |
apt / dadada / dance | Dance templates |
ash_out | Disintegrate into ashes |
eye_of_the_storm | Thunder god awakening |
metallic_liquid | Metal mask transformation |
flash_flood | Water/flood effect |
turkey_chasing / turkey_away / turkey_giant | Thanksgiving turkey scenes |
autumn_feast / autumn_stroll | Autumn scenes |
renovation_of_old_photos | Colorize B&W photos |
graduation | Graduation ceremony |
glasses / glasses_display | Glasses/eyewear showcase |
bikini / sexy_man / sexy_pants | Fashion/beach |
romantic_kiss / boyfriends_rose / girlfriends_rose | Romantic scenes |
ai_archaeologist / starship_chef / cyber_cooker | Sci-fi characters |
jungle_reign / panther_queen / roar_of_the_dustlands / tiger_snuggle | Animal companions |
instant_sadness / headphone_vibe / relax | Emotion/reaction |
frost_alert | Cold/freeze effect |
bald_me | Bald transformation |
boom_hair / curl_pop / long_hair | Hair transformations |
muscles | Muscle transformation |
face_punch / gun_point | Action effects |
static_shot / tracking_shot / orbit_shot / push_in / zoom_out / handheld_shot | Camera movements |
earth_zoom_in / earth_zoom_out | Earth zoom effects |
from scripts.template_manager import get_template_manager
manager = get_template_manager()
templates = manager.list_templates()
print(f"Total templates: {len(templates)}")
for tid, name in sorted(templates.items()):
print(f" {tid}: {name}")
from scripts import create_client
client = create_client()
# Upload image
image_uuid = client.upload_image("/path/to/photo.jpg")
# Apply Ghibli style template
results = client.template_to_video(
image_uuid=image_uuid,
template="ghibli",
wh_ratio="9:16"
)
# Harry Potter transformation
results = client.template_to_video(
image_uuid=image_uuid,
template="magic_reveal_ravenclaw",
wh_ratio="9:16"
)