hidream-model-gen

v1.0.5

Generate images and videos using Vivago AI (智小象) platform. Supports text-to-image, image-to-image, image-to-video, and keyframe-to-video generation. Use when...

by harry zhu (@zhy2015)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for zhy2015/hidream-model-gen.

Install the skill "hidream-model-gen" (zhy2015/hidream-model-gen) from ClawHub.
Skill page: https://clawhub.ai/zhy2015/hidream-model-gen
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: HIDREAM_AUTHORIZATION
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install hidream-model-gen

ClawHub CLI


npx clawhub@latest install hidream-model-gen
Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
Name/description match the code and manifest: the code implements text->image, img->video, keyframe->video, template handling, and uploads to vivago.ai endpoints. The single required env var (HIDREAM_AUTHORIZATION) is exactly the API bearer token the client uses.
Instruction Scope
SKILL.md and the CLI scripts instruct the agent to call the packaged Python scripts, upload local images to Vivago (via pre-signed URLs), poll for results, and save outputs to an assets/ directory. The runtime instructions and code operate within that scope and do not request or read unrelated system secrets or remote endpoints outside vivago.ai and its storage domains.
Install Mechanism
No registry install spec is declared (skill is instruction-only in registry), but a requirements.txt and full Python source are included in the package. Installation is the normal pip-based flow (pip install -r requirements.txt). No downloads from arbitrary URLs or extract/install steps were found.
Credentials
The skill requires one environment variable (HIDREAM_AUTHORIZATION) which is used as a Bearer token. Some scripts reference HIDREAM_TOKEN as a fallback and mention deprecated STORAGE_AK/STORAGE_SK — these are not required for normal operation but appear as backward-compatible fallbacks. No unrelated credentials (e.g., AWS keys, GitHub tokens) are requested.
Persistence & Privilege
The skill does not request always: true and does not modify other skills or system-wide agent settings. It writes generated assets to a local assets/ directory (and /tmp for intermediate files) which is expected behavior for a generator tool.
Assessment
This skill appears to do what it says: it uploads images and requests generation from Vivago (vivago.ai) using the HIDREAM_AUTHORIZATION bearer token. Before installing or running it:

  1. Only provide a Vivago API token you control and understand (keep it secret).
  2. Be aware that any images you pass will be uploaded to Vivago's servers; avoid sending sensitive or private images.
  3. Verify you trust the Vivago service and check its terms/privacy (retention & reuse of images).
  4. Because the package includes executable Python code, inspect the code if you run it in sensitive environments; consider running it in an isolated environment (container/VM) and review network egress policies.
  5. Note minor inconsistencies: the registry lists no install spec even though requirements.txt and source files are included, and some scripts reference an alternate env var (HIDREAM_TOKEN) and deprecated STORAGE_AK/STORAGE_SK. You can ignore those or set HIDREAM_AUTHORIZATION as instructed.
  6. If you need higher assurance, ask the publisher for provenance (homepage or repository); the skill's source/homepage is not provided in the metadata.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

Env: HIDREAM_AUTHORIZATION
Latest: vk97e3652ghmycgv8wxqseemek5836mk1
Downloads: 298
Stars: 0
Versions: 6
Updated: 1mo ago
Version: v1.0.5
License: MIT-0

Vivago AI Skill

Integration with Vivago AI (智小象) platform for AI-powered image and video generation.

Supported Features

Image Generation

  • Text to Image (txt2img): Generate images from text descriptions
  • Image to Image (img2img): Transform existing images based on prompts, including style transfer, image editing, and multi-image fusion

Video Generation

  • Text to Video (txt2vid): Generate videos from text descriptions
  • Image to Video (img2vid): Generate videos from static images
  • Keyframe to Video (keyframe_to_video): Generate transition videos from start and end keyframes
  • Video Templates (template_to_video): 181 pre-defined video effects
  • Supports multiple model versions (v3Pro, v3L, kling-video-o1)

Additional Features

  • Image upload to Vivago storage
  • Batch generation (up to 4 images)
  • Multiple aspect ratios (1:1, 4:3, 3:4, 16:9, 9:16)
  • Automatic retry with polling

Architecture

Core Modules

scripts/
├── vivago_client.py       # Main API client
├── template_manager.py    # Template management
├── config_loader.py       # Configuration loading
├── enums.py              # Type enums (TaskStatus, AspectRatio, etc.)
├── exceptions.py         # Structured exceptions
└── config/               # Modular configuration files

Code Quality

  • Type Safety: Complete type annotations and enums
  • Exception Handling: Structured exception hierarchy
  • CI/CD: GitHub Actions for automated testing
  • Modular Config: Split configuration files for maintainability

Setup

Prerequisites

Before using this skill, you need to obtain a Vivago.ai API Token:

Step 1: Log in to Vivago.ai

  1. Visit https://vivago.ai/ and log in to your account
  2. Check your remaining credits and consider subscribing to a suitable plan if needed

Step 2: Obtain Your Token

  1. After logging in, visit https://vivago.ai/prod-api/user/token
  2. The page will return your API Token (in JWT format)
  3. Copy this Token for configuration

Security Note: The Token is your credential for accessing the API. Please keep it secure and do not share it with others.
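
Since the token is described as being in JWT format, it can be inspected locally before use. The sketch below decodes the payload segment without verifying the signature and checks the standard `exp` claim, if one is present. Whether Vivago tokens actually carry an `exp` claim is an assumption; the helper names `jwt_claims` and `is_expired` are hypothetical and not part of this skill.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def is_expired(token: str, now: Optional[float] = None) -> bool:
    """Return True when the token has an `exp` claim that lies in the past."""
    exp = jwt_claims(token).get("exp")
    if exp is None:
        return False  # no expiry claim present, so there is nothing to check
    current = time.time() if now is None else now
    return current >= exp
```

This only reads the token; it never sends it anywhere, so it is safe to run before exporting the variable.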

Environment Variables

Security Note: For secure deployments and AI agents, the token must be passed only through the HIDREAM_AUTHORIZATION environment variable.

Export it in your current shell session:

export HIDREAM_AUTHORIZATION="your_vivago_api_token"

Note: STORAGE_AK and STORAGE_SK are deprecated and removed. The image upload uses secure pre-signed URLs provided by the Vivago API.
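
A script that wraps this skill can fail fast when the variable is missing. The following is a minimal sketch, not the skill's own loader: `load_token` and `auth_headers` are hypothetical helpers, and the `Bearer <token>` header shape is an assumption based on the token being described as a bearer token.

```python
import os
from typing import Mapping, Optional


def load_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the API token from HIDREAM_AUTHORIZATION or raise a clear error."""
    env = os.environ if env is None else env
    token = env.get("HIDREAM_AUTHORIZATION", "").strip()
    if not token:
        raise RuntimeError(
            "HIDREAM_AUTHORIZATION is not set; export it before running"
        )
    return token


def auth_headers(token: str) -> dict:
    # Assumption: the API expects the standard "Bearer <token>" scheme.
    return {"Authorization": f"Bearer {token}"}
```

Passing a mapping instead of reading `os.environ` directly keeps the helper easy to test without touching the real environment.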

File Output Configuration

Important: By default, all generated resources (JSON results, downloaded images, and videos) will be output to the assets/ directory within the current working folder. Ensure this directory exists or the system has permission to create it.
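
To check the output directory up front, something like the sketch below can be run before the first generation call. It assumes only the `assets/` name stated above; `ensure_output_dir` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path


def ensure_output_dir(base: Path, name: str = "assets") -> Path:
    """Create base/name if needed and confirm that it is writable."""
    out = base / name
    out.mkdir(parents=True, exist_ok=True)
    # Write and remove a probe file so permission problems surface early.
    probe = out / ".write_test"
    probe.write_text("ok")
    probe.unlink()
    return out
```

Calling `ensure_output_dir(Path.cwd())` from the folder where the scripts run would prepare `./assets` ahead of time.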

Installation

pip install -r requirements.txt

Usage

Python API

from scripts import create_client, VivagoClient
from scripts.enums import AspectRatio, PortName, TaskStatus
from scripts.exceptions import TaskFailedError, TaskTimeoutError

# Create client
client = create_client()

# Text to image
results = client.text_to_image(
    prompt="a beautiful sunset over mountains",
    port=PortName.KLING_IMAGE,  # or PortName.NANO_BANANA
    wh_ratio=AspectRatio.RATIO_16_9,
    batch_size=2
)

# Image to video (using local image)
results = client.image_to_video(
    prompt="camera slowly zooming out",
    image_uuid=client.upload_image("/path/to/image.jpg"),
    port=PortName.V3PRO,
    wh_ratio=AspectRatio.RATIO_16_9,
    duration=5
)

# Keyframe to video (using start and end images)
results = client.keyframe_to_video(
    prompt="smooth transition from start to end",
    start_image_uuid=client.upload_image("/path/to/start.jpg"),
    end_image_uuid=client.upload_image("/path/to/end.jpg"),
    port=PortName.V3PRO,
    wh_ratio=AspectRatio.RATIO_16_9,
    duration=5
)

# Video Templates - use pre-defined effects
results = client.template_to_video(
    image_uuid=client.upload_image("/path/to/image.jpg"),
    template="ghibli",  # See available templates below
    wh_ratio=AspectRatio.RATIO_9_16
)

Error Handling

from scripts.exceptions import (
    TaskFailedError,
    TaskRejectedError,
    TaskTimeoutError,
    InvalidPortError
)

try:
    results = client.image_to_video(...)
except TaskFailedError as e:
    print(f"Task failed: {e.task_id}")
except TaskRejectedError as e:
    print(f"Content rejected: {e.reason}")
except TaskTimeoutError as e:
    print(f"Timeout after {e.timeout_seconds}s")
except InvalidPortError as e:
    print(f"Invalid port: {e.port}, available: {e.available}")

Command Line (Best for AI Agents)

For AI Agents: The easiest way to use this skill is through the provided CLI scripts. They automatically handle API communication, polling, and result parsing. By default, they use HiDream's native models.

Text to Image:

python3 scripts/txt2img.py \
  --prompt "a futuristic city" \
  --wh-ratio 16:9 \
  --batch-size 2 \
  --output ./assets/results.json

Note: This defaults to the hidream-txt2img model.

Text to Video:

python3 scripts/txt2vid.py \
  --prompt "a cybernetic dragon flying over a futuristic city" \
  --wh-ratio 16:9 \
  --duration 5 \
  --output ./assets/video_results.json

Note: This defaults to the v3Pro model.

Image to Video:

python3 scripts/img2video.py \
  --prompt "slow motion falling leaves" \
  --image ./assets/source_image.jpg \
  --duration 5 \
  --output ./assets/video.json
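
When an agent drives these scripts from Python, it can help to build the argument list programmatically and validate it first. The sketch below uses only the flags shown in the txt2img example and the limits documented in this file (batch size 1-4, five aspect ratios); `txt2img_command` is a hypothetical helper.

```python
import shlex
from typing import List

SUPPORTED_RATIOS = ("1:1", "4:3", "3:4", "16:9", "9:16")


def txt2img_command(prompt: str, wh_ratio: str = "16:9", batch_size: int = 1,
                    output: str = "./assets/results.json") -> List[str]:
    """Build the argv list for scripts/txt2img.py after basic validation."""
    if wh_ratio not in SUPPORTED_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {wh_ratio!r}")
    if not 1 <= batch_size <= 4:
        raise ValueError("batch_size must be between 1 and 4")
    return [
        "python3", "scripts/txt2img.py",
        "--prompt", prompt,
        "--wh-ratio", wh_ratio,
        "--batch-size", str(batch_size),
        "--output", output,
    ]


cmd = txt2img_command("a futuristic city", batch_size=2)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
# To execute it: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with prompts that contain spaces or quotes.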

API Reference

Enums

from scripts.enums import (
    TaskStatus,      # PENDING, COMPLETED, PROCESSING, FAILED, REJECTED
    AspectRatio,     # RATIO_1_1, RATIO_4_3, RATIO_16_9, etc.
    PortCategory,    # TEXT_TO_IMAGE, IMAGE_TO_VIDEO, etc.
    PortName         # KLING_IMAGE, V3PRO, NANO_BANANA, etc.
)

Models

Feature | Available Versions | Default
Text to Image | v3L (HiDream), kling-image-o1 | v3L (via port hidream-txt2img)
Image to Video | v3Pro, v3L, kling-video-o1 | v3Pro
Keyframe to Video | v3Pro, v3L | v3Pro

Note for AI Agents: By default, all CLI tools (txt2img.py, txt2vid.py) are pre-configured to use HiDream's native models (hidream-txt2img for images, v3Pro for videos). You don't need to specify the model unless explicitly requested by the user.

Aspect Ratios

  • 1:1 - Square
  • 4:3 - Standard
  • 3:4 - Portrait
  • 16:9 - Widescreen
  • 9:16 - Mobile/Vertical
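
For image-to-video calls it is often useful to pick the supported ratio nearest to the source image. The sketch below compares ratios on a log scale so that landscape and portrait are treated symmetrically; `closest_ratio` is a hypothetical helper and the skill may choose ratios differently.

```python
import math

# The five ratios listed above, as width / height.
SUPPORTED_RATIOS = {
    "1:1": 1 / 1,
    "4:3": 4 / 3,
    "3:4": 3 / 4,
    "16:9": 16 / 9,
    "9:16": 9 / 16,
}


def closest_ratio(width: int, height: int) -> str:
    """Return the supported ratio label nearest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    return min(
        SUPPORTED_RATIOS,
        key=lambda label: abs(math.log(SUPPORTED_RATIOS[label]) - target),
    )
```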

Task Status Codes

from scripts.enums import TaskStatus

TaskStatus.PENDING     # 0 - Pending
TaskStatus.COMPLETED   # 1 - Completed
TaskStatus.PROCESSING  # 2 - Processing
TaskStatus.FAILED      # 3 - Failed
TaskStatus.REJECTED    # 4 - Rejected (content review)
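
The status codes above imply a simple polling loop: keep checking until the task reaches COMPLETED, FAILED, or REJECTED. The sketch below is a stand-alone illustration, not the client's implementation. The local `TaskStatus` mirrors the documented codes, and `wait_for_task`, its `fetch_status` callable, and the interval and timeout defaults are all assumptions.

```python
import time
from enum import IntEnum
from typing import Callable


class TaskStatus(IntEnum):
    """Local mirror of the documented status codes."""
    PENDING = 0
    COMPLETED = 1
    PROCESSING = 2
    FAILED = 3
    REJECTED = 4


TERMINAL = {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.REJECTED}


def wait_for_task(fetch_status: Callable[[], int], interval: float = 2.0,
                  timeout: float = 300.0,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> TaskStatus:
    """Poll fetch_status() until a terminal status or until timeout seconds pass."""
    deadline = clock() + timeout
    while True:
        status = TaskStatus(fetch_status())
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task still {status.name} after {timeout}s")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.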

File Structure

vivago-ai-skill/
├── scripts/
│   ├── __init__.py         # Package exports
│   ├── vivago_client.py    # Core API client
│   ├── template_manager.py # Template management
│   ├── config_loader.py    # Configuration loader
│   ├── enums.py            # Type enums
│   ├── exceptions.py       # Exception classes
│   ├── logging_config.py   # Logging configuration
│   └── config/             # Modular config files
│       ├── base.json
│       ├── text_to_image.json
│       ├── image_to_video.json
│       └── ...
├── tests/
│   ├── conftest.py         # Pytest configuration
│   ├── archive/            # Archived tests
│   └── ...
├── docs/                   # Documentation
├── .github/workflows/      # CI configuration
├── requirements.txt
├── README.md
└── SKILL.md               # This file

Important Notes

Feishu Channel Messaging Guidelines

When sending generated content through Feishu (飞书) channel:

Content Type | Send Method | Example
Images | ✅ Direct file upload | Attach image file directly
Videos | Must send as link | https://media.vivago.ai/{video_uuid}

⚠️ Critical: Videos CANNOT be sent as file attachments in Feishu. Always construct and send the direct media URL:

https://media.vivago.ai/b1268f08-ac32-4b83-863f-a419797d768e.mp4

Why: Feishu does not support playable video attachments. Sending video files directly will result in delivery failure or unplayable content.
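
Constructing the link can be wrapped in a small helper. The sketch below assumes the identifier is a UUID, either bare or already ending in `.mp4`, and that the final URL always carries the `.mp4` suffix shown in the example above; `feishu_video_link` is a hypothetical name.

```python
import uuid

MEDIA_BASE = "https://media.vivago.ai"


def feishu_video_link(video_id: str) -> str:
    """Build the direct media URL for a generated video."""
    name = video_id.strip()
    if name.lower().endswith(".mp4"):
        name = name[:-4]
    # Reject anything that is not a UUID so malformed links are never sent.
    uuid.UUID(name)
    return f"{MEDIA_BASE}/{name}.mp4"
```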

Image Download

Images can be downloaded using the following URL format:

https://storage.vivago.ai/image/{image_name}.jpg

Example:

from scripts import create_client
import requests

client = create_client()

# Generate image
results = client.text_to_image(prompt="a cute cat")
image_name = results[0].get('image', '')

# Download image; fail on HTTP errors instead of saving an error page
image_url = f"https://storage.vivago.ai/image/{image_name}.jpg"
response = requests.get(image_url, timeout=60)
response.raise_for_status()
with open("output.jpg", "wb") as f:
    f.write(response.content)

Sending via Feishu:

# Download and send through Feishu
image_data = requests.get(image_url).content
# Then send image_data as file attachment via Feishu API

Asynchronous Processing

  • API calls are asynchronous with automatic polling
  • Images are automatically resized to max 1024px on longest side before upload
  • Video generation supports 5 or 10 second durations
  • Batch size for images: 1-4, for videos: 1
  • All API calls include automatic retry logic
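
The 1024px limit on the longest side can be previewed before upload. The sketch below computes the target dimensions only and does not touch image data; `fit_longest_side` is a hypothetical helper, and the client's exact rounding may differ.

```python
from typing import Tuple


def fit_longest_side(width: int, height: int, max_side: int = 1024) -> Tuple[int, int]:
    """Scale (width, height) down so the longest side is at most max_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, leave unchanged
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Images that already fit are returned unchanged, so the aspect ratio is preserved in both cases.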

Error Handling

The client handles common errors:

  • Network timeouts (with retry)
  • Rate limiting (with exponential backoff)
  • Invalid parameters (validation before API call)
  • Task failures (structured exceptions)
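
Exponential backoff, as mentioned for rate limiting, grows the wait between retries geometrically up to a cap. The sketch below shows the general technique; the base, factor, cap, and jitter values are assumptions, since the client's actual settings are not documented here.

```python
import random
from typing import Callable, Iterator


def backoff_delays(attempts: int, base: float = 1.0, factor: float = 2.0,
                   cap: float = 30.0, jitter: float = 0.0,
                   rng: Callable[[], float] = random.random) -> Iterator[float]:
    """Yield one delay per retry: base * factor**n, capped, plus optional jitter."""
    for n in range(attempts):
        delay = min(cap, base * factor ** n)
        # jitter is a fraction of the delay, scaled by a random value in [0, 1)
        yield delay + jitter * delay * rng()
```

With the defaults, six attempts wait 1, 2, 4, 8, 16, and then 30 seconds; a non-zero `jitter` spreads out retries from concurrent callers.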

Exception Hierarchy

VivagoError (base)
├── VivagoAPIError
├── MissingCredentialError
├── InvalidPortError
├── ImageUploadError
├── TemplateNotFoundError
└── TaskError
    ├── TaskFailedError
    ├── TaskRejectedError
    └── TaskTimeoutError
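
The hierarchy lets callers catch errors at whatever level suits them. The classes below are local stand-ins that mirror only the names in the tree above; the real classes in scripts/exceptions.py also carry attributes such as `task_id` and `reason`, and `classify` is a hypothetical helper.

```python
class VivagoError(Exception):
    """Base class for all errors in the tree above."""

class VivagoAPIError(VivagoError): pass
class MissingCredentialError(VivagoError): pass
class InvalidPortError(VivagoError): pass
class ImageUploadError(VivagoError): pass
class TemplateNotFoundError(VivagoError): pass

class TaskError(VivagoError):
    """Parent of the three task-level outcomes."""

class TaskFailedError(TaskError): pass
class TaskRejectedError(TaskError): pass
class TaskTimeoutError(TaskError): pass


def classify(exc: Exception) -> str:
    """Group an exception by the broadest matching level of the hierarchy."""
    if isinstance(exc, TaskError):
        return "task"      # failed, rejected, or timed out
    if isinstance(exc, VivagoError):
        return "client"    # credentials, ports, uploads, templates, API
    return "other"
```

Catching `TaskError` handles all three task outcomes at once, while `VivagoError` acts as a catch-all for anything the client raises.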

Video Templates Reference

template_to_video() supports 181 video templates. The tables below list a representative subset; see View All Templates for the full list.

Quick Categories

Category | Count | Example Templates
Style Transfer | 20+ | ghibli, 1930s-2000s vintage styles
Harry Potter | 4 | magic_reveal_ravenclaw, gryffindor, hufflepuff, slytherin
Wings/Fantasy | 10+ | angel_wings, phoenix_wings, crystal_wings, fire_wings
Superheroes | 5+ | iron_man, cat_woman, ghost_rider
Dance | 10+ | apt, dadada, dance, limbo_dance
Effects | 15+ | ash_out, metallic_liquid, flash_flood
Thanksgiving | 10+ | turkey_chasing, autumn_feast, gratitude_photo
Comics/Cartoon | 8+ | gta_star, anime_figure, bring_comics_to_life
Products | 8+ | glasses_display, music_box, food_product_display
Scenes | 20+ | romantic_kiss, graduation, starship_chef

Popular Templates

Template ID | Description
ghibli / ghibli2 | Studio Ghibli animation style
magic_reveal_ravenclaw | Harry Potter Ravenclaw transformation
magic_reveal_gryffindor | Harry Potter Gryffindor transformation
magic_reveal_hufflepuff | Harry Potter Hufflepuff transformation
magic_reveal_slytherin | Harry Potter Slytherin transformation
iron_man | Iron Man armor assembly
angel_wings / phoenix_wings / crystal_wings / fire_wings | Wing transformations
cat_woman | Cat Woman style
ghost_rider | Ghost Rider flaming skull
joker | Joker villain style
mermaid | Mermaid underwater scene
snow_white | Snow White princess
barbie | Barbie princess transformation
me_in_hand | Miniature figure in hand
music_box | Rotating figure on music box
anime_figure | Transform into anime figure
gta_star | GTA game style transformation
apt / dadada / dance | Dance templates
ash_out | Disintegrate into ashes
eye_of_the_storm | Thunder god awakening
metallic_liquid | Metal mask transformation
flash_flood | Water/flood effect
turkey_chasing / turkey_away / turkey_giant | Thanksgiving turkey scenes
autumn_feast / autumn_stroll | Autumn scenes
renovation_of_old_photos | Colorize B&W photos
graduation | Graduation ceremony
glasses / glasses_display | Glasses/eyewear showcase
bikini / sexy_man / sexy_pants | Fashion/beach
romantic_kiss / boyfriends_rose / girlfriends_rose | Romantic scenes
ai_archaeologist / starship_chef / cyber_cooker | Sci-fi characters
jungle_reign / panther_queen / roar_of_the_dustlands / tiger_snuggle | Animal companions
instant_sadness / headphone_vibe / relax | Emotion/reaction
frost_alert | Cold/freeze effect
bald_me | Bald transformation
boom_hair / curl_pop / long_hair | Hair transformations
muscles | Muscle transformation
face_punch / gun_point | Action effects
static_shot / tracking_shot / orbit_shot / push_in / zoom_out / handheld_shot | Camera movements
earth_zoom_in / earth_zoom_out | Earth zoom effects

View All Templates

from scripts.template_manager import get_template_manager

manager = get_template_manager()
templates = manager.list_templates()

print(f"Total templates: {len(templates)}")
for tid, name in sorted(templates.items()):
    print(f"  {tid}: {name}")
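
With 181 entries, filtering the listing by keyword is usually more practical than reading it in full. The sketch below works on any ID-to-name mapping like the one iterated above; `search_templates` is a hypothetical helper, and `SAMPLE` is a small stand-in built from IDs in this document rather than real output of `list_templates()`.

```python
from typing import Dict


def search_templates(templates: Dict[str, str], keyword: str) -> Dict[str, str]:
    """Return templates whose ID or name contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return {
        tid: name
        for tid, name in sorted(templates.items())
        if needle in tid.lower() or needle in name.lower()
    }


# Stand-in data for illustration; replace with manager.list_templates().
SAMPLE = {
    "ghibli": "Studio Ghibli animation style",
    "phoenix_wings": "Wing transformations",
    "angel_wings": "Wing transformations",
    "iron_man": "Iron Man armor assembly",
}

for tid, name in search_templates(SAMPLE, "wings").items():
    print(f"  {tid}: {name}")
```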

Usage Example

from scripts import create_client

client = create_client()

# Upload image
image_uuid = client.upload_image("/path/to/photo.jpg")

# Apply Ghibli style template
results = client.template_to_video(
    image_uuid=image_uuid,
    template="ghibli",
    wh_ratio="9:16"
)

# Harry Potter transformation
results = client.template_to_video(
    image_uuid=image_uuid,
    template="magic_reveal_ravenclaw",
    wh_ratio="9:16"
)

Changelog

v0.9.0 (2026-03-09)

  • ✅ Code review complete (P0-P3)
  • ✅ Added GitHub Actions CI
  • ✅ Added type safety module (enums.py)
  • ✅ Added structured exceptions (exceptions.py)
  • ✅ Split configuration into modular files
  • ✅ Archived redundant code and tests
  • ✅ Pinned dependency versions

v0.8.2 (2026-03-08)

  • ✅ Template testing: 44 templates, 40 passed (90.9%)
  • ✅ Fixed metallic_liquid naming issue
  • ✅ Marked long_hair as deprecated

v0.8.0 (2026-03-07)

  • ✅ Completed Tier 1-4 testing
  • ✅ Established smart test optimization system
