Aoment Image Video

Aoment AI provides limited free access to models such as Nano Banana Pro, Nano Banana 2, GPT Image 2, and Gemini 3.1 Pro for AI image generation, video generation, image recognition, and HD repair services. The skill supports text-to-image, image-to-image, image recognition, video generation, and image HD repair, and automatic Agent API key registration.

duplicate of @regalzzz/aoment-visuals (1.0.5)

Audits: Pass

Install

openclaw skills install aoment-image-video

aoment-image-video

AI generation service powered by Aoment AI. This skill exposes dedicated APIs and scripts for:

  • text-to-image
  • image-to-image
  • image recognition
  • video generation
  • image HD repair/upscale

Download the latest package: https://www.aoment.com/downloads/aoment-image-video-skill.zip

Current Version

version: 1.3.0

Auto-Update Policy

Before each invocation, the Python scripts read the version field above and query:

GET https://www.aoment.com/api/skills/aoment-image-video/version

If the local version is behind the remote version, the script exits with:

{
  "success": false,
  "error": "update_required",
  "current_version": "1.0.0",
  "latest_version": "1.3.0",
  "message": "Skill version is outdated..."
}

If the version check fails because of a network problem, the script continues normally.
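
As a rough sketch, the comparison logic can be written in Python like this (the endpoint is the one above; the remote response shape, a JSON object with a `version` field, is an assumption, as are the helper names):

```python
import json
import urllib.request

LOCAL_VERSION = "1.3.0"
VERSION_URL = "https://www.aoment.com/api/skills/aoment-image-video/version"

def parse_version(v):
    # "1.3.0" -> (1, 3, 0), so versions compare numerically
    return tuple(int(part) for part in v.split("."))

def check_for_update(local=LOCAL_VERSION, url=VERSION_URL):
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            remote = json.load(resp)["version"]
    except Exception:
        # Network problem: continue normally, per the policy above
        return None
    if parse_version(local) < parse_version(remote):
        return {
            "success": False,
            "error": "update_required",
            "current_version": local,
            "latest_version": remote,
        }
    return None
```

A `None` result means either "up to date" or "check unavailable"; only a returned dict blocks the invocation.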

Quick Start

# 1. Register an Agent account and get your API Key
uv run {baseDir}/scripts/aoment_register.py --nickname "MyBot"

# 2. Generate an image with the default N2-Fast model
uv run {baseDir}/scripts/aoment_image_video.py -k <your-api-key> -t text-to-image -p "a cute cat playing in a garden"

# 3. Repair/upscale an image
uv run {baseDir}/scripts/aoment_hd_repair.py -k <your-api-key> --image ./input.png --resolution 4K

# 4. Recognize/analyze an image
uv run {baseDir}/scripts/aoment_image_video.py -k <your-api-key> -t image-recognition -p "Describe this image" --image ./input.png

# 5. Check remaining quota
uv run {baseDir}/scripts/aoment_quota.py -k <your-api-key>

Authentication

This skill requires an Agent API Key via:

Authorization: Bearer <api_key>

The API Key format is aoment_ followed by 32 hex characters.
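
If you want to validate a key before sending any request, the documented format translates directly into a small check (the helper name is illustrative, not part of the skill):

```python
import re

# Documented format: "aoment_" followed by 32 hex characters
API_KEY_RE = re.compile(r"aoment_[0-9a-f]{32}")

def is_valid_api_key(key):
    return API_KEY_RE.fullmatch(key) is not None
```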

Get your API Key - Agent Registration

AI Agent Bots can register directly via CLI. No web login is required:

uv run {baseDir}/scripts/aoment_register.py --nickname "MyBot"

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `--nickname` / `-n` | string | yes | Agent display name, max 16 characters |
| `--api-base` | string | no | API base URL, default `https://www.aoment.com` |

Or register via API directly:

curl -X POST https://www.aoment.com/api/skills/aoment-image-video/register-agent \
  -H "Content-Type: application/json" \
  -d '{"nickname": "MyBot"}'

Registration response:

{
  "success": true,
  "data": {
    "username": "agent_a1b2c3d4...",
    "nickname": "MyBot",
    "api_key": "aoment_a3f8e1b2c4d6e8f0a1b3c5d7e9f0a1b2"
  }
}

Save the returned api_key; it is required for all subsequent skill calls. Store it in a secure location for long-term use.
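
One minimal way to persist the key from the registration response (the path and helper name are illustrative; on POSIX systems the `chmod` restricts the file to the owner):

```python
import os
import stat

def save_api_key(registration_response, path="~/.aoment/api_key"):
    # Pull api_key out of the registration response shown above
    api_key = registration_response["data"]["api_key"]
    full = os.path.expanduser(path)
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as f:
        f.write(api_key)
    os.chmod(full, stat.S_IRUSR | stat.S_IWUSR)  # 0o600, owner-only
    return full
```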

Tools

Available Models

Pass the model ID to the `--model` parameter exactly as shown below.

Image Models

| Model ID | Description |
| --- | --- |
| `image-n2-fast` | Default image model. Faster N2-Fast image generation and editing, no watermark. |
| `image-n2` | N2 image generation and editing, fast, stricter single-reference image size limit, no watermark. |
| `image-n1-fast` | Faster N1-Fast image generation and editing, no watermark. |
| `image-n1` | N1 image generation and editing, slower, looser single-reference image size limit, no watermark. |
| `image-o2` | Image generation and editing with stronger aesthetics, good Chinese-language rendering, newer knowledge data, no watermark; clarity currently limited to around 1.5K. |
| `image-o2-pro` | O2-Pro high-resolution image generation and editing with precise size output support. |

Tip: N-series models use Nano Banana Pro, N-Fast-series models use Nano Banana 2, and O-series models use GPT Image 2.

Image Recognition Models

| Model ID | Description |
| --- | --- |
| `image-to-text` | Gemini 3.1 Pro image recognition and visual analysis. |

Video Models

| Model ID | Description |
| --- | --- |
| `video-v1` | Default and currently supported video generation model. |

text-to-image

Generate images from a text prompt. The default model is image-n2-fast (N2-Fast).

uv run {baseDir}/scripts/aoment_image_video.py \
  --api-key <your-api-key> \
  --tool-type text-to-image \
  --prompt "a cinematic robot painter in a bright studio" \
  --aspect-ratio 1:1 \
  --image-size 1K

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `--api-key` / `-k` | string | yes | - | Agent API Key |
| `--tool-type` / `-t` | enum | yes | - | `text-to-image` |
| `--prompt` / `-p` | string | yes | - | Text prompt |
| `--model` | string | no | `image-n2-fast` | Image model ID. Available values: `image-n2-fast`, `image-n2`, `image-n1-fast`, `image-n1`, `image-o2`, `image-o2-pro` |
| `--aspect-ratio` | string | no | `auto` | `auto`, `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`, `5:4`, `4:5`, `21:9` |
| `--image-size` | enum | no | `1K` | `1K`, `2K`, `4K` |
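
If you drive the script from another Python program, the flags above can be assembled and the JSON stdout parsed as in this sketch (`{baseDir}` stays a placeholder, as elsewhere in this document; the helper names are illustrative):

```python
import json
import subprocess

def build_text_to_image_cmd(api_key, prompt, base_dir="{baseDir}",
                            model="image-n2-fast", aspect_ratio="auto",
                            image_size="1K"):
    # Mirrors the flags documented in the table above
    return [
        "uv", "run", f"{base_dir}/scripts/aoment_image_video.py",
        "--api-key", api_key,
        "--tool-type", "text-to-image",
        "--prompt", prompt,
        "--model", model,
        "--aspect-ratio", aspect_ratio,
        "--image-size", image_size,
    ]

def text_to_image(api_key, prompt, **kwargs):
    # Run the script and parse the JSON it prints to stdout
    out = subprocess.run(build_text_to_image_cmd(api_key, prompt, **kwargs),
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)
```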

image-to-image

Generate a new image from a prompt and a reference image. The reference image can be a URL or base64 image data.

uv run {baseDir}/scripts/aoment_image_video.py \
  --api-key <your-api-key> \
  --tool-type image-to-image \
  --prompt "change the background to a beach" \
  --reference-image "https://example.com/photo.jpg"

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `--api-key` / `-k` | string | yes | - | Agent API Key |
| `--tool-type` / `-t` | enum | yes | - | `image-to-image` |
| `--prompt` / `-p` | string | yes | - | Transformation prompt |
| `--reference-image` | string | yes | - | Reference image as URL or base64 data |
| `--model` | string | no | `image-n2-fast` | Image model ID. Available values: `image-n2-fast`, `image-n2`, `image-n1-fast`, `image-n1`, `image-o2`, `image-o2-pro` |
| `--aspect-ratio` | string | no | `auto` | Output aspect ratio |
| `--image-size` | enum | no | `1K` | `1K`, `2K`, `4K` |

video-generation

Generate a video from a prompt. The default and currently supported video model is video-v1.

uv run {baseDir}/scripts/aoment_image_video.py \
  --api-key <your-api-key> \
  --tool-type video-generation \
  --prompt "sunset beach timelapse" \
  --orientation landscape \
  --resolution standard

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `--api-key` / `-k` | string | yes | - | Agent API Key |
| `--tool-type` / `-t` | enum | yes | - | `video-generation` |
| `--prompt` / `-p` | string | yes | - | Video prompt |
| `--model` | string | no | `video-v1` | Video model ID. Available value: `video-v1` |
| `--orientation` | enum | no | `portrait` | `portrait` or `landscape` |
| `--resolution` | enum | no | `standard` | `standard`, `hd`, `4k` |
| `--mode` | string | no | `standard` | Compatibility option; current backend uses standard mode |
| `--reference-image` | string | no | - | Reference image as URL or base64 data; can be passed up to 2 times |

image-recognition

Analyze one or more images with a text prompt. The default and currently supported recognition model is image-to-text (Gemini 3.1 Pro). Images can be local paths, URLs, or base64 image data.

uv run {baseDir}/scripts/aoment_image_video.py \
  --api-key <your-api-key> \
  --tool-type image-recognition \
  --prompt "List the visible objects and summarize the scene" \
  --image ./input.png

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `--api-key` / `-k` | string | yes | - | Agent API Key |
| `--tool-type` / `-t` | enum | yes | - | `image-recognition` |
| `--prompt` / `-p` | string | yes | - | Recognition or analysis instruction |
| `--image` / `-i` | string | yes | - | Image as local path, URL, or base64 data; can be passed multiple times |
| `--reference-image` | string | no | - | Compatibility alias for image input; can be passed multiple times |
| `--model` | string | no | `image-to-text` | Recognition model ID. Available value: `image-to-text` |

hd-repair

Repair and upscale an image. This is provided by a separate script:

uv run {baseDir}/scripts/aoment_hd_repair.py \
  --api-key <your-api-key> \
  --image ./input.png \
  --resolution 4K

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `--api-key` / `-k` | string | yes | - | Agent API Key |
| `--image` / `-i` | string | yes | - | Local path, URL, or base64 image data |
| `--resolution` | enum | no | `4K` | `2K`, `4K`, `8K` |
| `--model` | string | no | `image-hd-repair` | Only `image-hd-repair` is supported |

Quota

Query remaining daily generation quota:

uv run {baseDir}/scripts/aoment_quota.py --api-key <your-api-key>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `--api-key` / `-k` | string | yes | Agent API Key |

If your daily quota is used up and you need more, join the community:

Response Format

All scripts print JSON to stdout.

Successful text-to-image or image-to-image:

{
  "success": true,
  "tool_type": "text-to-image",
  "data": {
    "image_url": "https://cos.example.com/result.jpg?..."
  }
}

Successful video generation:

{
  "success": true,
  "tool_type": "video-generation",
  "data": {
    "video_url": "https://cos.example.com/result.mp4?..."
  }
}

Successful HD repair:

{
  "success": true,
  "tool_type": "hd-repair",
  "data": {
    "image_url": "https://cos.example.com/hd-repair-result.png?..."
  }
}

Successful image recognition:

{
  "success": true,
  "tool_type": "image-recognition",
  "data": {
    "result_text": "The image shows..."
  }
}

Successful quota query:

{
  "success": true,
  "data": {
    "remaining": 12,
    "quota": 15,
    "used": 3
  }
}

Error response:

{
  "success": false,
  "error": "error description"
}
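
A small, illustrative parser for these responses (the field mapping is taken from the examples above; the helper name is hypothetical):

```python
import json

# Which data field carries the result for each tool_type, per the
# response examples above.
RESULT_FIELDS = {
    "text-to-image": "image_url",
    "image-to-image": "image_url",
    "video-generation": "video_url",
    "hd-repair": "image_url",
    "image-recognition": "result_text",
}

def extract_result(stdout_text):
    payload = json.loads(stdout_text)
    if not payload.get("success"):
        raise RuntimeError(payload.get("error", "unknown error"))
    field = RESULT_FIELDS[payload["tool_type"]]
    return payload["data"][field]
```

Quota responses carry no `tool_type` field, so they are not covered by this sketch.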

Downloading Results

Returned image_url and video_url values are pre-signed COS URLs. Use the complete URL exactly as returned, including all query parameters. Do not strip the query string.

Example:

uv run {baseDir}/scripts/aoment_image_video.py \
  -k <your-api-key> \
  -t text-to-image \
  -p "prompt" > result.json

curl -L -o output.jpg "$(python3 -c "import json; print(json.load(open('result.json'))['data']['image_url'])")"
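
The same download can be done in pure Python, again passing the signed URL through untouched (the helper name and file paths are illustrative):

```python
import json
import urllib.request

def download_result(result_path="result.json", out_path="output.jpg"):
    # Read the pre-signed URL from the saved JSON response
    with open(result_path) as f:
        url = json.load(f)["data"]["image_url"]
    # Fetch with the full URL, including the query string, unchanged
    with urllib.request.urlopen(url) as resp, open(out_path, "wb") as out:
        out.write(resp.read())
    return out_path
```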

Troubleshooting

  1. If a request fails because of content policy, revise the prompt or reference image and retry.
  2. If the script returns update_required, download and install the latest skill package.
  3. If a generated URL cannot be opened, make sure your application preserves the full signed URL.
  4. For help, join the Discord or QQ community listed above.