AI Image & Video Toolkit — Free Upscale, Face Enhance, BG Remove & Generation

Free local AI image and video processing toolkit with cloud AI generation. Local tools: upscale (Real-ESRGAN), face enhance (GFPGAN/CodeFormer), background r...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
by MikeWang (@xixihhhh)
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the included scripts and requirements. uv is required because scripts are executed via `uv run`; ffmpeg is required for media processing; ATLAS_CLOUD_API_KEY is required only for the cloud generation script. The declared primary credential and binaries align with the described functionality.
Instruction Scope
SKILL.md and the scripts stay within the image/video processing/generation scope. However, several local scripts will download model weights or call external endpoints at runtime (e.g., huggingface URL in face-swap, rembg/new_session, and ai-generate contacting api.atlascloud.ai). SKILL.md's claim that "Local tools run 100% on your machine" is true regarding API usage but local tools still perform network downloads of pretrained model files and install dependencies when first run.
Install Mechanism
No explicit install spec is provided (instruction-only install), which is low risk, but `uv run` will auto-install Python dependencies and those packages may themselves fetch model weights and execute arbitrary package code. The code downloads model artifacts from known hosts (Hugging Face and Atlas Cloud); no obscure or shortened URLs were found.
Credentials
Only ATLAS_CLOUD_API_KEY (and optional ATLASCLOUD_API_KEY fallback) is requested and it's appropriate for the Atlas Cloud generation feature. Scripts look for a local .env fallback but do not request unrelated credentials or system secrets.
Persistence & Privilege
The skill does not request always:true or other elevated persistence. It writes model files to expected locations (e.g., ~/.insightface/models) and output folders but does not attempt to modify other skills or global agent configuration. Included .claude/settings.local.json only grants WebSearch permission for convenience.
Assessment
This skill appears to do what it claims, but consider the following before installing:

  • The cloud-generation feature will send prompts and (optionally) images to Atlas Cloud; do not use it with sensitive images or proprietary prompts unless you trust that service and key.
  • Running any script will cause `uv` to auto-install Python packages, and the packages/models may download large pretrained weights (Hugging Face, rembg, etc.) into your home or project directories. Expect significant disk and network activity.
  • The face-swap/face-enhance tools enable realistic edits (deepfakes). Be mindful of legal and ethical implications before using them on others' images.
  • The scripts download model files (e.g., inswapper from Hugging Face) at runtime; if you need an air-gapped or fully-offline setup, inspect and pre-download/verify model artifacts before running.
  • Only provide ATLAS_CLOUD_API_KEY if you intend to use cloud generation; treat the key like any API secret and avoid storing it in shared repos or exposing it to untrusted environments.

If you want a higher-assurance review, ask for a line-by-line audit of any specific script (for example scripts/ai-generate.py and scripts/face-swap.py) or vendor verification of the Atlas Cloud endpoints.
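
For the pre-download and verify step mentioned in the assessment, one option is to compare each model file's SHA-256 digest against a value obtained from a source you trust. The sketch below uses only the standard library; the function names are hypothetical and not part of the skill, and the expected digest has to come from you.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path: Path, expected_sha256: str) -> bool:
    """True only if the file exists and its digest matches the expected value."""
    return path.is_file() and sha256_of(path) == expected_sha256.lower()
```

A file that fails `verify_model` should be deleted and fetched again rather than loaded.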

Current version: v1.0.3

Runtime requirements

Bins: uv, ffmpeg
Env: ATLAS_CLOUD_API_KEY
Primary env: ATLAS_CLOUD_API_KEY

SKILL.md

Free Image & Video Processing Toolkit

7 free local AI tools + cloud AI generation (300+ models via Atlas Cloud API).

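Local tools process your files entirely on your machine, with no API keys and no cloud costs; they need network access only on first run, to install dependencies and download model weights. Cloud generation tools provide access to state-of-the-art AI models for image and video creation.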

Prerequisites

  • Python 3.10+ installed
  • uv installed (brew install uv / pip install uv / winget install astral-sh.uv)
  • FFmpeg installed (brew install ffmpeg / apt install ffmpeg / winget install ffmpeg)

Available Tools

| Tool | Script | What It Does |
|------|--------|--------------|
| Image Upscale | scripts/upscale.py | 2x/4x super resolution using Real-ESRGAN |
| Face Enhance | scripts/face-enhance.py | Restore and enhance faces using GFPGAN + CodeFormer |
| Background Remove | scripts/bg-remove.py | Remove image backgrounds, output transparent PNG |
| Object Erase | scripts/erase.py | Erase unwanted objects using LaMa inpainting |
| Face Swap | scripts/face-swap.py | Swap faces between images using InsightFace |
| Smart Segment | scripts/segment.py | Segment anything in images using FastSAM |
| Media Process | scripts/media-process.py | Convert, compress, resize, extract with FFmpeg |
| AI Generate | scripts/ai-generate.py | Generate images/videos with 300+ cloud AI models |

Usage

All scripts use uv run for zero-setup execution — dependencies are automatically installed on first run.
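
uv can read a script's dependencies from an inline metadata block (PEP 723) at the top of the file, which is one way a script can run with no setup. Whether the bundled scripts declare their dependencies exactly this way is an assumption; the file below is a hypothetical example, not one of the toolkit's scripts.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = ["pillow"]
# ///
"""Hypothetical example: print the dimensions of an image."""
import sys


def describe(path: str) -> str:
    # Imported lazily so the file can be loaded even before uv has
    # installed the dependency declared in the header above.
    from PIL import Image

    with Image.open(path) as img:
        return f"{path}: {img.width}x{img.height}"


if __name__ == "__main__" and len(sys.argv) > 1:
    print(describe(sys.argv[1]))
```

Saved under a hypothetical name such as `dims.py`, it would be run as `uv run dims.py photo.jpg`; uv creates an isolated environment containing the listed packages before executing the script.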

Image Upscale (Real-ESRGAN)

Upscale low-resolution images by 2x or 4x with AI super resolution.

# 4x upscale (default)
uv run scripts/upscale.py input.jpg

# 2x upscale
uv run scripts/upscale.py input.jpg --scale 2

# Upscale with face enhancement
uv run scripts/upscale.py input.jpg --face-enhance

# Batch upscale a folder
uv run scripts/upscale.py ./photos/ --scale 4

# Custom output path
uv run scripts/upscale.py input.jpg -o upscaled.png

Face Enhance (GFPGAN + CodeFormer)

Restore old photos, enhance blurry faces, fix low-quality portraits.

# Enhance faces in an image (GFPGAN, default)
uv run scripts/face-enhance.py photo.jpg

# Use CodeFormer (better fidelity control)
uv run scripts/face-enhance.py photo.jpg --method codeformer

# Adjust fidelity (0=quality, 1=fidelity, default 0.5)
uv run scripts/face-enhance.py photo.jpg --method codeformer --fidelity 0.7

# Also upscale background (2x)
uv run scripts/face-enhance.py photo.jpg --bg-upscale 2

# Batch process
uv run scripts/face-enhance.py ./old-photos/

Background Remove (rembg)

Remove backgrounds from images, output transparent PNG. Supports multiple AI models.

# Remove background (default u2net model)
uv run scripts/bg-remove.py product.jpg

# Use specific model
uv run scripts/bg-remove.py photo.jpg --model isnet-general-use

# Batch process folder
uv run scripts/bg-remove.py ./products/ -o ./transparent/

# Keep only the foreground (alpha matting for fine edges)
uv run scripts/bg-remove.py portrait.jpg --alpha-matting

# Available models: u2net, u2netp, u2net_human_seg, u2net_cloth_seg,
#                   silueta, isnet-general-use, isnet-anime, sam

Object Erase (LaMa Inpainting)

Remove unwanted objects from images using a mask.

# Erase objects (white area in mask = erase)
uv run scripts/erase.py image.png --mask mask.png

# Auto-generate mask from coordinates (x,y,width,height)
uv run scripts/erase.py image.png --region 100,200,150,150

# Batch erase with matching masks (image1.png + image1_mask.png)
uv run scripts/erase.py ./images/ --mask-dir ./masks/
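
To make the `--region` convention concrete, here is a minimal sketch of turning an `x,y,width,height` string into a mask where white (255) marks the area to erase. It uses plain Python rows of bytes rather than an image library; the function names are hypothetical, and how `erase.py` builds its mask internally is not shown in this document.

```python
from typing import List, Tuple


def parse_region(region: str) -> Tuple[int, int, int, int]:
    """Parse 'x,y,width,height' into four integers."""
    parts = [int(p) for p in region.split(",")]
    if len(parts) != 4 or parts[2] <= 0 or parts[3] <= 0:
        raise ValueError(f"expected x,y,width,height, got {region!r}")
    return parts[0], parts[1], parts[2], parts[3]


def region_to_mask(region: str, img_w: int, img_h: int) -> List[bytearray]:
    """Build a mask as rows of bytes: 255 (white) = erase, 0 (black) = keep."""
    x, y, w, h = parse_region(region)
    # Clamp the rectangle to the image bounds.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, img_w), min(y + h, img_h)
    mask = [bytearray(img_w) for _ in range(img_h)]
    if x1 > x0:
        for row in range(y0, y1):
            mask[row][x0:x1] = b"\xff" * (x1 - x0)
    return mask
```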

Face Swap (InsightFace)

Swap faces between two images.

# Swap face from source to target
uv run scripts/face-swap.py --source face.jpg --target photo.jpg

# Swap specific face index (when multiple faces detected)
uv run scripts/face-swap.py --source face.jpg --target group.jpg --face-index 0

# Custom output
uv run scripts/face-swap.py --source face.jpg --target photo.jpg -o result.png

Smart Segment (FastSAM)

Segment any object in an image using text prompt, point, or bounding box.

# Segment everything
uv run scripts/segment.py image.jpg

# Segment by text prompt
uv run scripts/segment.py image.jpg --text "the dog"

# Segment by point (x,y)
uv run scripts/segment.py image.jpg --point 400,300

# Segment by bounding box (x1,y1,x2,y2)
uv run scripts/segment.py image.jpg --box 100,100,400,400

# Output mask only
uv run scripts/segment.py image.jpg --text "car" --mask-only
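
Note that, going by the comments in this document, `segment.py --box` takes two corners (`x1,y1,x2,y2`) while `erase.py --region` takes an origin plus a size (`x,y,width,height`). A small sketch of converting one to the other, with a hypothetical helper name:

```python
def box_to_region(box: str) -> str:
    """Convert 'x1,y1,x2,y2' (two corners) to 'x,y,width,height'."""
    x1, y1, x2, y2 = (int(v) for v in box.split(","))
    # Normalise so the result is valid even if the corners are swapped.
    left, top = min(x1, x2), min(y1, y2)
    width, height = abs(x2 - x1), abs(y2 - y1)
    return f"{left},{top},{width},{height}"
```

For example, the box `100,100,400,400` used above corresponds to the region `100,100,300,300`.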

Media Process (FFmpeg)

Convert, compress, resize, extract frames, merge audio/video — powered by FFmpeg.

# Convert format
uv run scripts/media-process.py convert input.mp4 output.webm

# Compress video (target size in MB)
uv run scripts/media-process.py compress input.mp4 --target-size 10

# Resize video
uv run scripts/media-process.py resize input.mp4 --width 1280 --height 720

# Extract frames as images
uv run scripts/media-process.py frames input.mp4 --fps 1 --output ./frames/

# Extract audio
uv run scripts/media-process.py audio input.mp4 -o audio.mp3

# Create GIF from video
uv run scripts/media-process.py gif input.mp4 --start 5 --duration 3 --fps 15

# Trim video
uv run scripts/media-process.py trim input.mp4 --start 00:01:00 --end 00:02:30

# Merge multiple videos
uv run scripts/media-process.py merge video1.mp4 video2.mp4 video3.mp4 -o combined.mp4

# Add watermark
uv run scripts/media-process.py watermark input.mp4 --image logo.png --position bottom-right

# Get media info
uv run scripts/media-process.py info input.mp4
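
Compressing to a target size usually comes down to bitrate arithmetic: total bits divided by duration, minus the share reserved for audio. The sketch below shows that calculation; the function name and the 128 kbit/s audio default are assumptions, and this document does not say how `media-process.py compress` computes its settings.

```python
def target_video_bitrate_kbps(target_mb: float, duration_s: float,
                              audio_kbps: float = 128.0) -> float:
    """Video bitrate (kbit/s) that fits target_mb over duration_s seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_kbits = target_mb * 8 * 1000  # 1 MB taken as 1000 kB, 8 bits per byte
    video_kbps = total_kbits / duration_s - audio_kbps
    if video_kbps <= 0:
        raise ValueError("target size too small for this duration and audio bitrate")
    return video_kbps
```

A 60-second clip with a 10 MB target leaves about 1205 kbit/s for video, which could then be passed to FFmpeg as `-b:v 1205k -b:a 128k`.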

AI Generate (Atlas Cloud API) — Gold Sponsor

Generate images and videos using 300+ state-of-the-art AI models. Requires an Atlas Cloud API key.

IMPORTANT for AI agents: Before calling this script, you MUST first use Atlas Cloud MCP tools to find the correct model ID and its required parameters:

  1. Call atlas_list_models to browse available models, or atlas_search_docs to search for a specific model
  2. Call atlas_get_model_info with the model ID to get the exact parameter schema (different models use different parameters — some use size, others use aspect_ratio + resolution, etc.)
  3. Then call the script with --model <full_model_id> and the correct parameters

# Generate image (pass full model ID from Atlas Cloud; quote the size so the shell does not expand the *)
uv run scripts/ai-generate.py image "A cat astronaut on the moon" --model black-forest-labs/flux-schnell --size '1024*1024'

# Models using aspect_ratio + resolution (e.g. Nano Banana 2, Imagen4)
uv run scripts/ai-generate.py image "Anime girl with blue hair" --model google/nano-banana-2/text-to-image --aspect-ratio 1:1 --resolution 1k

# Models using size presets (e.g. Seedream)
uv run scripts/ai-generate.py image "Product photo on marble" --model bytedance/seedream-v5.0-lite --size '2048*2048'

# Edit existing image
uv run scripts/ai-generate.py image "Make the sky sunset orange" --model bytedance/seedream-v5.0-lite/edit --image photo.jpg

# Generate video
uv run scripts/ai-generate.py video "Timelapse of cherry blossoms" --model alibaba/wan-2.6/text-to-video --size '1280*720'

# Image-to-video
uv run scripts/ai-generate.py video "The person starts walking" --model alibaba/wan-2.6/image-to-video --image portrait.jpg

# Pass extra model-specific parameters as JSON
uv run scripts/ai-generate.py image "A logo" --model google/imagen4-ultra --extra '{"num_images": 4}'

# NSFW mode
uv run scripts/ai-generate.py image "Artistic figure study" --model black-forest-labs/flux-dev-lora --nsfw
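
Because each model expects different parameters, an agent may find it easier to assemble the command from the schema it looked up than to hand-write it. A minimal sketch: the helper name is hypothetical, and the rule that a schema name such as `aspect_ratio` maps to the flag `--aspect-ratio` is an assumption generalised from the examples above.

```python
import json
from typing import List, Optional


def build_generate_command(kind: str, prompt: str, model: str,
                           params: Optional[dict] = None,
                           extra: Optional[dict] = None) -> List[str]:
    """Assemble an argv list for scripts/ai-generate.py."""
    if kind not in ("image", "video"):
        raise ValueError("kind must be 'image' or 'video'")
    cmd = ["uv", "run", "scripts/ai-generate.py", kind, prompt, "--model", model]
    for name, value in (params or {}).items():
        # Assumed mapping: aspect_ratio -> --aspect-ratio, size -> --size, etc.
        cmd += [f"--{name.replace('_', '-')}", str(value)]
    if extra:
        # Model-specific parameters travel as one JSON string via --extra.
        cmd += ["--extra", json.dumps(extra)]
    return cmd
```

Passing the resulting list to `subprocess.run` bypasses the shell, so values such as `1024*1024` need no quoting.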

Setup: Set ATLAS_CLOUD_API_KEY as an environment variable or in the project's .env file. Get your key at atlascloud.ai. Note: when using cloud generation, your prompts and image data will be sent to the Atlas Cloud API for processing.
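
The key lookup can be sketched as follows. The variable names ATLAS_CLOUD_API_KEY and ATLASCLOUD_API_KEY and the .env fallback come from the security review on this page; the precedence order (environment first, then file) and the function names are assumptions, not a description of what `ai-generate.py` actually does.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

KEY_NAMES = ("ATLAS_CLOUD_API_KEY", "ATLASCLOUD_API_KEY")


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(env_file: Path = Path(".env"),
                    environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Environment variables win; the .env file is only a fallback."""
    environ = os.environ if environ is None else environ
    file_values = read_env_file(env_file)
    for source in (environ, file_values):
        for name in KEY_NAMES:
            if source.get(name):
                return source[name]
    return None
```

If you keep the key in a .env file, add that file to .gitignore so it does not end up in a shared repository.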

Output

All tools save output to ./output/ by default. Use -o or --output to specify a custom path.

Models

Models are automatically downloaded on first use and cached locally:

| Tool | Model | Size | Cache Location |
|------|-------|------|----------------|
| Upscale | RealESRGAN_x4plus | ~64MB | ~/.cache/realesrgan/ |
| Face Enhance | GFPGANv1.4 | ~348MB | ~/.cache/gfpgan/ |
| Face Enhance | CodeFormer | ~376MB | ~/.cache/codeformer/ |
| Background Remove | u2net | ~176MB | ~/.u2net/ |
| Object Erase | LaMa | ~200MB | ~/.cache/lama/ |
| Face Swap | buffalo_l + inswapper | ~500MB | ~/.insightface/ |
| Smart Segment | FastSAM-s | ~23MB | auto-downloaded by ultralytics |

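Total download if every tool is used: roughly 1.7GB (the sum of the sizes above). Each model is fetched only when its tool first runs; all subsequent runs use the cached copy.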
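
To see which models are already cached and how much disk they use, a short script can walk the cache locations listed in this section. This is a sketch with hypothetical function names; the paths are copied from the table and may differ on your system.

```python
from pathlib import Path
from typing import Dict, Optional

# Cache directories relative to the home directory, copied from the table in
# this section. FastSAM is left out because ultralytics manages its location.
CACHE_DIRS = {
    "RealESRGAN_x4plus": ".cache/realesrgan",
    "GFPGANv1.4": ".cache/gfpgan",
    "CodeFormer": ".cache/codeformer",
    "u2net": ".u2net",
    "LaMa": ".cache/lama",
    "buffalo_l + inswapper": ".insightface",
}


def dir_size_mb(path: Path) -> float:
    """Total size of all files under path, in megabytes (10**6 bytes)."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file()) / 1e6


def cache_report(home: Optional[Path] = None) -> Dict[str, Optional[float]]:
    """Map each model to its cached size in MB, or None if nothing is cached."""
    home = Path.home() if home is None else home
    report = {}
    for model, rel in CACHE_DIRS.items():
        folder = home / rel
        report[model] = dir_size_mb(folder) if folder.is_dir() else None
    return report
```

Calling `cache_report()` with no arguments inspects the current user's home directory; deleting a listed folder forces a fresh download the next time that tool runs.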

Tips

  • GPU Acceleration: All tools automatically use CUDA/MPS if available, falling back to CPU
  • Batch Processing: Most tools accept a folder path for batch processing
  • Memory: Face swap and segmentation may need 4GB+ RAM for large images
  • First Run: The first execution of each tool downloads its AI model; subsequent runs reuse the cached copy and skip the download
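
For the PyTorch-based tools, the CUDA/MPS/CPU fallback described above typically looks like the sketch below. It assumes PyTorch is the backend, which this document does not state for every tool, and the function name is hypothetical.

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch can use."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed, so nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

Printing `pick_device()` before a large batch job is a quick way to confirm the GPU is actually being picked up.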

Workflow Examples

Combine local processing with cloud AI generation:

# 1. Generate a product image with AI
uv run scripts/ai-generate.py image "Minimalist perfume bottle, studio lighting" --model bytedance/seedream-v5.0-lite --size '2048*2048'

# 2. Upscale to 4x resolution
uv run scripts/upscale.py ./output/seedream-v5.0-lite_*.png --scale 4

# 3. Remove background for e-commerce
uv run scripts/bg-remove.py ./output/*_x4.png --alpha-matting

# 4. Generate a product video
uv run scripts/ai-generate.py video "A perfume bottle rotating slowly" --model kwaivgi/kling-v3.0-pro/text-to-video --duration 5

# 5. Add watermark to the video
uv run scripts/media-process.py watermark ./output/text-to-video_*.mp4 --image logo.png
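
As an illustration of what a watermark step involves, the sketch below builds an FFmpeg command using the `overlay` filter, where `W`/`H` are the video's dimensions and `w`/`h` are the logo's. Only `bottom-right` appears in this document; the other position names, the 10-pixel margin, and the function name are assumptions, and `media-process.py` may build its command differently.

```python
from typing import List

MARGIN = 10  # pixels between the logo and the frame edge (assumed default)

# x/y expressions for FFmpeg's overlay filter, keyed by position name.
OVERLAY_XY = {
    "top-left": ("{m}", "{m}"),
    "top-right": ("W-w-{m}", "{m}"),
    "bottom-left": ("{m}", "H-h-{m}"),
    "bottom-right": ("W-w-{m}", "H-h-{m}"),
    "center": ("(W-w)/2", "(H-h)/2"),
}


def watermark_command(video: str, logo: str, output: str,
                      position: str = "bottom-right",
                      margin: int = MARGIN) -> List[str]:
    """Return an ffmpeg argv list that overlays logo onto video."""
    if position not in OVERLAY_XY:
        raise ValueError(f"unknown position: {position}")
    x, y = (expr.format(m=margin) for expr in OVERLAY_XY[position])
    return ["ffmpeg", "-i", video, "-i", logo,
            "-filter_complex", f"overlay={x}:{y}",
            "-c:a", "copy", output]
```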
