Segment Anything

Creative

Use SAM (Segment Anything Model) to remove image backgrounds and extract foreground subjects as transparent PNGs. Use when users want to remove backgrounds, cut out objects, extract foreground subjects, or perform image segmentation.

Install

openclaw skills install @scikkk/sam

SAM Background Removal

Extract foreground subjects from images using Meta's Segment Anything Model, outputting transparent PNGs.

Quick Start

python3 scripts/segment.py <input_image> <output.png>

When --points is omitted, the script uses the image center as the foreground hint. This works well for portraits and product shots where the subject is centered.
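A center-point prompt can be sketched as follows. This is a minimal illustration and not the code in scripts/segment.py: the function name `predict_center_mask` is hypothetical, and choosing the highest-scoring candidate mask is an assumption. The `set_image` and `predict` calls follow the `SamPredictor` interface from the segment_anything package.

```python
import numpy as np

def predict_center_mask(predictor, image_rgb):
    """Prompt a SAM-style predictor with the image center as one foreground point.

    predictor: an object exposing SamPredictor's set_image() and predict().
    image_rgb: H x W x 3 uint8 array in RGB order.
    Returns the H x W mask with the highest predicted score.
    """
    height, width = image_rgb.shape[:2]
    point = np.array([[width // 2, height // 2]])  # (x, y) order, not (row, col)
    label = np.array([1])                          # 1 marks a foreground point

    predictor.set_image(image_rgb)
    masks, scores, _ = predictor.predict(
        point_coords=point,
        point_labels=label,
        multimask_output=True,  # SAM proposes several candidate masks
    )
    # Assumption: keep the candidate the model itself rates highest.
    return masks[int(np.argmax(scores))]

# With the real library, a predictor would be built roughly like this:
#   from segment_anything import sam_model_registry, SamPredictor
#   sam = sam_model_registry["vit_h"](checkpoint="/path/to/sam_vit_h_4b8939.pth")
#   predictor = SamPredictor(sam)
```

For a 640×480 image the center prompt is (320, 240), the same value the `--points 320,240` example passes explicitly.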

Parameters

| Param | Description | Default |
|---|---|---|
| input | Input image path | required |
| output | Output PNG path (single mode) or directory (--all mode) | required |
| --model | Model size: vit_b (fast) · vit_l (medium) · vit_h (best quality) | vit_h |
| --checkpoint | Local checkpoint path; auto-downloaded if omitted | auto |
| --points | Foreground hint points as x,y; multiple allowed | center |
| --all | Grid-sweep mode: extract all distinct elements | off |
| --grid | Grid density for --all; 16 means 16×16=256 probe points | 16 |
| --iou-thresh | Minimum predicted IoU to accept a mask (--all) | 0.88 |
| --min-area | Minimum mask area as fraction of image (--all) | 0.001 |
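To make the --all parameters concrete, here is a rough sketch of how a grid sweep and its two acceptance filters could fit together. The helper names `grid_points` and `keep_mask` are hypothetical, and placing probes at cell centers is an assumption; the script's actual layout and filtering order are not documented here.

```python
def grid_points(width, height, grid=16):
    """Return grid*grid (x, y) probe points, one at the center of each grid cell."""
    xs = [int((i + 0.5) * width / grid) for i in range(grid)]
    ys = [int((j + 0.5) * height / grid) for j in range(grid)]
    return [(x, y) for y in ys for x in xs]

def keep_mask(predicted_iou, mask_area_px, width, height,
              iou_thresh=0.88, min_area=0.001):
    """Accept a mask only if it is confident enough and large enough.

    predicted_iou: the model's own quality estimate for the mask (0..1).
    mask_area_px:  number of foreground pixels in the mask.
    min_area is a fraction of the whole image, matching --min-area.
    """
    area_fraction = mask_area_px / (width * height)
    return predicted_iou >= iou_thresh and area_fraction >= min_area

points = grid_points(640, 480, grid=16)
print(len(points))                      # 16 x 16 probe points
print(keep_mask(0.90, 1000, 640, 480))  # 1000 px is about 0.33% of the image
```

Under these assumptions, doubling --grid to 32 quadruples the number of probes to 1024, which is why a denser grid finds smaller objects at the cost of runtime.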

Examples

# Basic background removal (auto-downloads the vit_h checkpoint, ~2.5 GB, on first use)
python3 scripts/segment.py photo.jpg output.png

# Specify hint point when subject is off-center
python3 scripts/segment.py photo.jpg output.png --points 320,240

# Multiple hints with lightweight model
python3 scripts/segment.py photo.jpg output.png --model vit_b --points 320,240 400,300

# Extract all elements (one PNG per element)
python3 scripts/segment.py photo.jpg ./elements/ --all

# Denser grid to capture small objects
python3 scripts/segment.py photo.jpg ./elements/ --all --grid 32

# Use a local checkpoint
python3 scripts/segment.py photo.jpg output.png --checkpoint /path/to/sam_vit_h_4b8939.pth

Dependencies

segment_anything is auto-installed on first run, or install manually:

pip install git+https://github.com/facebookresearch/segment-anything.git
pip install pillow numpy torch torchvision

Workflow

  1. User provides image path
  2. Ask if hint points are needed (when subject is off-center)
  3. Run script; checkpoint auto-downloads on first use to ~/.cache/sam/
  4. Output transparent-background PNG
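The final step of the workflow, turning a mask into a transparent-background PNG, can be sketched with Pillow and NumPy. The function name `cutout` is hypothetical, and the script may handle edge smoothing or cropping differently. This version simply sets alpha to fully opaque inside the mask and fully transparent outside it.

```python
import numpy as np
from PIL import Image

def cutout(image, mask):
    """Return an RGBA copy of `image` that is transparent wherever `mask` is False.

    image: a PIL image of size (W, H).
    mask:  an H x W boolean array, True on foreground pixels.
    """
    rgba = np.array(image.convert("RGBA"))  # writable H x W x 4 uint8 copy
    rgba[..., 3] = np.where(mask, 255, 0).astype(np.uint8)  # overwrite alpha only
    return Image.fromarray(rgba)

# Example usage (paths are placeholders):
#   result = cutout(Image.open("photo.jpg"), mask)
#   result.save("output.png")  # PNG keeps the alpha channel; JPEG would not
```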

Model Selection

| Model | Size | Speed | Quality |
|---|---|---|---|
| vit_b | ~375 MB | fastest | good |
| vit_l | ~1.25 GB | medium | better |
| vit_h | ~2.5 GB | slower | best |

CUDA is used automatically when a GPU is available.
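Automatic device selection of this kind is usually a one-line check in PyTorch. The sketch below assumes that pattern; `pick_device` is a hypothetical name, and the fallback to CPU when torch is missing is added here only so the snippet runs anywhere.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: without torch there is nothing to place on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(device)
# A loaded SAM model would then be moved with: sam.to(device=device)
```

On a CPU-only machine the lighter vit_b model is likely the more practical choice, since vit_h is the slowest of the three.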