Install
openclaw skills install sam
Use SAM (Segment Anything Model) to remove image backgrounds and extract foreground subjects as transparent PNGs. Use when users want to remove backgrounds,...
Extract foreground subjects from images using Meta's Segment Anything Model, outputting transparent PNGs.
python3 scripts/segment.py <input_image> <output.png>
By default, the image center is used as the foreground hint, which works well for portraits and product shots where the subject is centered.
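
To make the default hint concrete, here is a minimal sketch of how a center point could be derived from the image size. The helper name `center_hint` is hypothetical and not taken from `scripts/segment.py`; the only assumption about SAM itself is its prompt convention, where label `1` marks a foreground point.

```python
def center_hint(width: int, height: int) -> tuple[list[list[int]], list[int]]:
    """Return (point_coords, point_labels) for one foreground point at the image center.

    Hypothetical helper: the real script may compute its default differently.
    """
    point_coords = [[width // 2, height // 2]]  # (x, y) in pixels
    point_labels = [1]  # 1 = foreground point, 0 = background point
    return point_coords, point_labels


coords, labels = center_hint(640, 480)
print(coords, labels)  # [[320, 240]] [1]
```

For a 640×480 image this yields the same point as passing `--points 320,240` explicitly.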
| Param | Description | Default |
|---|---|---|
| input | Input image path | required |
| output | Output PNG path (single mode) or directory (--all mode) | required |
| --model | Model size: vit_b (fast) · vit_l (medium) · vit_h (best quality) | vit_h |
| --checkpoint | Local checkpoint path; auto-downloaded if omitted | auto |
| --points | Foreground hint points as x,y, multiple allowed | center |
| --all | Grid-sweep mode: extract all distinct elements | off |
| --grid | Grid density for --all; 16 means 16×16=256 probe points | 16 |
| --iou-thresh | Minimum predicted IoU to accept a mask (--all) | 0.88 |
| --min-area | Minimum mask area as fraction of image (--all) | 0.001 |
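
The `--grid`, `--iou-thresh`, and `--min-area` parameters interact, so a rough sketch may help. The code below assumes probe points sit at the centers of evenly sized grid cells and that a mask is kept only when it passes both thresholds. The function names `grid_points` and `keep_mask` are hypothetical, and the script's actual point placement and de-duplication of overlapping masks may differ.

```python
def grid_points(width: int, height: int, n: int = 16) -> list[tuple[float, float]]:
    """Return n x n probe points as (x, y) pixel coordinates at cell centers."""
    return [
        ((i + 0.5) * width / n, (j + 0.5) * height / n)
        for j in range(n)  # rows, top to bottom
        for i in range(n)  # columns, left to right
    ]


def keep_mask(mask_area: int, image_area: int, predicted_iou: float,
              iou_thresh: float = 0.88, min_area: float = 0.001) -> bool:
    """Accept a mask only if it is confident enough and not too small."""
    return predicted_iou >= iou_thresh and mask_area / image_area >= min_area


points = grid_points(640, 480, n=16)
print(len(points))  # 256
print(points[0])    # (20.0, 15.0)
```

With the defaults, a mask on a 640×480 image needs a predicted IoU of at least 0.88 and must cover at least 308 pixels (0.1% of 307,200) to be kept. Doubling `--grid` to 32 quadruples the number of probe points, which is why it finds smaller objects at a higher runtime cost.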
# Basic background removal (auto-downloads vit_h ~2.5GB)
python3 scripts/segment.py photo.jpg output.png
# Specify hint point when subject is off-center
python3 scripts/segment.py photo.jpg output.png --points 320,240
# Multiple hints with lightweight model
python3 scripts/segment.py photo.jpg output.png --model vit_b --points 320,240 400,300
# Extract all elements (one PNG per element)
python3 scripts/segment.py photo.jpg ./elements/ --all
# Denser grid to capture small objects
python3 scripts/segment.py photo.jpg ./elements/ --all --grid 32
# Use a local checkpoint
python3 scripts/segment.py photo.jpg output.png --checkpoint /path/to/sam_vit_h_4b8939.pth
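
The command lines above suggest that `--points` takes one or more `x,y` pairs separated by spaces. The following is a sketch of how such an interface could be parsed with `argparse`. It is an assumption about the shape of the CLI, not the script's actual source, and it covers only a subset of the flags.

```python
import argparse


def parse_point(text: str) -> tuple[int, int]:
    """Convert 'x,y' into an (x, y) tuple; argparse reports a ValueError as a usage error."""
    x, y = text.split(",")
    return int(x), int(y)


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of part of the CLI described in the table.
    parser = argparse.ArgumentParser(description="Segment foreground with SAM")
    parser.add_argument("input", help="input image path")
    parser.add_argument("output", help="output PNG path, or directory with --all")
    parser.add_argument("--model", choices=["vit_b", "vit_l", "vit_h"], default="vit_h")
    parser.add_argument("--points", nargs="+", type=parse_point, default=None,
                        help="foreground hints as x,y; None means use the image center")
    parser.add_argument("--all", action="store_true", help="grid-sweep mode")
    return parser


args = build_parser().parse_args(
    ["photo.jpg", "output.png", "--model", "vit_b", "--points", "320,240", "400,300"]
)
print(args.points)  # [(320, 240), (400, 300)]
```

Because `--points` accepts a variable number of values, the positional `input` and `output` arguments are safest placed before it, as in the examples.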
segment_anything is auto-installed on first run, or install manually:
pip install git+https://github.com/facebookresearch/segment-anything.git
pip install pillow numpy torch torchvision
Downloaded checkpoints are cached in ~/.cache/sam/.

| Model | Size | Speed | Quality |
|---|---|---|---|
| vit_b | ~375 MB | fastest | good |
| vit_l | ~1.25 GB | medium | better |
| vit_h | ~2.5 GB | slower | best |
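
As an illustration of how `--model`, `--checkpoint`, and the `~/.cache/sam/` directory could fit together, here is a sketch of checkpoint resolution. The file names and base URL are those published in the official segment-anything repository. The `resolve_checkpoint` function and the assumption that the script caches files under these exact names are hypothetical.

```python
from pathlib import Path

# Checkpoint file names as published by the segment-anything project.
CHECKPOINT_FILES = {
    "vit_b": "sam_vit_b_01ec64.pth",
    "vit_l": "sam_vit_l_0b3195.pth",
    "vit_h": "sam_vit_h_4b8939.pth",
}
BASE_URL = "https://dl.fbaipublicfiles.com/segment_anything/"


def resolve_checkpoint(model="vit_h", checkpoint=None,
                       cache_dir=Path.home() / ".cache" / "sam"):
    """Return (path, url); url is None when nothing needs downloading.

    Hypothetical helper: an explicit --checkpoint wins, otherwise the
    cache directory is checked for the file matching the model size.
    """
    if checkpoint is not None:
        return Path(checkpoint), None
    name = CHECKPOINT_FILES[model]
    path = Path(cache_dir) / name
    url = None if path.exists() else BASE_URL + name
    return path, url


path, url = resolve_checkpoint("vit_b", cache_dir="/tmp/sam-cache-example")
print(path.name)  # sam_vit_b_01ec64.pth
```

Given the sizes in the table, starting with `vit_b` is a reasonable way to try the tool before committing to the larger `vit_h` download.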
CUDA is used automatically when a GPU is available.
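
Automatic device selection of this kind is commonly written as below. This is a generic PyTorch sketch rather than the script's own code, and `pick_device` is a hypothetical name.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
# A loaded SAM model would then typically be moved with: sam.to(device=device)
```

On a CPU-only machine the larger models still work but run noticeably slower, so `--model vit_b` is the practical choice there.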
uv tool install pillow
uv tool install numpy
uv tool install torch
uv tool install torchvision