AB Agents Vision
Image analysis using the MiniMax VL API: simple, fast, reliable.
What It Does
- Describe images: get detailed scene descriptions
- Extract text: read text from screenshots, photos, documents
- Analyze photos: identify objects, people, settings
- URL support: analyze images from the web
Quick Start
# Install uv (provides uvx)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Set your MiniMax API key
export MINIMAX_API_KEY="sk-cp-your-key"
# Use
./vision.sh image.jpg "Describe this image"
Usage
# Basic description
./vision.sh photo.jpg
# With custom prompt
./vision.sh screenshot.png "What text do you see?"
# URL support
./vision.sh "https://example.com/image.jpg" "Describe this"
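To run the same prompt over a whole folder, the single-image calls above can be wrapped in a loop. This is a minimal sketch, not part of the skill itself: `describe_dir` and `VISION_CMD` are hypothetical names, and it assumes `vision.sh` prints its result to standard output.

```shell
# Path to the script; override VISION_CMD if it lives elsewhere (hypothetical variable).
VISION_CMD="${VISION_CMD:-./vision.sh}"

# Describe every .jpg/.jpeg/.png in a directory, writing one .txt file per image.
# Assumption: vision.sh writes its answer to standard output.
describe_dir() {
    dir="$1"
    prompt="${2:-Describe this image}"
    for img in "$dir"/*.jpg "$dir"/*.jpeg "$dir"/*.png; do
        [ -e "$img" ] || continue   # skip patterns that matched nothing
        "$VISION_CMD" "$img" "$prompt" > "${img%.*}.txt" \
            || echo "failed: $img" >&2
    done
}
```

For example, `describe_dir ./screenshots "What text do you see?"` would leave `name.txt` next to each `name.png`. A failed call leaves an empty or partial `.txt` file behind, so check the messages on standard error.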
Requirements
- MiniMax Token Plan API key (get one)
- Linux/macOS
- uvx (auto-installed via script)
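A quick preflight check can confirm the requirements above before the first call. This is only a sketch: `preflight` is a hypothetical helper, and it assumes the script needs nothing beyond the API key and `uvx` on the PATH.

```shell
# Return 0 when the environment looks ready, 1 otherwise.
# Assumption: MINIMAX_API_KEY and uvx are the only prerequisites.
preflight() {
    rc=0
    if [ -z "${MINIMAX_API_KEY:-}" ]; then
        echo "MINIMAX_API_KEY is not set" >&2
        rc=1
    fi
    if ! command -v uvx >/dev/null 2>&1; then
        echo "uvx not found on PATH" >&2
        rc=1
    fi
    return "$rc"
}
```

One possible use is `preflight && ./vision.sh photo.jpg`, which skips the call when something is missing.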
Examples
Screenshot analysis:
Input: screenshot.png + "What text is in the image?"
Output: "The screenshot shows a code editor with Python code..."
Photo description:
Input: photo.jpg + "Describe in detail"
Output: "A person's bare foot and lower leg resting on a brown
textured waffle-weave blanket. The skin is light-toned with
visible fine hairs..."
Installation
git clone https://github.com/alexburrstudio/ab-agents-skills.git
cd ab-agents-skills/skills/vision
chmod +x vision.sh
Or via ClawHub:
clawhub install AB-Agents-Vision
Troubleshooting
| Error | Solution |
|---|---|
| API Error: 1033 | Retry; this is a system error on MiniMax's side |
| No response | Check MINIMAX_API_KEY is set correctly |
| Slow | Use smaller images (<10MB) |
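The retry and image-size advice in the table can be automated with two small helpers. This is a sketch under stated assumptions: `vision_retry`, `check_size`, `VISION_CMD`, `VISION_RETRIES` and `VISION_RETRY_DELAY` are hypothetical names, and the retry logic assumes `vision.sh` exits non-zero when the API reports an error such as 1033.

```shell
# Path to the script; override VISION_CMD if it lives elsewhere (hypothetical variable).
VISION_CMD="${VISION_CMD:-./vision.sh}"

# Retry a call up to VISION_RETRIES times, doubling the pause after each failure.
# Assumption: vision.sh exits non-zero on failure.
vision_retry() {
    max="${VISION_RETRIES:-3}"
    delay="${VISION_RETRY_DELAY:-2}"
    attempt=1
    while :; do
        if "$VISION_CMD" "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}

# Succeed only when a local file is under the 10 MB guideline from the table.
check_size() {
    limit=$((10 * 1024 * 1024))
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -lt "$limit" ]
}
```

For example, `check_size photo.jpg && vision_retry photo.jpg "Describe this image"` skips oversized files and retries transient failures. `check_size` applies to local files only, not URLs.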
AB-Agents