Install
openclaw skills install @kawummuwe-stack/vision-analyzerAnalyze images using Ollama Cloud's Kimi K2.5 vision capabilities. Use when user wants to describe, understand, or get information about an image. Works with local image files, screenshots, or downloaded images. Supports JPG, PNG, GIF, WebP formats.
openclaw skills install @kawummuwe-stack/vision-analyzerAnalyze images using Kimi K2.5 multimodal vision capabilities through Ollama Cloud API.
python3 ~/.openclaw/workspace/skills/vision-analyzer/scripts/vision_analyze.py <image_path> [prompt]
Describe an image:
python3 ~/.openclaw/workspace/skills/vision-analyzer/scripts/vision_analyze.py photo.jpg
Ask specific question:
python3 ~/.openclaw/workspace/skills/vision-analyzer/scripts/vision_analyze.py screenshot.png "What UI elements do you see?"
/mnt/chromeos/MyFiles/Downloads//mnt/chromeos/MyFiles/Downloads/~/Set your Ollama API key as environment variable:
export OLLAMA_API_KEY="your-api-key-here"
Get your API key from ollama.com/settings
The skill uses Ollama Cloud API with Kimi K2.5 model.
API key is read from OLLAMA_API_KEY environment variable.
Returns a natural language description of the image content.