Qwen Vision

Data & APIs

Analyze images and videos using Qwen Vision API (Alibaba Cloud DashScope). Supports image understanding, OCR, visual reasoning.

Install

openclaw skills install qwen-vision

Qwen Vision

Analyze images and videos using Alibaba Cloud's Qwen Vision API (通义千问视觉模型).

Usage

Analyze an image:

uv run {baseDir}/scripts/analyze_image.py --image "/path/to/image.jpg" --prompt "请描述这张图片" --api-key sk-xxx

With custom model:

uv run {baseDir}/scripts/analyze_image.py --image "/path/to/image.jpg" --model qwen-vl-max-latest --api-key sk-xxx

API Key

Get your API key from:

  • models.providers.bailian.apiKey in ~/.openclaw/openclaw.json
  • Or skills."qwen-image".apiKey in ~/.openclaw/openclaw.json
  • Or DASHSCOPE_API_KEY environment variable
  • Or https://dashscope.console.aliyun.com/

Models

ModelDescription
qwen-vl-max-latestLatest max model (default)
qwen-vl-plus-latestFaster, cost-effective

Prompt Examples

TaskPrompt
Describe"请详细描述这张图片的内容"
OCR"提取图片中的所有文字"
Count"数一下图中有多少个物体"
Analyze"分析这张图表的数据趋势"
Identify"这是什么地方/物品?"

Notes

  • Supports JPG, PNG, GIF, WebP, BMP formats
  • Images are encoded as base64 and sent via API
  • Response time varies by image size and complexity