Qwen Vision

v0.1.0

Analyze images and videos using Qwen Vision API (Alibaba Cloud DashScope). Supports image understanding, OCR, visual reasoning.

0· 908·3 current·3 all-time

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for perchouli/qwen-vision.

Previewing Install & Setup.
Prompt PreviewInstall & Setup
Install the skill "Qwen Vision" (perchouli/qwen-vision) from ClawHub.
Skill page: https://clawhub.ai/perchouli/qwen-vision
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required binaries: python3
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install qwen-vision

ClawHub CLI

Package manager switcher

npx clawhub@latest install qwen-vision
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Benign
high confidence
Purpose & Capability
Name/description match the actual behavior: the included script base64-encodes an image and posts it to DashScope's Qwen Vision endpoint. The only required binary is python3, which is reasonable.
Instruction Scope
SKILL.md instructs running the shipped script and documents possible locations to obtain the API key (including ~/.openclaw/openclaw.json or DASHSCOPE_API_KEY). The runtime script itself, however, requires an --api-key argument and does not read those config paths or environment variables. The instructions do not direct reading unrelated files or exfiltrating system data.
Install Mechanism
No install spec or remote downloads are present; this is an instruction-only skill with a small included script. Nothing is fetched from external/untrusted URLs during install.
Credentials
The skill legitimately requires a DashScope API key. Registry metadata lists no required env vars, but SKILL.md suggests DASHSCOPE_API_KEY or ~/.openclaw config as sources; the script actually requires an explicit --api-key. This is an implementation/documentation mismatch but the credential requested is proportional to the skill's function.
Persistence & Privilege
Skill is not always-enabled, does not request persistent or elevated system privileges, and does not modify other skills or system-wide configuration.
Assessment
This skill sends your image (base64-encoded) and prompt to Alibaba Cloud DashScope (Qwen Vision) — you'll need a valid API key. Confirm you trust the external service before sending sensitive images. Note a small mismatch: documentation suggests the key can be read from ~/.openclaw/openclaw.json or DASHSCOPE_API_KEY, but the bundled script requires --api-key (it doesn't auto-read those locations). If you want the convenience of env/config lookup, verify or modify the script yourself. Otherwise, review where you store the API key and avoid passing it in shells or logs if you need secrecy. Lastly, since the skill transmits image data externally, ensure this aligns with your privacy/compliance requirements.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

👁️ Clawdis
Binspython3
latestvk978drfajk46fc89ds4bfcjyzd83fdan
908downloads
0stars
1versions
Updated 1mo ago
v0.1.0
MIT-0

Qwen Vision

Analyze images and videos using Alibaba Cloud's Qwen Vision API (通义千问视觉模型).

Usage

Analyze an image:

uv run {baseDir}/scripts/analyze_image.py --image "/path/to/image.jpg" --prompt "请描述这张图片" --api-key sk-xxx

With custom model:

uv run {baseDir}/scripts/analyze_image.py --image "/path/to/image.jpg" --model qwen-vl-max-latest --api-key sk-xxx

API Key

Get your API key from:

  • models.providers.bailian.apiKey in ~/.openclaw/openclaw.json
  • Or skills."qwen-image".apiKey in ~/.openclaw/openclaw.json
  • Or DASHSCOPE_API_KEY environment variable
  • Or https://dashscope.console.aliyun.com/

Models

ModelDescription
qwen-vl-max-latestLatest max model (default)
qwen-vl-plus-latestFaster, cost-effective

Prompt Examples

TaskPrompt
Describe"请详细描述这张图片的内容"
OCR"提取图片中的所有文字"
Count"数一下图中有多少个物体"
Analyze"分析这张图表的数据趋势"
Identify"这是什么地方/物品?"

Notes

  • Supports JPG, PNG, GIF, WebP, BMP formats
  • Images are encoded as base64 and sent via API
  • Response time varies by image size and complexity

Comments

Loading comments...