Install
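openclaw skills install visual-understanding
An integration plugin for the Zhipu GLM-4.6V multimodal vision model. It supports parsing local images (Base64) and reading public URLs. Access through the zai SDK is preferred, with a native cURL fallback included.
This Skill gives developers access to the Zhipu GLM-4.6V vision model, supporting precise image description, multi-image comparison, and information extraction.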
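The API key is read from the ZHIPUAI_API_KEY environment variable.
Use case: a Python environment is already installed and you need to process local images (uploaded as Base64). This route is the most stable and supports higher-level application wrappers.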
pip install zai-sdk
import os
import base64
from zai import ZhipuAiClient

# Security practice: read credentials from an environment variable
client = ZhipuAiClient(api_key=os.environ.get("ZHIPUAI_API_KEY"))


def encode_image(image_path):
    """Encode a local image as a Base64 string."""
    with open(image_path, 'rb') as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# ==========================================
# Scenario A: use a public image URL
# ==========================================
response_url = client.chat.completions.create(
    model="glm-4.6v",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image? Please describe it in detail."},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
    }]
)
print("URL result:", response_url.choices[0].message.content)

# ==========================================
# Scenario B: use a local image (Base64)
# ==========================================
local_image_path = 'path/to/your/image.jpg'
if os.path.exists(local_image_path):
    base64_image = encode_image(local_image_path)
    response_base64 = client.chat.completions.create(
        model="glm-4.6v",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Analyze the content of this image."},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
            ]
        }]
    )
    print("Local image result:", response_base64.choices[0].message.content)
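The local-image example above hardcodes `image/jpeg` in the data URL. The following is a minimal sketch, assuming the API accepts other image MIME types in the same `data:` form (this document does not confirm that), which derives the type from the file extension using only the standard library. `to_data_url` is a hypothetical helper name, not part of the zai SDK.

```python
import base64
import mimetypes


def to_data_url(image_path: str, default_mime: str = "image/jpeg") -> str:
    """Return a data: URL for a local image (hypothetical helper, not part of zai)."""
    # Guess the MIME type from the file extension; fall back to the default
    # when the extension is unknown or does not map to an image type.
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        mime = default_mime
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

The returned string could be passed as the `url` value of the `image_url` item in place of the f-string used in Scenario B.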
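Use case: restricted environments (such as CI/CD pipelines or lightweight containers) where the zai SDK cannot be installed.
Run the following in a terminal; the shell substitutes the already-configured $ZHIPUAI_API_KEY environment variable automatically: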
curl --request POST \
  --url https://open.bigmodel.cn/api/paas/v4/chat/completions \
  --header "Authorization: Bearer $ZHIPUAI_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "glm-4.6v",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://cdn.bigmodel.cn/static/logo/register.png"
            }
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://cdn.bigmodel.cn/static/logo/api-key.png"
            }
          },
          {
            "type": "text",
            "text": "What are these pictures about?"
          }
        ]
      }
    ]
  }'
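Hand-written inline JSON like the request body above is easy to break when the prompt contains quotes or when the number of images varies. The following is a small sketch, assuming a Python interpreter is available even where the SDK is not, that generates a body with the same shape as the cURL example. `build_vision_payload` is a hypothetical name introduced for illustration.

```python
import json


def build_vision_payload(prompt, image_urls, model="glm-4.6v"):
    """Return a JSON request body shaped like the cURL example (hypothetical helper)."""
    # One image_url item per image, followed by a single text item for the prompt.
    content = [{"type": "image_url", "image_url": {"url": url}} for url in image_urls]
    content.append({"type": "text", "text": prompt})
    body = {"model": model, "messages": [{"role": "user", "content": content}]}
    # ensure_ascii=False keeps non-ASCII prompts readable in the output.
    return json.dumps(body, ensure_ascii=False, indent=2)
```

The resulting string can be written to a file such as `payload.json` and sent with curl's `--data @payload.json` form in place of the inline body.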
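| Request method | Advantages | Limitations |
|---|---|---|
| zai SDK | Supports local images; easy to integrate with RAG or Agent workflows | Requires a Python environment and permission to run pip install |
| cURL | Zero dependencies, available everywhere, well suited to automated shell scripts | Reads only publicly hosted images; local images must be relayed through an image host you set up yourself |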
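The trade-off in the table can be sketched as a simple runtime check that prefers the SDK and falls back to cURL. This is only an illustration under stated assumptions: `choose_backend` is a hypothetical helper, and it assumes the SDK is importable as `zai` (as in the import shown earlier) and that cURL is reachable on `PATH` as `curl`.

```python
import importlib.util
import os
import shutil


def choose_backend() -> str:
    """Return "sdk" or "curl", or raise if neither route is usable (hypothetical helper)."""
    # Both routes need the API key, so check it first.
    if not os.environ.get("ZHIPUAI_API_KEY"):
        raise RuntimeError("ZHIPUAI_API_KEY is not set")
    # Prefer the SDK when the `zai` module can be imported.
    if importlib.util.find_spec("zai") is not None:
        return "sdk"
    # Otherwise fall back to the curl binary if it is on PATH.
    if shutil.which("curl") is not None:
        return "curl"
    raise RuntimeError("neither the zai SDK nor curl is available")
```

A caller would branch on the returned string, using the SDK path for local images and the cURL path for public URLs.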