Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

glm-v-model

v1.0.1

Skill for invoking Zhipu GLM-4V/4.6V vision models. Used for image/video understanding, multimodal dialogue, chart analysis, and similar tasks. Use this skill when the user mentions: image understanding, image recognition, vision model, GLM-4V, GLM-4.6V, multimodal analysis, image captioning, chart analysis, or video understanding.

Security Scan
VirusTotal
Benign
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name, description, SKILL.md examples, and the Python helper all target calling Zhipu GLM-4V/4.6V vision models (image/video understanding). Requiring an API key to call an external model provider is expected. However, the registry metadata declares no required environment variables, while both SKILL.md and the script state that an API key (ZHIPU_API_KEY) is needed. This mismatch between declared requirements and actual use should be resolved.
Instruction Scope
Instructions direct the agent to read local image files or URLs and send them to the GLM service, which matches the stated functionality. Concerns: (1) SKILL.md contains an example that appends an absolute, user-specific path (/Users/guobaokui/...) to sys.path, which is unsafe, non-portable, and unnecessary. (2) The provided script's expected input is ambiguous: it expects objects with .read() for local images, but the SKILL.md example calls glm_v(['image.jpg'], ...) with a filename string, which will break. (3) The skill transmits image data to a third-party API (Zhipu); that is expected for its purpose, but privacy-sensitive.
Install Mechanism
No install spec is included (instruction-only plus a helper script). Comments suggest installing the 'zai-sdk' via pip — a normal, low-risk package manager step. No downloads from arbitrary URLs or extract steps are present.
Credentials
The code reads ZHIPU_API_KEY from the environment to authenticate to the external service, which is proportionate to the skill's purpose. The concern is that the skill's registry metadata does not list this required environment variable (or any primary credential). The missing declaration is misleading and could cause users to overlook the need to provide credentials and to recognize that data will be sent to a third party.
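A minimal sketch of how the credential lookup described above might fail fast with a clear message instead of sending an unauthenticated request; the function name is hypothetical, not taken from the skill's script:

```python
import os

def load_zhipu_api_key() -> str:
    """Read ZHIPU_API_KEY from the environment (hypothetical helper).

    Raises a descriptive error when the key is missing, so users notice
    the undeclared requirement before any data leaves the machine.
    """
    key = os.environ.get("ZHIPU_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ZHIPU_API_KEY is not set; obtain a key from "
            "https://open.bigmodel.cn and export it before using this skill."
        )
    return key
```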
Persistence & Privilege
The skill is not marked always:true, does not request system-wide config paths, and does not modify other skills. It runs as an invoked skill and requires an external API key — no excessive persistence or elevated privileges are requested.
What to consider before installing
This skill appears to do what it claims (call Zhipu GLM vision models), but check the following before installing or using it:

- Expect the skill to need an API key (ZHIPU_API_KEY) even though the registry metadata doesn't list it. Provide the key only if you trust the Zhipu/bigmodel.cn service and understand their data handling.
- Images (and possibly video) will be transmitted to a third-party API. Do not send sensitive or private images unless you are comfortable with that provider's privacy/retention policy.
- The included helper script and examples contain issues: a hardcoded user path in an example, and a likely bug where the script calls img.read() but SKILL.md suggests passing filenames. Treat the script as untrusted code and inspect/modify it before running.
- The SDK (zai-sdk) is installed via pip per the comments. Review the package source/version (e.g., on PyPI or the vendor site) before installing to ensure it's legitimate.

Recommended actions: ask the publisher to update the registry metadata to list ZHIPU_API_KEY (and any other required env vars), remove or fix hardcoded paths in examples, and correct the script's file-handling behavior. If you cannot verify the publisher/SDK, avoid sending private images, or run the code in an isolated environment.


latest: vk97aymhxggbtsp73k6n58ghzrn82mzvj
513 downloads · 1 star · 2 versions
Updated 7h ago
v1.0.1 · MIT-0

GLM Vision Model Invocation

This skill provides the ability to call Zhipu AI's GLM-4V and GLM-4.6V vision models, supporting image understanding, video analysis, chart interpretation, and related functions.

Supported Models

Model | Description | Features
glm-4v | GLM-4 vision model | Basic visual understanding
glm-4.6v | GLM-4.6V vision model | Stronger visual understanding; supports longer context

Quick Start

Basic Image Understanding

from zai import ZhipuAiClient
import base64

client = ZhipuAiClient(api_key="YOUR_API_KEY")

# Read the local image and encode it as base64
with open("image.jpg", "rb") as f:
    img_base = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="glm-4.6v",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img_base}"}},
            {"type": "text", "text": "Describe this image"}
        ]
    }],
    thinking={"type": "enabled"}
)
print(response.choices[0].message.content)

Using an Image URL

response = client.chat.completions.create(
    model="glm-4.6v",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
            {"type": "text", "text": "What is in this image?"}
        ]
    }]
)

Multi-Image Understanding

response = client.chat.completions.create(
    model="glm-4.6v",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "image 1 base64 or URL"}},
            {"type": "image_url", "image_url": {"url": "image 2 base64 or URL"}},
            {"type": "text", "text": "Compare the similarities and differences between these two images"}
        ]
    }]
)
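The two-image pattern above generalizes to any number of images by building the content list programmatically. A minimal sketch that only constructs the request body (build_image_messages is a hypothetical helper, assuming the text part uses a "text" field):

```python
def build_image_messages(image_urls, prompt):
    """Build a GLM vision messages payload from a list of image URLs
    (or data: URLs) plus a text prompt. Hypothetical helper: it only
    assembles the request body and does not call the API."""
    content = [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]
```

The resulting list can be passed as the messages argument in the examples above.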

Video Understanding (GLM-4.6V)

# Supports understanding video content
response = client.chat.completions.create(
    model="glm-4.6v",
    messages=[{
        "role": "user",
        "content": [
            {"type": "video_url", "video_url": {"url": "video URL"}},
            {"type": "text", "text": "Describe the content of this video"}
        ]
    }]
)

Using the Script

The project already includes the script script/infer_glmv.py, which can be called directly:

import sys
sys.path.append('/Users/guobaokui/.openclaw/workspace_multmodal/skills/glm-v-model/script')
from infer_glmv import glm_v

# Usage
# glm_v(['image.jpg'], 'Describe the image', 'glm-4.6v')
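The filename-versus-file-object ambiguity flagged in the security scan could be avoided by normalizing inputs before they reach glm_v. A sketch under that assumption: to_data_url is a hypothetical name, and the JPEG MIME type is a simplification (a robust version would detect the format):

```python
import base64

def to_data_url(image):
    """Normalize a filename, raw bytes, or file-like object into a
    base64 data URL; http(s) and data: URLs pass through unchanged.
    Hypothetical helper, not part of infer_glmv.py."""
    if isinstance(image, str):
        if image.startswith(("http://", "https://", "data:")):
            return image
        with open(image, "rb") as f:   # filename string
            raw = f.read()
    elif isinstance(image, (bytes, bytearray)):
        raw = bytes(image)
    else:
        raw = image.read()             # file-like object with .read()
    b64 = base64.b64encode(raw).decode("utf-8")
    return f"data:image/jpeg;base64,{b64}"
```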

Common Scenarios

Scenario | Example prompt
Image description | "Describe the content of this image in detail"
Chart analysis | "Analyze the data in this chart"
Text recognition (OCR) | "Extract the text from this image"
Object recognition | "What objects are in this image"
Scene understanding | "What place is this"
Multi-image comparison | "Compare the similarities and differences between these two images"
Video understanding | "Summarize the content of this video"

Notes

  1. API Key: requires a Zhipu AI API Key, available from https://open.bigmodel.cn
  2. Image formats: common formats such as JPEG, PNG, and WebP are supported
  3. Image size: a single image should not exceed 10MB
  4. thinking: deep-thinking mode can be enabled with thinking={"type": "enabled"}
  5. Billing: billed per token; images are converted into token consumption
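The format and size notes above can be enforced locally before any data is uploaded. A minimal standard-library sketch; check_image and the constants are illustrative names, not part of the skill:

```python
import os

MAX_IMAGE_BYTES = 10 * 1024 * 1024          # 10MB, per the size note above
ALLOWED_EXTS = {".jpg", ".jpeg", ".png", ".webp"}

def check_image(path):
    """Validate extension and on-disk size before encoding/uploading.
    Returns the file size in bytes, or raises ValueError."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTS:
        raise ValueError(f"unsupported image format: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_IMAGE_BYTES:
        raise ValueError(f"image is {size} bytes, over the 10MB limit")
    return size
```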
