Back to skill
Skillv1.0.0
ClawScan security
Vision Helper — AI Image Analysis · ClawHub's context-aware review of the artifact, metadata, and declared behavior.
Scanner verdict
BenignApr 28, 2026, 10:40 AM
- Verdict
- benign
- Confidence
- high
- Model
- gpt-5-mini
- Summary
- The skill does what it says — it reads an image file, encodes it, and sends it to an Ollama endpoint for vision analysis — and its code, instructions, and purpose are internally consistent, though it carries the expected privacy risks of screenshotting and sending images to a network endpoint.
- Guidance
- This skill appears to be what it claims: a helper that reads an image file and sends it to an Ollama instance for analysis. Before installing or using it, consider the following: - Privacy: The script will read any readable file with an allowed extension and base64-encode it. If you take desktop/browser screenshots you may capture passwords, private chats, or other sensitive data. - Endpoint trust: By default the script posts to http://localhost:11434/api/chat. If you change OLLAMA_API_URL to a remote URL, those images (and any textual prompt) will be transmitted to that remote service. Only point it to endpoints you trust. - File validation is extension-based and the path-traversal check is simplistic ('..' substring). Don't feed files you don't trust; avoid symlink/renamed files containing sensitive content. - Automation caution: The README suggests using model output to drive clicks or inputs; make sure any automation steps are safe and tested before running with real privileges or on critical systems. Practical steps: run a local Ollama instance and keep OLLAMA_API_URL at its default if you want privacy; inspect or run the included script in a sandbox first; avoid passing images containing secrets; and do not set OLLAMA_API_URL to an external service unless you control or trust it.
Review Dimensions
- Purpose & Capability
- okName/description match the implementation: the included Python script encodes an image and calls an Ollama chat API with a vision model. The script supports model selection and extended timeout as advertised.
- Instruction Scope
- noteSKILL.md explicitly instructs using exec to take and analyze screenshots (browser, desktop tools) and to 'act' on analysis results (clicks/input). That is within the skill's stated automation use-cases, but it carries privacy and automation-safety implications (desktop screenshots may contain sensitive data; automated actions driven by model output can have undesired effects).
- Install Mechanism
- okInstruction-only skill with no install spec; included script is plain Python and there are no downloads or external installers. This is a low-risk install surface.
- Credentials
- noteThe registry metadata lists no required env vars, but SKILL.md and the script use optional env vars (OLLAMA_API_URL, VISION_MODEL, VISION_TIMEOUT). Defaults point to localhost, which is reasonable, but changing OLLAMA_API_URL to a remote endpoint would send base64-encoded images off-host. The env usage is proportionate to functionality but carries obvious exfiltration/privacy risks if pointed at an untrusted service. Also, the script enforces allowed extensions by filename only (and a simple '..' check), which could be abused if non-image data is disguised with an allowed extension.
- Persistence & Privilege
- okalways is false and the skill does not request ongoing system presence or modify other skills. It runs on-demand via exec and does not request elevated privileges.
