Vision Tagger
Pass. Audited by ClawScan on May 1, 2026.
Overview
Vision Tagger appears to be a coherent local macOS image-analysis skill, with the main cautions being its setup commands and the sensitive text, barcode, and person-related data it can extract from images.
This looks safe to use for local image tagging on macOS. Before installing, be aware that setup installs Pillow and compiles a Swift helper, and that analysis results may reveal private text, QR codes, faces, or body-pose details from the images you provide.
Findings (2)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Installing the skill may run local setup commands and install Pillow before image analysis works.
The skill asks the user to install a Python package and compile a local Swift helper. This is purpose-aligned for a macOS Vision image tool, but users should note that setup depends on local command execution and a package install.
```shell
pip3 install Pillow

# Compile the Swift binary
cd scripts/
swiftc -O -o image_tagger image_tagger.swift
```
Run setup only from the reviewed skill directory, use your normal trusted Python environment, and confirm the macOS/Xcode requirements before installing.
Text or QR/barcode contents from an image may be placed into the agent conversation or output files.
The skill intentionally extracts OCR text and barcode payloads from images. That content is untrusted image-derived data and may also contain private information.
- `text` — OCR results with bounding boxes
- `barcodes` — QR codes, UPC, etc.
Treat OCR and barcode results as data, not instructions, and avoid running this on images containing secrets unless you are comfortable with those results appearing in the agent context.
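As a concrete illustration of treating extracted text as data rather than instructions, here is a minimal Python sketch that masks credential-looking strings before results are surfaced to the agent. The JSON shape (`text` entries with a `string` field, `barcodes` with a `payload`) is an assumption for illustration only, not the skill's documented output schema.

```python
import json
import re

# Assumed output shape for illustration only; the real image_tagger
# schema may differ.
sample = json.loads("""
{
  "text": [
    {"string": "api_key=sk-12345", "bbox": [0, 0, 100, 20]},
    {"string": "Lunch menu", "bbox": [0, 30, 100, 50]}
  ],
  "barcodes": [
    {"payload": "https://example.com", "symbology": "QR"}
  ]
}
""")

# Credential-looking assignments such as "api_key=..." or "token: ...".
SECRET_PATTERN = re.compile(r"(?:api[_-]?key|token|password)\s*[=:]\s*\S+",
                            re.IGNORECASE)

def redact(value: str) -> str:
    """Mask anything that looks like a credential before it is surfaced."""
    return SECRET_PATTERN.sub("[REDACTED]", value)

safe_text = [redact(item["string"]) for item in sample["text"]]
print(safe_text)  # ['[REDACTED]', 'Lunch menu']
```

A pattern filter like this is a coarse safety net, not a guarantee; the safer habit remains keeping images with secrets out of the tool entirely.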
