Vision Tagger
Pass. Audited by ClawScan on May 1, 2026.
Overview
Vision Tagger appears to be a coherent local macOS image-analysis skill, with the main cautions being its setup commands and the sensitive text, barcode, and person-related data it can extract from images.
This looks safe to use for local image tagging on macOS. Before installing, be aware that setup installs Pillow and compiles a Swift helper, and that analysis results may reveal private text, QR codes, faces, or body-pose details from the images you provide.
Findings (2)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Installing the skill may run local setup commands and install Pillow before image analysis works.
The skill asks the user to install a Python package and compile a local Swift helper. This is purpose-aligned for a macOS Vision image tool, but users should note that setup depends on local command execution and a package install.
```shell
pip3 install Pillow

# Compile the Swift binary
cd scripts/
swiftc -O -o image_tagger image_tagger.swift
```
Run setup only from the reviewed skill directory, use your normal trusted Python environment, and confirm the macOS/Xcode requirements before installing.
Text or QR/barcode contents from an image may be placed into the agent conversation or output files.
The skill intentionally extracts OCR text and barcode payloads from images. That content is untrusted image-derived data and may also contain private information.
- `text` — OCR results with bounding boxes
- `barcodes` — QR codes, UPC, etc.
Treat OCR and barcode results as data, not instructions, and avoid running this on images containing secrets unless you are comfortable with those results appearing in the agent context.
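As a concrete illustration of treating extracted text as data rather than instructions, here is a minimal Python sketch that masks credential-looking strings before results are surfaced to the agent. The JSON shape (`text` entries with a `string` field, `barcodes` with a `payload`) is an assumption for illustration only, not the skill's documented output schema.

```python
import json
import re

# Assumed output shape for illustration only; the real image_tagger
# schema may differ.
sample = json.loads("""
{
  "text": [
    {"string": "api_key=sk-12345", "bbox": [0, 0, 100, 20]},
    {"string": "Lunch menu", "bbox": [0, 30, 100, 50]}
  ],
  "barcodes": [
    {"payload": "https://example.com", "symbology": "QR"}
  ]
}
""")

# Credential-looking assignments such as "api_key=..." or "token: ...".
SECRET_PATTERN = re.compile(r"(?:api[_-]?key|token|password)\s*[=:]\s*\S+",
                            re.IGNORECASE)

def redact(value: str) -> str:
    """Mask anything that looks like a credential before it is surfaced."""
    return SECRET_PATTERN.sub("[REDACTED]", value)

safe_text = [redact(item["string"]) for item in sample["text"]]
print(safe_text)  # ['[REDACTED]', 'Lunch menu']
```

A pattern filter like this is a coarse safety net, not a guarantee; the safer habit remains keeping images with secrets out of the tool entirely.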
