Install
openclaw skills install ai-photos
Personal AI photo album for OpenClaw. Use when users say:
- "index my photos"
- "set up an AI photo album"
- "search my photo library"
- "reconnect my photo album"
- "find photos of ..."
ai-photos turns one or more local photo sources into a searchable AI photo album for OpenClaw.
Supported formats:
- Fully supported: jpg, jpeg, png, webp
- heic: best-effort only; do not promise captioning or preview support

When talking to users:
Suggested user-facing capability summary:
This task is not complete until all of the following are true:
- the automatic indexing block has been added to HEARTBEAT.md, and one verification heartbeat has run

Use these terms for agent reasoning, troubleshooting, or recovery only. Do not introduce them to the user unless needed.
- photo sources: one or more local paths scanned into the same album
- album backend: where the searchable photo index is stored
- album profile: saved reconnect information, stored automatically under ~/.openclaw/ai-photos/albums/default.json
- caption input JSONL: the manifest file that still needs vision captions and import

If the user asks what to save for later, explain that OpenClaw saves the reconnect information automatically at ~/.openclaw/ai-photos/albums/default.json, and that they only need to keep that file if they want a manual backup.
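The new-album vs. reconnect decision can be grounded in a simple file check. A minimal sketch, assuming only the default profile path described above:

```shell
# Default album profile location (from the terminology above).
profile="$HOME/.openclaw/ai-photos/albums/default.json"

if [ -f "$profile" ]; then
  echo "album profile found: reconnect flow applies"
else
  echo "no album profile yet: treat this as a new album"
fi
```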
Each captioned JSONL line should contain the original manifest fields plus vision-model output.
Required base fields:
- file_path
- filename
- sha256
- mime_type
- size_bytes
- width
- height
- taken_at
- exif

Vision fields:
- caption: one short factual sentence
- tags: array of 5-12 short tags
- scene: short scene label
- objects: array of the main visible objects
- text_in_image: visible text or null

Optional fields:
- metadata: free-form JSON object
- search_text: concatenated retrieval text; if omitted, the importer builds it

Example:
{
"file_path": "/photos/2026/03/cat.jpg",
"filename": "cat.jpg",
"sha256": "abc123",
"mime_type": "image/jpeg",
"size_bytes": 231231,
"width": 3024,
"height": 4032,
"taken_at": "2026-03-12T09:12:00+00:00",
"exif": {"Make": "Apple", "Model": "iPhone 15 Pro"},
"caption": "A white cat resting on a gray sofa near a sunlit window.",
"tags": ["cat", "sofa", "indoor", "sunlight", "pet"],
"scene": "living room",
"objects": ["cat", "sofa", "window"],
"text_in_image": null,
"metadata": {"source": "demo"}
}
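When search_text is omitted, the importer builds it from the other fields. The concatenation can be approximated like this, using the example record above; this is an illustrative sketch of the idea, not the importer's exact logic:

```shell
# Fields taken from the example record above.
caption="A white cat resting on a gray sofa near a sunlit window."
tags="cat sofa indoor sunlight pet"
scene="living room"
objects="cat sofa window"

# Concatenate everything searchable into one retrieval string.
search_text="$caption $tags $scene $objects"
printf '%s\n' "$search_text"
```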
This skill does not depend on a local Python environment or a checked-out Go source tree.
It uses the latest published ai-photos CLI release from:
https://github.com/zoubingwu/openclaw-ai-photos
- Install directory: ~/.openclaw/ai-photos/bin
- Binary path: ~/.openclaw/ai-photos/bin/ai-photos

At the start of every ai-photos task, run the bootstrap flow exactly once and reuse the resulting binary path for the rest of the task.
Run this shell block and capture its stdout as AI_PHOTOS_BIN:
ensure_ai_photos() {
  AI_PHOTOS_REPO="zoubingwu/openclaw-ai-photos"
  AI_PHOTOS_BIN_DIR="$HOME/.openclaw/ai-photos/bin"
  AI_PHOTOS_BIN="$AI_PHOTOS_BIN_DIR/ai-photos"
  mkdir -p "$AI_PHOTOS_BIN_DIR"

  os="$(uname -s | tr '[:upper:]' '[:lower:]')"
  case "$os" in
    darwin) goos="darwin" ;;
    linux) goos="linux" ;;
    *)
      echo "unsupported platform: $os" >&2
      return 1
      ;;
  esac

  arch="$(uname -m)"
  case "$arch" in
    x86_64|amd64) goarch="amd64" ;;
    arm64|aarch64) goarch="arm64" ;;
    *)
      echo "unsupported architecture: $arch" >&2
      return 1
      ;;
  esac

  archive_name="ai-photos_${goos}_${goarch}.tar.gz"
  archive_url="https://github.com/${AI_PHOTOS_REPO}/releases/latest/download/${archive_name}"
  tmp_dir="$(mktemp -d)"

  # Remember whether a previous install exists so a failed download can fall back to it.
  had_existing_binary=0
  if [ -x "$AI_PHOTOS_BIN" ]; then
    had_existing_binary=1
  fi

  if curl -fL "${archive_url}" -o "$tmp_dir/${archive_name}" \
    && tar -xzf "$tmp_dir/${archive_name}" -C "$tmp_dir" \
    && install -m 0755 "$tmp_dir/ai-photos" "$AI_PHOTOS_BIN"; then
    rm -rf "$tmp_dir"
    printf '%s\n' "$AI_PHOTOS_BIN"
    return 0
  fi

  # Download failed: reuse the existing binary if there is one.
  rm -rf "$tmp_dir"
  if [ "$had_existing_binary" -eq 1 ]; then
    printf '%s\n' "$AI_PHOTOS_BIN"
    return 0
  fi

  echo "could not download ai-photos release archive" >&2
  return 1
}
AI_PHOTOS_BIN="$(ensure_ai_photos)"
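Since every later command reuses the captured path, it is worth failing fast if the capture produced nothing. A small sketch; the fallback default here is an assumption that simply mirrors the install path from the bootstrap above:

```shell
# Fall back to the known install path if the variable is unset
# (the real value normally comes from ensure_ai_photos above).
AI_PHOTOS_BIN="${AI_PHOTOS_BIN:-$HOME/.openclaw/ai-photos/bin/ai-photos}"

if [ -z "$AI_PHOTOS_BIN" ]; then
  echo "bootstrap failed: no ai-photos binary path captured" >&2
  exit 1
fi
echo "reusing binary path: $AI_PHOTOS_BIN"
```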
Rules:
- The bootstrap downloads only from releases/latest/download/... and does not call api.github.com
- For usage details, run "$AI_PHOTOS_BIN" help or "$AI_PHOTOS_BIN" help <subcommand>

User-facing:
[AGENT] Branching:
- 1: continue to Step 1
- 2: continue to Step 3 and Step 4
- 3: go directly to the Search flow

User-facing:
[AGENT]
Do not continue until the user has provided at least one photo source.
User-facing:
[AGENT]
Before indexing anything, verify:
- agents.defaults.imageModel is vision-capable
- "$AI_PHOTOS_BIN" prepare-image works on a sample file

Suggested preflight sequence:
"$AI_PHOTOS_BIN" prepare-image --mode caption <sample-file>
"$AI_PHOTOS_BIN" prepare-image --mode preview <sample-file>

If the image backend check fails:
- heic and local preview preparation depend on sips
- For jpg, jpeg, png, or webp, OpenClaw can still caption those files directly from the original path

If preflight fails:
[AGENT]
- Prefer db9 if it is installed and usable
- If db9 is not available, use TiDB Cloud Zero
- When using TiDB Cloud Zero, tell the user to claim it if they want to keep it, but do not lead with backend details unless they matter

User-facing for a new album:
[AGENT]
For a new album, run exactly one setup command:
# db9
"$AI_PHOTOS_BIN" setup --source <photo-source-a> --source <photo-source-b> --backend db9 --target <db>
# TiDB
"$AI_PHOTOS_BIN" setup --source <photo-source-a> --source <photo-source-b> --backend tidb --target /path/to/tidb-target.json
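The backend preference (db9 if usable, otherwise TiDB Cloud Zero) can be sketched as a simple availability probe. The probe is illustrative; it assumes db9 is detectable as a `db9` command on PATH, and the two values match the `--backend` flags in the setup commands above:

```shell
# Prefer db9 when its CLI is on PATH; otherwise fall back to TiDB Cloud Zero.
if command -v db9 >/dev/null 2>&1; then
  backend="db9"
else
  backend="tidb"
fi
echo "selected backend: $backend"
```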
Read the JSON output:
- profile_path tells you where the default album profile was saved
- caption_input_jsonl is the input for the first record ingestion pass
- sync.to_caption tells you how many records still need captions and import

[AGENT] For reconnect:
Suggested reconnect check:
"$AI_PHOTOS_BIN" search --recent --limit 1
Do not continue until the backend is confirmed reachable.
Use this same flow for:
User-facing:
[AGENT]
Input:
- caption_input_jsonl from ai-photos setup
- incremental_manifest_jsonl from ai-photos sync

Before generating records, read the Caption schema section in this file.
[AGENT] For each record in the input manifest:
- Run "$AI_PHOTOS_BIN" prepare-image --mode caption <file_path>
- Send the resulting output_path to the vision-capable model
- Merge the model's caption, tags, scene, objects, and text_in_image into the record
- After all records are captioned, run "$AI_PHOTOS_BIN" import /tmp/photos.captioned.jsonl
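The per-record steps above can be sketched as a JSONL loop. This stand-in copies records through unchanged instead of calling prepare-image and the vision model, so the temporary files and the pass-through step are purely illustrative:

```shell
# Stand-in manifest with one record, so the loop is runnable on its own.
manifest="$(mktemp)"
printf '%s\n' '{"file_path":"/photos/2026/03/cat.jpg"}' > "$manifest"

captioned="$(mktemp)"
while IFS= read -r record; do
  # Real flow: prepare-image --mode caption, caption via the vision model,
  # then merge caption/tags/scene/objects/text_in_image into the record.
  # Here the record is copied through to keep the sketch self-contained.
  printf '%s\n' "$record" >> "$captioned"
done < "$manifest"
```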
Rules:
- prepare-image prefers macOS sips when available and also supports ImageMagick for Linux-friendly setups
- If prepare-image returns the original file path in caption mode, continue with that file instead of blocking the batch
- OpenClaw can caption jpg, jpeg, png, and webp directly when no local image backend is available
- Do not promise heic captioning or preview support
- Do not replace file_path with the temporary derived image path

User-facing:
[AGENT]
If the user declines:
- Leave HEARTBEAT.md unchanged

If the user says yes:
Then update <workspace>/HEARTBEAT.md without removing unrelated content:
<!-- ai-photos:auto-indexing:start -->
## ai-photos automatic indexing
- Read and learn how to use `ai-photos` skill
- Use `~/.openclaw/ai-photos/bin/ai-photos sync` to scan the configured photo folders for changes.
- Check the configured photo folders for changes and keep the album index up to date.
- If `to_caption` is `0`, it means nothing needs attention, reply `HEARTBEAT_OK`.
- If `to_caption` is greater than `0`, run the shared record ingestion flow using `incremental_manifest_jsonl`.
- Stay quiet unless indexing failed or user action is needed.
<!-- ai-photos:auto-indexing:end -->
Do not rewrite the whole file just to add this block.
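One way to honor that rule is to key the update on the marker comments, appending the managed block only when it is absent. A sketch using a temporary file in place of the real <workspace>/HEARTBEAT.md, with the block body abbreviated:

```shell
hb="$(mktemp)"  # stands in for <workspace>/HEARTBEAT.md
start_marker='<!-- ai-photos:auto-indexing:start -->'

add_block() {
  # Append the managed block only if its start marker is not already
  # present, leaving all unrelated content untouched.
  if ! grep -qF "$start_marker" "$hb"; then
    {
      printf '%s\n' "$start_marker"
      printf '%s\n' '## ai-photos automatic indexing'
      printf '%s\n' '<!-- ai-photos:auto-indexing:end -->'
    } >> "$hb"
  fi
}

add_block
add_block  # second run is a no-op, so the block is never duplicated
grep -cF "$start_marker" "$hb"  # prints 1
```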
Then verify once:
Then tell the user the result:
User-facing handoff should include:
Keep the handoff short and user-facing. Default to readiness, status, and next actions. Only include implementation details when the user asks or recovery requires them.
[AGENT]
Immediately after setup:
When the user asks to find photos, run:
"$AI_PHOTOS_BIN" search --text "cat on sofa"
"$AI_PHOTOS_BIN" search --tag cat
"$AI_PHOTOS_BIN" search --date 2026-03
"$AI_PHOTOS_BIN" search --recent
When answering:
- Run "$AI_PHOTOS_BIN" prepare-image --mode preview <matched-file>
- Show the resulting output_path when possible

When the user asks to open a browser view for the album:
If the user wants to open the gallery from another device:
- Use "$AI_PHOTOS_BIN" serve --host 0.0.0.0 only when they explicitly want remote access
- For remote access, share the machine's Tailscale IP or MagicDNS name instead
- The gallery is served from the machine running ai-photos, not on the remote client

Run:
"$AI_PHOTOS_BIN" serve
If the user wants a specific album profile:
"$AI_PHOTOS_BIN" serve --profile default
The web service provides:
- ai-photos

When handing the web UI to the user:
When a heartbeat arrives for a configured album:
"$AI_PHOTOS_BIN" sync
- If to_caption is 0, return HEARTBEAT_OK
- If to_caption is greater than 0, run the shared record ingestion flow using incremental_manifest_jsonl
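The branch on to_caption can be sketched against the sync output. The sample JSON shape and the sed extraction are illustrative stand-ins; real output comes from "$AI_PHOTOS_BIN" sync, and a real flow could use a proper JSON parser instead:

```shell
# Stand-in for the JSON printed by "$AI_PHOTOS_BIN" sync (shape assumed).
sync_json='{"sync":{"to_caption":0,"incremental_manifest_jsonl":"/tmp/incr.jsonl"}}'

# Pull out the to_caption count with plain sed.
to_caption="$(printf '%s' "$sync_json" \
  | sed -n 's/.*"to_caption":[[:space:]]*\([0-9][0-9]*\).*/\1/p')"

if [ "$to_caption" -eq 0 ]; then
  echo "HEARTBEAT_OK"
else
  echo "run the shared record ingestion flow with incremental_manifest_jsonl"
fi
```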