Natural Language Video Search

Semantic search over video files using Gemini embeddings. Index dashcam, security camera, or any mp4 footage, then search with natural language queries to fi...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
by Soham Rajadhyaksha (@ssrajadh)
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name and description (semantic search over video using Gemini embeddings) match the declared requirements: GEMINI_API_KEY and binaries ffmpeg and python3 are appropriate and expected for chunking/processing video and calling the embedding API.
Instruction Scope
SKILL.md instructs the agent to clone https://github.com/ssrajadh/sentrysearch and run pip install -e ., then to run commands that read arbitrary video files/directories and upload video chunks to Gemini. Reading user video files is consistent with purpose, but uploading private footage to an external API is a significant privacy consideration and should be explicitly consented to by the user before indexing.
Install Mechanism
There is no registry install spec, but the instructions explicitly request git clone from a GitHub repository and pip install -e ., which is a common and reasonable install path. The repo is hosted on github.com (a known host), lowering but not eliminating supply-chain risk — users should inspect or trust the repo before running pip install.
Credentials
Only GEMINI_API_KEY is required and is declared as the primary credential. This is proportional to calling the Gemini embedding API; no unrelated secrets or excessive env vars are requested.
Persistence & Privilege
Skill does not request always: true and is user-invocable. It may create a local .env and store the API key locally (per the README), which is reasonable for this functionality and confined to the skill's scope.
Assessment
If you plan to install/invoke this skill: (1) Understand that it will read video files you point it at and send video chunks to the Gemini API — do not index sensitive/private footage unless you accept that those frames leave your device. (2) The SKILL.md asks you to git clone and pip install the referenced GitHub repo; only proceed if you trust or have reviewed that code. (3) Ensure your GEMINI_API_KEY has appropriate permissions and monitor usage (indexing can incur API costs). (4) Consider running the install and indexing in a sandboxed environment and review the repository before pip installing to reduce supply-chain risk.

Like a lobster shell, security has layers — review code before you run it.

Current version: v0.1.0
Tags: embeddings, footage, gemini, latest, media, search, video

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Runtime requirements

🎥 Clawdis
Bins: ffmpeg, python3
Env: GEMINI_API_KEY
Primary env: GEMINI_API_KEY

SKILL.md

Natural Language Video Search

Search video files using natural language queries powered by Gemini Embedding 2's native video-to-vector embedding.

What This Skill Does

This skill lets you index video files (dashcam footage, security camera recordings, any mp4) into a local vector database, then search them by describing what you're looking for in plain English. The top match is automatically trimmed and saved as a clip.

Setup

  1. Clone and install:
git clone https://github.com/ssrajadh/sentrysearch.git
cd sentrysearch
pip install -e .
  2. Set your Gemini API key:
sentrysearch init

This prompts for your key, writes it to .env, and validates it with a test embedding. You can also set GEMINI_API_KEY directly as an environment variable.
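
As a rough illustration of the "confirm GEMINI_API_KEY is set" check, the sketch below tests only the process environment. The helper name `has_gemini_key` is hypothetical and not part of sentrysearch; it does not read the .env file that `sentrysearch init` writes, so a key stored only there would be reported as missing.

```python
import os


def has_gemini_key(env=None):
    """Return True if GEMINI_API_KEY is present and non-blank.

    Hypothetical helper for illustration; it inspects only the given
    mapping (os.environ by default), not any .env file on disk.
    """
    env = os.environ if env is None else env
    return bool(env.get("GEMINI_API_KEY", "").strip())
```

A caller could run this before indexing or searching and fall back to `sentrysearch init` when it returns False.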

Commands

Index video files

sentrysearch index <directory_or_file>

Options: --chunk-duration (default 30s), --overlap (default 5s), --no-preprocess, --target-resolution, --target-fps, --skip-still / --no-skip-still
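
To make the interaction between --chunk-duration and --overlap concrete, here is a minimal sketch of how overlapping windows could be laid out over a file. It assumes each chunk starts (duration minus overlap) seconds after the previous one and that the final chunk is clipped to the end of the video; the function name `chunk_windows` is hypothetical, and sentrysearch's actual boundary handling may differ.

```python
def chunk_windows(total_seconds, chunk_duration=30.0, overlap=5.0):
    """Return (start, end) pairs, in seconds, covering a video of the given length.

    Illustrative only: assumes a fixed stride of chunk_duration - overlap
    and a final window clipped to the end of the video.
    """
    if chunk_duration <= 0:
        raise ValueError("chunk_duration must be positive")
    if not 0 <= overlap < chunk_duration:
        raise ValueError("overlap must be >= 0 and smaller than chunk_duration")

    step = chunk_duration - overlap
    windows = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_duration, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break  # the last window already reaches the end of the video
        start += step
    return windows
```

Under these assumptions a 70-second file with the defaults yields three windows: 0 to 30, 25 to 55, and 50 to 70 seconds.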

Search indexed footage

sentrysearch search "<natural language query>"

Options: -n / --results (default 5), -o / --output-dir, --trim / --no-trim

Check index stats

sentrysearch stats

How It Works

Video files are split into overlapping chunks. Still-frame detection can skip chunks with no meaningful visual change, eliminating unnecessary API calls — this is the primary cost saver for idle footage like sentry mode or security cameras. Chunks are also preprocessed (reduced frame rate and resolution) to shrink upload size and speed up transfers, though the Gemini API bills based on video duration at a fixed token rate, not file size, so preprocessing does not reduce per-chunk token cost. Each chunk is embedded as raw video using Gemini Embedding 2 (no transcription or captioning). Vectors are stored in a local ChromaDB database. Text queries are embedded into the same vector space and matched via cosine similarity. The top match is auto-trimmed from the original file via ffmpeg.
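
The matching step can be sketched in plain Python. In the real tool the vector store handles this lookup, so the code below is only an illustration of cosine-similarity ranking; the names `cosine_similarity` and `top_matches`, and the dictionary of chunk vectors, are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, chunk_vecs, n=5):
    """Rank chunks by similarity to the query and return the best n as (score, chunk_id)."""
    scored = [
        (cosine_similarity(query_vec, vec), chunk_id)
        for chunk_id, vec in chunk_vecs.items()
    ]
    scored.sort(reverse=True)  # highest score first
    return scored[:n]
```

Because the text query and the video chunks are embedded into the same space, the same comparison works regardless of which modality produced each vector.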

When To Use This Skill

  • User asks to search through video files or footage
  • User wants to find a specific moment in a video by describing it
  • User asks to index or organize video footage for search
  • User mentions dashcam, security camera, or surveillance clips
  • User wants to find and extract a clip from a longer video

Example Interactions

User: "Search my dashcam footage for a white truck cutting me off"
Action: Run sentrysearch search "white truck cutting me off"

User: "Index all the video files in my Downloads folder"
Action: Run sentrysearch index ~/Downloads

User: "How much footage do I have indexed?"
Action: Run sentrysearch stats

Rules

  • Always run sentrysearch init or confirm GEMINI_API_KEY is set before indexing or searching.
  • If ffmpeg is not found on PATH, the bundled imageio-ffmpeg fallback is used automatically.
  • Indexing costs ~$2.50/hour of active footage with default settings. Cost is driven by the number of chunks sent to the API — footage with long idle periods (sentry mode, security cameras) will be significantly cheaper since still-frame skipping eliminates those chunks entirely. Warn the user before indexing large directories.
  • Search results include similarity scores. Scores below 0.5 are unlikely to be meaningful matches.
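
The cost and score rules above reduce to simple arithmetic. The sketch below uses the ~$2.50 per hour figure and the 0.5 threshold exactly as stated in the rules; the function names are hypothetical, and the estimate assumes default settings and counts only active footage, since skipped still chunks are not sent to the API.

```python
COST_PER_ACTIVE_HOUR_USD = 2.50  # approximate figure from the rules, default settings
MIN_MEANINGFUL_SCORE = 0.5       # scores below this are unlikely to be real matches


def estimate_index_cost(active_seconds, rate_per_hour=COST_PER_ACTIVE_HOUR_USD):
    """Rough indexing cost in USD for the given amount of active (non-still) footage."""
    return active_seconds / 3600 * rate_per_hour


def meaningful(results, threshold=MIN_MEANINGFUL_SCORE):
    """Keep only (score, chunk_id) results at or above the threshold."""
    return [r for r in results if r[0] >= threshold]
```

For example, two hours of active footage comes to roughly $5.00 under this estimate, which is the kind of number worth showing the user before indexing a large directory.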
