Zotero Vectorize

v0.1.0

Build and maintain a cross-platform local Zotero semantic index using metadata embeddings and PDF full-text chunk embeddings. Use when the user asks to vecto...

MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the included scripts and references: the code reads a Zotero SQLite DB and storage, extracts metadata and PDF text, creates embeddings, and writes local vector store files. Environment variables referenced (ZOTERO_DATA_DIR, ZOTERO_DB, ZOTERO_STORAGE, ZOTERO_VECTORS_DIR) are appropriate for locating Zotero data and output and are optional. No unrelated credentials, binaries, or config paths are requested.
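As a rough illustration of how optional path variables like these can be resolved, the sketch below reads the four ZOTERO_* variables and falls back to defaults when they are unset. The function name resolve_zotero_paths and the fallback locations (in particular the "vectors" subdirectory) are assumptions made for this example, not taken from the bundled scripts.

```python
import os
from pathlib import Path


def resolve_zotero_paths(env=None):
    """Resolve Zotero locations from the optional ZOTERO_* variables.

    The fallback locations are illustrative assumptions only.
    """
    env = os.environ if env is None else env
    data_dir = Path(env.get("ZOTERO_DATA_DIR") or Path.home() / "Zotero")
    return {
        "data_dir": data_dir,
        "db": Path(env.get("ZOTERO_DB") or data_dir / "zotero.sqlite"),
        "storage": Path(env.get("ZOTERO_STORAGE") or data_dir / "storage"),
        # Hypothetical default output location for the vector store files.
        "vectors_dir": Path(env.get("ZOTERO_VECTORS_DIR") or data_dir / "vectors"),
    }
```

Passing a plain dict instead of os.environ makes it easy to preview which paths would be used before any script touches the library.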
Instruction Scope
SKILL.md limits runtime behavior to detecting Zotero paths, snapshotting the DB, extracting text, creating embeddings, verifying counts, and backing up and writing vector store files. The included scripts follow that flow; they do not read unrelated system files or contact external endpoints, apart from the expected embedding model loading. Note: the workflow depends on creating DB snapshots and writing backup and output files. The rule to 'ask for user confirmation before applying updates' is procedural: the scripts do not prompt interactively, so the agent or user must run check_incremental_updates first and then explicitly run apply_incremental_updates.
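For readers who want to see what a DB snapshot step can look like, here is a minimal sketch using the SQLite backup API from the Python standard library. It is one possible approach and not necessarily how the bundled scripts take their snapshot; the function name snapshot_db is hypothetical.

```python
import sqlite3
from pathlib import Path


def snapshot_db(src, dest):
    """Copy a SQLite database to dest without writing to the source."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Open the source read-only so the snapshot cannot modify the library.
    source_uri = Path(src).resolve().as_uri() + "?mode=ro"
    source = sqlite3.connect(source_uri, uri=True)
    target = sqlite3.connect(dest)
    try:
        source.backup(target)  # page-by-page copy into the target file
    finally:
        target.close()
        source.close()
    return dest
```

All later reads can then go against the snapshot, so the live Zotero database is opened only once and only for reading.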
Install Mechanism
No install spec is included (instruction-only installation), and all code is provided. Dependencies are standard Python packages for embedding and PDF extraction (sentence-transformers, torch, PyMuPDF, numpy). There are no downloads from obscure URLs in the install phase. Runtime model downloads (HuggingFace/SentenceTransformers) may occur when the embedding model is loaded. This is expected for the skill's purpose, but it is a network operation to be aware of.
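Because there is no install spec, it can help to check that the listed packages are importable before running anything. The sketch below does this without importing the heavy modules. The import names are assumptions: PyMuPDF is commonly imported as fitz (newer releases also expose pymupdf), and missing_packages is a hypothetical helper.

```python
from importlib.util import find_spec

# Distribution name -> candidate import names (assumed, see the note above).
REQUIRED = {
    "sentence-transformers": ("sentence_transformers",),
    "torch": ("torch",),
    "PyMuPDF": ("pymupdf", "fitz"),
    "numpy": ("numpy",),
}


def missing_packages(required=REQUIRED):
    """Return distribution names for which no candidate module can be found."""
    return [
        dist
        for dist, modules in required.items()
        if not any(find_spec(name) is not None for name in modules)
    ]
```

An empty list from missing_packages() suggests the environment is ready; anything else names the packages to install into the virtualenv.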
Credentials
The skill requests no secrets or privileged environment variables. It optionally reads path-related env vars for convenience (ZOTERO_*), which are coherent with its function. There are no unrelated tokens, passwords, or cloud credentials requested.
Persistence & Privilege
The 'always' setting is false, and the skill does not request permanent elevated platform privileges. It writes backups and store files only to the configured output directory and snapshots a user-provided Zotero DB path; it does not alter other skills or system-wide agent settings.
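The backup-then-write pattern mentioned here can be sketched as follows. This is a generic illustration under stated assumptions (a JSON store file, a timestamped .bak naming scheme, and the hypothetical helper name write_store), not a copy of the skill's own code.

```python
import json
import os
import shutil
import time
from pathlib import Path


def write_store(path, records):
    """Back up an existing store file, then replace it with new records."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    backup = None
    if path.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = path.with_name(f"{path.name}.{stamp}.bak")
        shutil.copy2(path, backup)  # keep the previous store next to the new one
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(records, fh)
    os.replace(tmp, path)  # swap in the finished file in a single step
    return backup
```

Writing to a temporary file first means an interrupted run leaves the previous store intact rather than a half-written one.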
Assessment
This bundle appears to do what it says: it reads your local Zotero DB and storage, snapshots the DB, extracts PDF text, and builds local JSON vector stores. Before running: (1) review and confirm the output directory to avoid overwriting files; (2) run in a Python virtualenv and install the listed packages; (3) expect the embedding model to be downloaded, which may be large and requires network access (pre-download the model if you will be working offline); (4) close Zotero while snapshotting to avoid SQLite lock errors; (5) the safety rule to obtain user confirmation before applying incremental updates is procedural, because the apply script does not itself prompt, so only run apply_incremental_updates after reviewing the check_incremental_updates report. If you need the skill to never download models automatically, inspect or modify get_embedding_model/encode_texts to load local models only.
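A local-only loader might look like the sketch below. It does not reproduce get_embedding_model; it only shows one way such a function could be changed. HF_HUB_OFFLINE and TRANSFORMERS_OFFLINE are environment variables honored by the Hugging Face libraries, while force_offline, load_local_model, and the model_dir argument are hypothetical names, and the directory is assumed to already contain a downloaded model.

```python
import os


def force_offline():
    """Ask the Hugging Face libraries not to use the network.

    Call this before sentence_transformers or transformers is imported,
    since the variables are typically read at import time.
    """
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


def load_local_model(model_dir):
    """Load a SentenceTransformer from a directory that already holds the model."""
    force_offline()
    # Imported lazily so the offline variables are set first.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(str(model_dir))
```

With this in place, a missing local model should surface as a load error instead of triggering a silent download.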


latestvk97c1c275wht8egxf6bh762tcx82kra9
