Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Research Library

v0.1.0

Local-first multimedia research library for hardware projects. Capture code, CAD, PDFs, images. Search with material-type weighting. Project isolation with cross-references. Async extraction. Backup + restore.

1 star · 1.4k downloads · 3 current · 3 all-time
by WizardSage @jonbuckles
Security Scan
VirusTotal: Suspicious
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (local multimedia research library) align with the files and CLI: SQLite + FTS5, extractors for PDFs/images/code, async workers, backup/restore, and project scoping. The included modules (cli, extractor, search, worker, database) match the stated functionality.
Instruction Scope
The SKILL.md and other docs instruct the agent/user to import local files, run extraction, and store DB/backups under a home-path (e.g. ~/.openclaw/research). That is expected for a local-first tool, but the skill will read and copy arbitrary files the user points it at and will create a local DB and attachments directory. The docs reference environment variables (RESLIB_DATA_DIR / RESLIB_DB) and CLI options for overriding paths; SKILL.md does not request any unrelated files or cloud endpoints. Recommendation: review and set data-dir before bulk imports if you want to control where files are written.
Install Mechanism
The registry shows no automated install spec in the skill bundle, but the package includes full Python code, an entry_point, and a _meta.json with dependencies (pdfplumber, pytesseract, click). SKILL.md suggests pip install /path/to/research-library or clawhub install. No remote downloads or odd URLs were present in the provided files. Minor packaging inconsistency: the skill is described as 'instruction-only' but contains code and a package manifest.
Credentials
The skill does not request credentials or secrets and does not declare required environment variables in the registry. Docs/CLI mention optional env vars (RESLIB_DATA_DIR, RESLIB_DB) and dependencies (pytesseract plus the system tesseract binary). The only notable mismatch: system 'tesseract-ocr' is an optional runtime dependency referenced in docs but not declared as a required binary in the registry. No credentials (API keys, AWS, etc.) are requested.
Persistence & Privilege
The skill does not set always: true and does not request special platform privileges. It stores data in user-visible locations (default ~/.openclaw/research/) and creates backups there; that is consistent with a local-first CLI tool. Autonomous invocation is enabled by default (standard behavior), but with no broad credentials or network endpoints the risk is limited to local file operations.
Assessment
This skill appears to be what it says: a local CLI research library that indexes files you add and stores a SQLite DB and attachments locally. Before installing or running it:
1) Review and, if desired, override the default data/db path (RESLIB_DATA_DIR or --db) so imports and backups go to a directory you control.
2) Install optional OCR prerequisites (system tesseract-ocr) only if you need OCR.
3) Because the skill bundle contains executable Python code from an unknown source (no homepage/repo owner is authoritative in the metadata), consider inspecting reslib/cli.py and reslib/extractor.py for any unexpected network calls or behavior, and run the package in an isolated environment or container for initial testing.
4) Run the included test suite (pytest) or smoke_test.sh in a sandbox before pointing it at large or sensitive directories.
5) There are no requested credentials, but verify there are no hardcoded endpoints in the code if you want to ensure data never leaves your machine.
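The suggestion to inspect the code for unexpected network calls can be partly automated. The sketch below is one possible first pass, not part of the skill: it uses Python's standard ast module to list imports of modules that can open network connections. The function name find_network_imports and the NETWORK_MODULES watch-list are illustrative assumptions, and a static import scan will not catch dynamic imports or subprocess calls.

```python
import ast

# Hypothetical watch-list of modules that can open network connections.
NETWORK_MODULES = {
    "socket", "ssl", "http", "urllib", "ftplib", "smtplib",
    "requests", "httpx", "aiohttp",
}

def find_network_imports(source: str) -> list[tuple[int, str]]:
    """Return (line_number, module_name) for each import of a watched module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports have no module name
        else:
            continue
        for name in names:
            # Compare only the top-level package, e.g. "urllib" in "urllib.request".
            if name.split(".")[0] in NETWORK_MODULES:
                hits.append((node.lineno, name))
    return sorted(hits)

sample = "import os\nimport urllib.request\nfrom http import client\n"
print(find_network_imports(sample))
```

To apply it to the bundle, read each .py file under reslib/ (for example with pathlib.Path.rglob) and pass its text to find_network_imports; any hit is a prompt for manual review, not proof of misbehavior.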

Like a lobster shell, security has layers — review code before you run it.

latest vk9794pbdhtmnksbnxve5vnm7f980q0tr
1.4k downloads
1 star
1 version
Updated 5h ago
v0.1.0
MIT-0

Research Library Skill

A local-first multimedia research library for capturing, organizing, and searching hardware project knowledge.

What It Does

  • Store documents — Code, PDFs, CAD files, images, schematics
  • Extract automatically — Text from PDFs, EXIF from images, functions from code
  • Search intelligently — Full-text with material-type weighting (your work ranks higher than external research)
  • Project isolation — Arduino separate from CNC; no contamination
  • Cross-reference — Link knowledge: "this servo tuning applies to that project"
  • Async extraction — Searches never block while OCR runs
  • Backup daily — 30-day rolling snapshots
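The "searches never block" behavior can be illustrated with a small worker pool. This is a minimal sketch using the standard concurrent.futures module, not the skill's actual worker code: slow_extract and the toy search function are hypothetical stand-ins for OCR extraction and the real index.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_extract(doc_id: int) -> tuple[int, str]:
    """Hypothetical stand-in for OCR/PDF extraction; it only sleeps."""
    time.sleep(0.05)
    return doc_id, f"extracted text for document {doc_id}"

def search(index: dict[int, str], term: str) -> list[int]:
    """Toy search over whatever has been indexed so far."""
    return sorted(doc_id for doc_id, text in index.items() if term in text)

index = {1: "servo tuning notes"}  # a document that is already indexed

with ThreadPoolExecutor(max_workers=2) as pool:
    # Queue extraction jobs; they run in background threads.
    pending = [pool.submit(slow_extract, doc_id) for doc_id in (2, 3)]
    # This search runs right away against the existing index.
    early = search(index, "servo")
    # Fold results into the index as each job finishes.
    for future in pending:
        doc_id, text = future.result()
        index[doc_id] = text

late = search(index, "document")
print(early, late)
```

The point of the design is that the search path only reads the index, so queued extraction jobs delay when new documents become searchable, not when queries return.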

Installation

clawhub install research-library
# OR
pip install /path/to/research-library

Quick Start

# Initialize database
reslib status

# Add a document to a project
reslib add ~/projects/arduino/servo.py --project arduino --material-type reference

# Search
reslib search "servo tuning"

# Link knowledge
reslib link 5 12 --type applies_to

Features

CLI Commands

  • reslib add — Import documents (auto-detect + extract)
  • reslib search — Full-text search with filters
  • reslib get — View document details
  • reslib archive / reslib unarchive — Manage documents
  • reslib export — Export as JSON/Markdown
  • reslib link — Create document relationships
  • reslib projects — Manage projects
  • reslib tags — Manage tags
  • reslib status — System overview
  • reslib backup / reslib restore — Snapshots
  • bash reslib/smoke_test.sh — Quick validation (a shell script, not a reslib subcommand)

Technical

  • Storage: SQLite 3.45+ with FTS5 virtual table
  • Extraction: PDF (pdfplumber + OCR), images (EXIF + OCR), code (AST + regex)
  • Confidence Scoring: 0.0-1.0 based on quality + source
  • Material Weighting: Reference (1.0) vs Research (0.5)
  • Project Isolation: Scoped searches, no contamination
  • Async Workers: 2-4 configurable extraction workers
  • Catalog Separation: real_world vs openclaw projects
  • Backup: Daily snapshots, 30-day retention
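To make the material weighting and project scoping concrete, here is a minimal sketch built on SQLite's FTS5 and its bm25() ranking function. It is an illustration under assumptions, not the skill's schema or ranking code: the docs table, its columns, the weighted_search function, and the fallback weight for unknown material types are all hypothetical. Only the 1.0 and 0.5 weights come from the list above.

```python
import sqlite3

# Weights from the Technical list: reference 1.0, research 0.5.
MATERIAL_WEIGHTS = {"reference": 1.0, "research": 0.5}

def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory FTS5 table (hypothetical schema, not reslib's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE docs USING fts5("
        "title, body, material_type UNINDEXED, project UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?, ?, ?)",
        [
            ("Bench notes", "servo tuning gains for pan tilt", "reference", "arduino"),
            ("Vendor datasheet", "servo tuning gains for pan tilt", "research", "arduino"),
            ("Gantry manual", "servo tuning on the gantry axis", "research", "cnc"),
            ("Spindle log", "feed rate and spindle speed", "reference", "cnc"),
        ],
    )
    return conn

def weighted_search(conn, query, project=None):
    """Return (title, score) pairs, best first, optionally scoped to one project."""
    sql = "SELECT title, material_type, bm25(docs) FROM docs WHERE docs MATCH ?"
    params = [query]
    if project is not None:
        sql += " AND project = ?"
        params.append(project)
    rows = conn.execute(sql, params).fetchall()
    # bm25() is lower (more negative) for better matches, so negate it
    # before applying the weight. Unknown types fall back to 0.5 (assumption).
    scored = [
        (title, -rank * MATERIAL_WEIGHTS.get(mtype, 0.5))
        for title, mtype, rank in rows
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

conn = build_demo_db()
for title, _score in weighted_search(conn, "servo tuning", project="arduino"):
    print(title)
```

In this demo the two arduino documents have identical text relevance, so the reference item ranks first purely because of its weight, and the cnc document is excluded by the project filter.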

Configuration

Copy reslib/config.json and customize:

{
  "db_path": "~/.openclaw/research/library.db",
  "num_workers": 2,
  "worker_timeout_sec": 300,
  "max_retries": 3,
  "backup_retention_days": 30,
  "backup_dir": "~/.openclaw/research/backups",
  "file_size_limit_mb": 200,
  "project_size_limit_gb": 2
}
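How the skill reads this file is not shown here, so the following is only a sketch of one reasonable approach: start from the defaults above, overlay any keys found in a user-supplied JSON file, and expand "~" in the path settings. The load_config function and the idea of passing a file path are assumptions for illustration.

```python
import json
import os
import tempfile

# Defaults copied from the configuration example above.
DEFAULTS = {
    "db_path": "~/.openclaw/research/library.db",
    "num_workers": 2,
    "worker_timeout_sec": 300,
    "max_retries": 3,
    "backup_retention_days": 30,
    "backup_dir": "~/.openclaw/research/backups",
    "file_size_limit_mb": 200,
    "project_size_limit_gb": 2,
}

def load_config(path=None) -> dict:
    """Merge an optional JSON file over DEFAULTS and expand '~' in paths."""
    config = dict(DEFAULTS)
    if path is not None and os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            config.update(json.load(handle))
    for key in ("db_path", "backup_dir"):
        config[key] = os.path.expanduser(config[key])
    return config

# Demo: a partial override file changes only the keys it names.
with tempfile.TemporaryDirectory() as tmp:
    override_path = os.path.join(tmp, "config.json")
    with open(override_path, "w", encoding="utf-8") as handle:
        json.dump({"num_workers": 4}, handle)
    custom = load_config(override_path)

print(custom["num_workers"], custom["backup_retention_days"])
```

Merging over defaults means a customized file only needs the keys that differ, and a missing file still yields a usable configuration.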

Integration with War Room

Use RL1 protocol in war room DNA:

from reslib import ResearchDatabase, ResearchSearch

db = ResearchDatabase()
search = ResearchSearch(db)

# Before researching, check existing knowledge
prior = search.search("servo tuning", project="rc-quadcopter")
if prior:
    print(f"Found {len(prior)} prior items")
else:
    # New research needed: record it (any further fields are elided here)
    db.add_research(title="...", content="...")

Performance

All targets exceeded:

Operation            Target     Actual
PDF extraction       <100ms     20.6ms
Search (50 docs)     <100ms     0.33ms
Worker throughput    >6/sec     414.69/sec

Testing

# Run all tests
pytest tests/

# Quick smoke test
bash reslib/smoke_test.sh

# Performance tests
pytest tests/test_integration.py -v -k stress

Known Limitations (Phase 2)

  • OCR quality varies on hand-drawn sketches
  • FTS5 index designed for <10K documents (PostgreSQL is the planned path for larger scale)
  • No automatic web research gathering (manual only)
  • Vector embeddings ready but inactive
  • CAD file parsing is metadata-only

Documentation

See /docs/:

  • CLI-REFERENCE.md — All commands + examples
  • EXTRACTION-GUIDE.md — How extraction works
  • SEARCH-GUIDE.md — Ranking + weighting
  • WORKER-GUIDE.md — Async queue details
  • INTEGRATION.md — War room RL1 protocol

Phase 2 Roadmap

  • Real-world PDF calibration
  • FTS5 scaling tests (10K docs)
  • Auto-detection (reference vs research)
  • Web research enrichment
  • Vector embeddings (semantic search)
  • PostgreSQL upgrade path

Building From Source

cd research-library
pip install -e .
pytest tests/
python -m reslib status

Support

Issues? See TECHNICAL-NOTES.md for troubleshooting.


Production-ready MVP. 214 tests passing. 15K lines. Ready to use.
