KB Collector
v1.2.1
Knowledge Base Collector - save YouTube videos, URLs, and text to Obsidian with AI summarization. Auto-transcribes videos, fetches pages, supports weekly/monthly digest...
Security Scan
OpenClaw
Verdict: Suspicious (medium confidence)
Purpose & Capability
Name and description align with the included scripts: downloading YouTube audio, transcribing, fetching pages, saving to an Obsidian vault, and generating digests/nightly research. One minor mismatch: SKILL.md claims the skill 'searches multiple sources (Hacker News, Reddit, Twitter)', but nightly-research.sh performs all searches via the Tavily API and does not independently query those sites. Overall, capabilities match the stated purpose.
Instruction Scope
SKILL.md and the scripts instruct the agent to fetch remote web pages and call external services (the Tavily API via curl, and email via the 'gog gmail send' tool). The scripts also write files into an Obsidian vault path and remove temporary audio files. The SKILL metadata declares no required env vars, but the scripts read and expect environment variables (TAVILY_API_KEY, OBSIDIAN_VAULT, RECIPIENT) and use a hard-coded email recipient and hard-coded vault paths (/Users/george/... and ~/Documents/Georges/Knowledge). These runtime actions (external API calls and email sending) fall outside the declared requirements and should be explicitly disclosed.
Install Mechanism
There is no formal install spec in the registry (instruction-only), which is lower risk from an automatic installer perspective. The SKILL.md tells the user to pip install packages (yt-dlp, faster-whisper, requests, beautifulsoup4, optional openai/anthropic). That is consistent with the code. No downloads from arbitrary URLs or archive extraction are present. Because the code relies on external binaries (yt-dlp, whisper) and a third-party CLI tool 'gog', the user must install those manually — the absence of an install spec means the skill won't auto-install them but the runtime will fail or behave unexpectedly if they are missing.
Credentials
Registry lists no required environment variables or credentials, yet scripts make use of environment vars: TAVILY_API_KEY (sent to api.tavily.com), OBSIDIAN_VAULT (overrides VAULT), and RECIPIENT. digest.sh and nightly-research.sh also use or assume the presence of an email-sending tool ('gog') which uses credentials not declared here. The scripts also include hard-coded local paths and a hard-coded recipient email (george@precaster.com.tw). Asking for or using API keys and email-sending capabilities is proportionate to the feature set — but they should be declared, and the hard-coded recipient is suspicious/unexpected behavior that could lead to unintended data exfiltration.
Persistence & Privilege
The skill does not request permanent platform-level privileges (always: false) and does not modify other skills' configuration. It writes notes to a user-visible Obsidian vault and temporary files in /tmp, which is expected given its purpose. Autonomous invocation is allowed (default) — combined with the environment/credential concerns above this increases potential impact, but the skill alone does not request 'always' or system-wide config changes.
What to consider before installing
Check the following before installing or using this skill:
- Expectation mismatch: The registry/metadata declare no env vars but the scripts read TAVILY_API_KEY, OBSIDIAN_VAULT, and RECIPIENT. Confirm whether you should provide any API keys and where those keys will be used.
- Vault path & recipient: The Python and shell scripts default to a specific user's vault path (/Users/george/... or ~/Documents/Georges/Knowledge). Edit VAULT_PATH/VAULT/OBSIDIAN_VAULT to point to your own vault before running, and replace the hard-coded RECIPIENT (george@precaster.com.tw) with your address or remove email sending if undesired.
- External network and email: nightly-research.sh contacts https://api.tavily.com and digest/nightly can send mail via 'gog gmail send'. If you enable those features, you will send search queries and possibly note content to external services. Only set TAVILY_API_KEY if you trust Tavily and understand their data use. Inspect/verify how 'gog' is configured for Gmail on your machine — it may reuse stored credentials to send mail.
- Data exfil channels: The main exfil vectors here are (1) posting queries/results to Tavily, and (2) sending digests via the 'gog' tool. There is no obfuscated code or hidden endpoints, but these channels can leak note contents if misconfigured.
- Run in a safe environment first: Execute scripts in a sandbox or a test account/vault, with no sensitive notes present. Replace hard-coded values, and run with network disabled if you want only local behavior.
- Dependency hygiene: The SKILL.md asks you to pip install yt-dlp, faster-whisper, etc. Those packages and the external binaries (yt-dlp, whisper, gog) will run code on your machine. Install them from official sources and review their own security considerations.
- Ask the author / request metadata: The skill lacks homepage/author contact and doesn't declare env vars. If you plan to use it long-term, ask the publisher to add explicit docs for required credentials and configurable defaults (vault path, recipient), or update the skill to avoid hard-coded user-specific paths and recipients.
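A quick preflight covering the first checklist item can be sketched in bash (the variable names are taken from the scan above; the scripts themselves may read others):

```shell
# check_env: report which environment variables the scripts expect but are unset.
# Bash-only: uses ${!v} indirect expansion. Names from the scan above.
check_env() {
  local v missing=0
  for v in TAVILY_API_KEY OBSIDIAN_VAULT RECIPIENT; do
    if [ -z "${!v:-}" ]; then
      echo "missing: $v"
      missing=$((missing + 1))
    fi
  done
  echo "unset count: $missing"
}
```

Run it before the first invocation; anything reported as missing will either break the scripts or silently fall back to the hard-coded defaults flagged above.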
If you are uncomfortable with network calls or automatic email sending, either remove/disable those parts of the scripts or decline to install. If you proceed, make the environment variables explicit and verify behavior with small, non-sensitive test data first.
KB Collector
Knowledge Base Collector - Save YouTube, URLs, and text to Obsidian with automatic transcription and summarization.
Features
- YouTube Collection - Download audio, transcribe with Whisper, auto-summarize
- URL Collection - Fetch and summarize web pages
- Plain Text - Direct save with tags
- Digest - Weekly/Monthly/Yearly review emails
- Nightly Research - Automated AI/LLM/tech trend tracking
Installation
# Install dependencies
pip install yt-dlp faster-whisper requests beautifulsoup4
# For AI summarization (optional)
pip install openai anthropic
Usage (Python Version - Recommended)
# Collect YouTube video
python3 scripts/collect.py youtube "https://youtu.be/xxxxx" "stock,investing"
# Collect URL
python3 scripts/collect.py url "https://example.com/article" "python,api"
# Collect plain text
python3 scripts/collect.py text "My note content" "tag1,tag2"
Usage (Bash Version - Legacy)
# Collect YouTube
./scripts/collect.sh "https://youtu.be/xxxxx" "stock,investing" youtube
# Collect URL
./scripts/collect.sh "https://example.com/article" "python,api" url
# Collect plain text
./scripts/collect.sh "My note" "tag1,tag2" text
Nightly Research (New!)
Automated AI/LLM/tech trend tracking - runs daily and saves to Obsidian.
# Save to Obsidian only
./scripts/nightly-research.sh --save
# Save to Obsidian AND send email
./scripts/nightly-research.sh --save --send
# Send email only
./scripts/nightly-research.sh --send
Features
- Searches multiple sources (Hacker News, Reddit, Twitter)
- LLM summarization (optional)
- Saves to Obsidian with tags
- Optional email digest
Cron Setup (optional)
# Run every night at 10 PM
0 22 * * * /path/to/nightly-research.sh --save --send
Configuration
Edit the script to customize:
VAULT_PATH = os.path.expanduser("~/Documents/YourVault")
NOTE_AUTHOR = "YourName"
Output Format
Notes saved to: {VAULT_PATH}/yyyy-mm-dd-title.md
---
created: 2026-03-03T12:00:00
source: https://...
tags: [stock, investing]
author: George
---
# Title
> **TLDR:** Summary here...
---
Content...
---
*Saved: 2026-03-03*
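The note path above can be reproduced with a small bash sketch. Note the slug rule (lowercase, spaces to hyphens) is an assumption; check collect.py for the actual sanitization it applies:

```shell
# note_path: build a note path following the {VAULT_PATH}/yyyy-mm-dd-title.md
# pattern shown above. The lowercase/hyphen slug rule is a guess, not the
# scripts' documented behavior.
note_path() {
  local vault="$1" title="$2" slug
  slug=$(echo "$title" | tr '[:upper:]' '[:lower:]' | tr ' ' '-')
  echo "$vault/$(date +%F)-$slug.md"
}
```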
Dependencies
- yt-dlp
- faster-whisper (for transcription)
- requests + beautifulsoup4 (for URL fetching)
- Optional: openai/anthropic (for AI summarization)
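Since the scripts fail or misbehave when external tools are absent, a small check like the following can confirm everything is on PATH before first use (a sketch; pass whichever tool names your configuration actually needs, e.g. yt-dlp, python3, gog):

```shell
# check_deps: report which of the given command-line tools are missing
# from PATH. Example: check_deps yt-dlp python3 gog
check_deps() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "not found: $cmd"
  done
}
```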
Credits
Automated note-taking workflow for Obsidian.