kb-framework

Security checks across malware telemetry and agentic risk

Overview

This knowledge-base skill is mostly coherent, but it grants broad local file and update authority with several under-scoped or misleading safeguards.

Review this skill before installing. Use it only in a dedicated environment, keep watched/indexed directories narrow, do not point it at sensitive home or workspace trees, avoid the built-in updater unless you trust the release source, and back up Obsidian vaults and KB databases before enabling write, delete, watcher, scheduler, or cleanup commands.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import

Findings (33)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    try:
        # Validate script path to prevent command injection
        safe_path = _validate_script_path(migrate_script, kb_path)
        subprocess.run([sys.executable, str(safe_path)], check=True)
    except ValueError as e:
        print(f"⚠️ Migration script validation failed: {e}")
    except subprocess.CalledProcessError:
Confidence: 91%
Finding: subprocess.run([sys.executable, str(safe_path)], check=True)

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The code comments and module contract say the watcher excludes generated output under kb/library/biblio/, but the actual default exclusion list is only {"llm"}. This mismatch can cause the watcher to ingest its own LLM-generated artifacts or other unintended directories, creating recursive processing, duplicate content amplification, and possible prompt/data contamination in downstream essence generation.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The class-level documentation promises exclusion of LLM-generated output, but _should_exclude only filters path components matching names in the default set, which contains only "llm". In a system that automatically feeds discovered files into EssenzGenerator, this can widen the trust boundary and allow self-generated or untrusted derived content to be repeatedly reprocessed, increasing risk of feedback loops and poisoning of generated knowledge.
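The mismatch described in the two findings above could be closed with an exclusion check along the following lines. This is a minimal sketch, not the skill's code: `should_exclude`, `DEFAULT_EXCLUDED_NAMES`, and `DEFAULT_EXCLUDED_PREFIXES` are hypothetical names, and paths are assumed to be given relative to the watch root. It keeps the component-name rule the report describes and adds a directory-prefix rule for the generated-output tree `kb/library/biblio/`.

```python
from pathlib import PurePosixPath

# Hypothetical defaults: "llm" is the only name the report says is excluded today.
DEFAULT_EXCLUDED_NAMES = frozenset({"llm"})
# Generated-output tree named in the finding, relative to the watch root.
DEFAULT_EXCLUDED_PREFIXES = (PurePosixPath("kb/library/biblio"),)


def should_exclude(rel_path: str,
                   names=DEFAULT_EXCLUDED_NAMES,
                   prefixes=DEFAULT_EXCLUDED_PREFIXES) -> bool:
    """Return True if rel_path (relative to the watch root) must be skipped."""
    path = PurePosixPath(rel_path)
    # Rule 1: any path component equal to an excluded name.
    if any(part in names for part in path.parts):
        return True
    # Rule 2: the path is an excluded directory or lies underneath one.
    return any(path == prefix or prefix in path.parents for prefix in prefixes)
```

Matching on whole path components rather than substrings means a sibling directory such as `kb/library/bibliography/` would not be excluded by accident.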

Intent-Code Divergence

Medium

Confidence: 98%
Finding: `read_entry` accepts both relative and absolute paths, and only prepends the configured vault for relative paths. An absolute path is used as-is and only checked for existence, so callers can read any readable file on the host instead of being confined to the Obsidian vault, which breaks the documented trust boundary and can expose sensitive local data.
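One way to confine reads to the vault is to resolve every path first and then verify that the result still lies under the vault root. The sketch below assumes a hypothetical helper named `resolve_in_vault`; it is not part of the skill, and `read_entry` would need to call something like it before opening any file.

```python
from pathlib import Path


def resolve_in_vault(vault_root: Path, user_path: str) -> Path:
    """Resolve user_path and refuse anything outside vault_root."""
    root = vault_root.resolve()
    # Joining with an absolute user_path discards root entirely, so the
    # containment check below is what actually enforces the boundary.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes vault: {user_path!r}")
    return candidate
```

Because the check runs after `resolve()`, it also rejects `..` traversal and symlinks that point outside the vault, not only absolute paths.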

Intent-Code Divergence

High

Confidence: 95%
Finding: The module is presented as a read-only integrity audit, but the same file also contains database-deletion logic in `cleanup_orphaned_fk()`. Even though that function is not called in `main()`, this mismatch is dangerous because operators or other code may trust the script as non-destructive and invoke or reuse it under false assumptions, leading to unintended data loss.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The module-level documentation overstates the security guarantees of this utility by claiming path traversal protection and broader file-handling security, while `sanitize_path` only constrains paths when `base_dir` is explicitly supplied. If callers trust the docstring and omit `base_dir`, attacker-controlled absolute or traversal-containing paths can still resolve to arbitrary filesystem locations, creating a misleading security boundary.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The function documentation and surrounding logic indicate it should embed only section IDs missing from ChromaDB, but it instead calls run_full(limit=len(missing)), which performs a general embedding pass bounded only by count. This can re-embed arbitrary sections rather than the actual missing ones, causing data inconsistency, incomplete synchronization, wasted compute, and potentially leaving true missing sections unsynced while reporting success.
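The documented behavior amounts to a set difference followed by a targeted embedding pass. The following sketch illustrates that idea under stated assumptions: `embed_missing` and the `embed_one` callback are hypothetical, and the two ID collections stand in for whatever the skill reads from its database and from ChromaDB.

```python
from typing import Callable, Iterable, List


def embed_missing(db_ids: Iterable[str],
                  indexed_ids: Iterable[str],
                  embed_one: Callable[[str], None]) -> List[str]:
    """Embed exactly the section IDs present in the DB but absent from the index."""
    missing = sorted(set(db_ids) - set(indexed_ids))
    for section_id in missing:
        embed_one(section_id)
    # Return the IDs actually processed so the caller can report them.
    return missing
```

Passing the missing IDs themselves, rather than only their count, is what guarantees that the sections embedded are the ones that were missing.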

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The updater downloads code from a GitHub release, installs it, and then executes scripts/migrate.py from that updated content. Any compromise of the repository, release process, maintainer account, or network trust boundary would let an attacker run arbitrary code on the host under the user's privileges.
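A partial mitigation is to verify a downloaded release archive against a digest obtained separately from the release itself before anything is installed or executed. The sketch below assumes a hypothetical `verify_sha256` helper and an expected digest pinned by the operator; the report does not say whether the updater performs any such check.

```python
import hashlib
import hmac
from pathlib import Path


def verify_sha256(archive: Path, expected_hex: str) -> None:
    """Raise ValueError unless the file's SHA-256 digest matches expected_hex."""
    digest = hashlib.sha256()
    with archive.open("rb") as fh:
        # Read in 1 MiB chunks so large archives are not loaded into memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    if not hmac.compare_digest(digest.hexdigest(), expected_hex.strip().lower()):
        raise ValueError(f"checksum mismatch for {archive.name}")
```

This only helps when the expected digest comes from a channel the attacker does not control. A checksum published alongside a compromised release offers no protection.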

Missing User Warnings

Medium

Confidence: 82%
Finding: The documentation exposes file-write, move, and delete operations against an Obsidian vault without any safety guidance, confirmation requirements, or warnings about destructive effects. In an agent skill context, this can normalize unsafe use of content-modifying APIs and increase the risk of accidental data loss or unauthorized modification if downstream automation invokes them blindly.

Missing User Warnings

Medium

Confidence: 88%
Finding: The file watcher and scheduler are described as automatically monitoring directories and running LLM jobs, yet the documentation gives no warning about implicit processing of newly added files or persistence of generated outputs. In a knowledge-base or agent environment, this can lead to unintended ingestion of sensitive files, privacy violations, and uncontrolled automated content generation.

Missing User Warnings

Medium

Confidence: 84%
Finding: The README encourages LLM generation, engine switching, and testing across Ollama and HuggingFace backends without clearly warning that indexed content, prompts, or derived summaries may be transmitted to external or separately hosted model services. In a knowledge-base tool that processes user documents, this omission can lead users to expose sensitive local content under the assumption everything remains local.

Missing User Warnings

Medium

Confidence: 81%
Finding: The README documents file watching, scheduling, and indexing of local content but does not warn users that the tool may continuously monitor directories and process file contents automatically. In the context of a personal knowledge base and agent integration, this can cause unintentional collection, indexing, and downstream exposure of sensitive files.

Missing User Warnings

Medium

Confidence: 89%
Finding: The skill advertises LLM integration, multi-engine support, and provider switching but does not disclose that indexed content, prompts, or generated context may be transmitted to external services such as HuggingFace or to local model runtimes that still process sensitive data. In a knowledge-base skill that may index private files, this omission creates a meaningful privacy and data-handling risk because users may enable these features without understanding where their data goes.

Missing User Warnings

Medium

Confidence: 85%
Finding: The documentation recommends `kb sync --delete-orphans` as a fix for ghost entries without clearly warning that this operation can permanently remove metadata or knowledge-base records. Users troubleshooting routine sync issues could run a destructive command and lose indexed state or references unintentionally.

Missing User Warnings

Medium

Confidence: 91%
Finding: The code automatically downloads NLTK tokenizer data at runtime if it is missing, which creates an unexpected outbound network action and introduces a supply-chain/dependency-fetch risk during execution. In security-sensitive or offline environments, this can violate policy, leak environment metadata, or allow unreviewed artifacts to be pulled without explicit operator approval.

Missing User Warnings

Medium

Confidence: 90%
Finding: The function logs raw user search queries and expanded queries, which can capture sensitive inputs such as medical, personal, or internal business terms. If logs are accessible to operators, aggregated centrally, or retained long-term, this creates a confidentiality risk and unnecessary exposure of user-provided data.
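One option is to log a summary of each query instead of its text. The sketch below is illustrative only: the logger name `kb.search` and the helpers `describe_query` and `log_search` are assumptions, not names from the skill.

```python
import hashlib
import logging

logger = logging.getLogger("kb.search")  # hypothetical logger name


def describe_query(query: str) -> str:
    """Return a loggable summary: query length plus a short SHA-256 prefix."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()[:12]
    return f"len={len(query)} sha256={digest}"


def log_search(query: str) -> None:
    """Log that a search happened without recording the raw query text."""
    logger.info("search query %s", describe_query(query))
```

The digest lets operators correlate repeated queries during debugging. An unsalted hash of a short query can still be guessed by trying likely terms, so stricter environments may prefer to log the length alone.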

Missing User Warnings

Medium

Confidence: 78%
Finding: The KEEP_BOTH path renames an existing vault note automatically and then writes a replacement KB version, causing silent file mutation in user content without confirmation. In a sync context, especially with attacker-controlled or malformed vault inputs, this can lead to integrity loss, broken links/workflows, or unexpected overwrites/duplication that are hard to detect and recover from.

Missing User Warnings

Medium

Confidence: 91%
Finding: This class exposes note creation, modification, move, replacement, and deletion operations directly with no visible guardrails such as confirmation hooks, dry-run mode, path/scope restrictions, or explicit safety API boundaries in this file. In an agent setting, these methods materially increase the risk of unintended or over-broad filesystem changes if a higher-level workflow passes attacker-influenced paths or executes actions without user confirmation.

Missing User Warnings

Medium

Confidence: 97%
Finding: The function reads arbitrary file content from any absolute path without enforcing that the path belongs to the configured vault. In an agent or skill context, this widens the data-access scope from 'vault notes' to 'any local file the process can read,' enabling unintended disclosure of credentials, SSH keys, config files, or other private data if a caller can influence the path.

Missing User Warnings

Medium

Confidence: 84%
Finding: delete_note allows permanent deletion when backup=False with no safeguard that the resolved path stays inside the vault root. In an agent context where file paths may be influenced by external input, this can enable unintended or destructive deletion of arbitrary files reachable by path traversal or absolute-path input.

Missing User Warnings

Medium

Confidence: 91%
Finding: `cleanup_orphaned_fk()` performs irreversible `DELETE` operations on database records without any confirmation, dry-run mode, audit trail of deleted row identities, or user-facing disclosure in the normal audit flow. In an administrative skill, silent destructive maintenance increases the risk of accidental or unauthorized data loss if the function is later wired in, imported, or called by automation.
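A dry-run default with an explicit opt-in for deletion would address most of this finding. The sketch below assumes a SQLite database and a hypothetical two-table schema (`files` and `sections` with a `file_id` reference); the report does not show the skill's real schema, and `cleanup_orphaned_sections` is an illustrative name rather than the skill's `cleanup_orphaned_fk()`.

```python
import sqlite3
from typing import List


def cleanup_orphaned_sections(conn: sqlite3.Connection, *,
                              apply: bool = False) -> List[int]:
    """Report orphaned section IDs; delete them only when apply=True."""
    rows = conn.execute(
        "SELECT s.id FROM sections AS s "
        "LEFT JOIN files AS f ON f.id = s.file_id "
        "WHERE f.id IS NULL ORDER BY s.id"
    ).fetchall()
    orphan_ids = [row[0] for row in rows]
    if apply and orphan_ids:
        with conn:  # commits on success, rolls back if the delete fails
            conn.executemany("DELETE FROM sections WHERE id = ?",
                             [(i,) for i in orphan_ids])
    # The returned IDs double as an audit record of what was (or would be) removed.
    return orphan_ids
```

Calling the function without arguments beyond the connection never modifies data, so an audit script that imports it stays read-only unless a caller passes `apply=True` deliberately.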

Missing User Warnings

Medium

Confidence: 85%
Finding: The script recursively scans library and workspace paths and exports file metadata, including full paths, sizes, and modification times, to CSV and log files without any consent prompt, scoping control, or minimization. In environments where filenames or directory structures contain sensitive business or personal information, these outputs can leak inventory data and create a privacy/security exposure, especially because the scan includes the broader workspace and writes durable audit artifacts.

Unpinned Dependencies

Low

Category: Supply Chain
Content:
    # ML/Torch
    torch>=2.0.0
    numpy
    tqdm
    # Utils
Confidence: 96%
Finding: numpy

Unpinned Dependencies

Low

Category: Supply Chain
Content:
    # ML/Torch
    torch>=2.0.0
    numpy
    tqdm
    # Utils
    requests
Confidence: 96%
Finding: tqdm

Unpinned Dependencies

Low

Category: Supply Chain
Content:
    tqdm
    # Utils
    requests
Confidence: 96%
Finding: requests
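The three unpinned-dependency findings can be reproduced with a small check that flags any requirement line lacking an exact `==` pin. This is a simplified sketch with a hypothetical `unpinned_requirements` helper. It does not implement the full requirements-file grammar, and it is stricter than the scanner above in that it also flags lower bounds such as `torch>=2.0.0`.

```python
import re
from typing import List

# Matches "name==version", optionally with extras; anything else counts as unpinned.
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that lack an exact '==' pin."""
    flagged = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments, then whitespace
        # Skip blank lines and pip options such as "-r other.txt".
        if line and not line.startswith("-") and not _PINNED.match(line):
            flagged.append(line)
    return flagged
```

Run against the excerpt quoted in the findings, it would report `torch>=2.0.0`, `numpy`, `tqdm`, and `requests`.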

VirusTotal

65/65 vendors flagged this skill as clean.
