Calibre Converter

Security checks across malware telemetry and agentic risk

Overview

The converter's purpose is legitimate, but the package bundles a much broader Calibre server with persistent library mutation, indexing, downloading, and third-party data-sharing capabilities that deserve review before installation.

Install only if you intend to run a local Calibre gateway, not just a lightweight conversion helper. Keep it bound to localhost, require a strong API key, avoid API keys in URLs, leave content downloads and network bindings disabled unless needed, and enable VirusTotal/OpenLibrary/direct URL download features only if you accept the related privacy and network exposure.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (18)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: raise Exception("Ollama model is missing and automatic model pull is disabled") self._log(f"Model {self.embedding_model} not found. Installing...", "WARN") self._show_progress(0, 1, "Installing model", self.embedding_model) install_result = subprocess.run(['ollama', 'pull', self.embedding_model], capture_output=True, text=True) self._show_progress(1, 1, "Installing model", self.embedding_model) if install_result.returncode == 0:
Confidence: 89%
Finding: install_result = subprocess.run(['ollama', 'pull', self.embedding_model], capture_output=True, text=True)

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The application conditionally enables an endpoint that exposes active network interface URLs, which can disclose internal host topology and reachable addresses to any party that can access the API. Even with API-key protection, this increases reconnaissance value for attackers and may reveal sensitive internal or container networking details not needed for normal library operations.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: `get_book_page_pdf` advertises page-level access but returns the entire PDF. If callers rely on this method for least-privilege document sharing or page-scoped authorization, users can receive the full document instead of a single page, causing unintended data exposure. In a book-serving gateway, this mismatch is more dangerous because it can bypass business rules around partial previews, page-limited access, or copyright-restricted delivery.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The script tells the user to check CALIBRE_DB_PATH in .env, but the actual validation ignores that setting and instead checks a fixed filesystem path. This can cause the service to operate against the wrong library path, fail unexpectedly, or mislead operators into believing configuration changes are being honored when they are not.
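
A minimal sketch of validation that honors the configured value rather than a fixed path, assuming the setting is read from the CALIBRE_DB_PATH environment variable. The function name resolve_db_path and the env parameter are hypothetical and are not taken from the scanned script.

```python
import os
from pathlib import Path


def resolve_db_path(env=None):
    """Return the configured database path, or raise if it is unset or missing.

    `env` is a hypothetical injection point for testing; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get("CALIBRE_DB_PATH", "").strip()
    if not raw:
        raise ValueError("CALIBRE_DB_PATH is not set")
    path = Path(raw).expanduser()
    # Validate the path the operator actually configured, not a hard-coded one.
    if not path.exists():
        raise FileNotFoundError(f"CALIBRE_DB_PATH does not exist: {path}")
    return path
```

With this shape, the error message and the check refer to the same value, so an operator who edits .env sees the change take effect or gets an error naming the path that failed.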

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: Automatic model download is broader than the core document conversion/search purpose and introduces implicit network and persistence side effects. In unattended or privileged runs, this can fetch unreviewed artifacts and change local state unexpectedly, which is a real security concern even if the feature is convenience-driven.

Vague Triggers

Medium

Confidence: 93%
Finding: The activation text is broad enough to trigger the skill for general discussion of document or e-book conversion, even when the user did not clearly intend to use this Calibre-integrated workflow. That can cause unintended side effects because this skill may invoke a backend service that performs conversion and registers new formats in the library.

Vague Triggers

Medium

Confidence: 91%
Finding: The 'Quando usar' ('When to use') triggers include generic requests, such as preparing an e-book for a device or mentioning conversion-related terms, that can match ordinary conversation too loosely. In a skill that can modify library state via server-side conversion and registration, ambiguous activation increases the chance of accidental tool use.

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill describes server responsibilities, including registering a new format and potentially regenerating an existing one when forced, but does not clearly warn the user before the action is taken. This creates a consent and integrity risk because the operation changes persistent library state and may overwrite existing derived formats.

Missing User Warnings

Medium

Confidence: 95%
Finding: The README encourages optional VirusTotal scanning for uploaded and retrieved ebooks but does not clearly warn that file hashes and potentially file contents may be transmitted to a third-party service. In a system handling local library content, this can create unexpected privacy and data-governance exposure, especially for copyrighted, sensitive, or private documents.

Missing User Warnings

Medium

Confidence: 91%
Finding: The manual-download endpoint accepts a user-supplied file path, probes the local filesystem with Path.exists()/stat(), computes a hash, stores the path, and then returns that path in API responses and logs. This can disclose sensitive local filesystem structure and existence information to clients, and if exposed to untrusted users it becomes an information disclosure issue that also aids further targeting of host files.

Missing User Warnings

Medium

Confidence: 95%
Finding: The file-matching endpoint reveals whether a given file hash or path exists in the queue and returns internal metadata including tracked local file_path, queue ID, status, and Calibre linkage. This creates an information disclosure oracle that can leak host filesystem details and operational metadata to callers, especially if an attacker can query arbitrary paths or hashes.

Missing User Warnings

Medium

Confidence: 92%
Finding: The endpoint forwards user search queries to the external OpenLibrary service when openlibrary_search is enabled, which can disclose potentially sensitive user interests or internal search terms to a third party without explicit user consent or a clear privacy boundary. In this API context, the data flow is real and occurs by default, so this is a genuine privacy/security concern even though it is not a code execution issue.

Missing User Warnings

Medium

Confidence: 97%
Finding: The authentication helper accepts the API key from the query string via `request.query_params.get("api_key", "")`. Query-string credentials are routinely captured in browser history, reverse-proxy logs, analytics, referrer leaks, and server access logs, making secret exposure much more likely than header-based authentication.
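
A minimal sketch of a header-only check, assuming the key is sent in an X-API-Key header and that headers arrive as a mapping with lowercase names. Both the header name and the is_authorized helper are hypothetical; the scanned helper's actual signature is not shown in this report.

```python
import hmac


def is_authorized(headers, query_params, expected_key):
    """Accept the API key only from the request header.

    `query_params` is accepted but deliberately ignored, so a key placed in the
    URL never authenticates and callers have no reason to put it there.
    """
    if not expected_key:
        return False
    supplied = headers.get("x-api-key", "")
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

Rejecting query-string keys outright, rather than merely preferring the header, keeps the secret out of access logs and browser history by construction.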

Missing User Warnings

Medium

Confidence: 92%
Finding: The service downloads from a caller-controlled URL and writes to a caller-influenced filename in the configured download directory without any approval, allowlist, or overwrite protection. In this gateway context, that enables server-side arbitrary file retrieval (including SSRF to internal endpoints if untrusted URLs are accepted upstream) and silent replacement of existing downloaded files, which can lead to data loss or retrieval of sensitive internal content.
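
A minimal sketch of the three missing guards (host allowlist, filename sanitization, overwrite protection), run before any network request is made. ALLOWED_HOSTS, its example entry, and plan_download are hypothetical names for illustration, not part of the scanned service.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = {"example.org"}


def plan_download(url, filename, download_dir):
    """Validate a download request and return the target path, without fetching."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"URL not allowed: {url}")
    # Keep only the final path component so "../" cannot escape download_dir.
    safe_name = Path(filename).name
    if not safe_name or safe_name in {".", ".."}:
        raise ValueError("invalid filename")
    target = Path(download_dir) / safe_name
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    return target
```

A hostname allowlist narrows the SSRF surface but does not by itself cover redirects or DNS-based tricks, so it is best treated as one layer rather than a complete defense.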

Missing User Warnings

Medium

Confidence: 92%
Finding: This code uploads local file contents to VirusTotal, which is a third-party external service, without any indication in this component that the user has been explicitly informed or has consented. That creates a real privacy and data-handling risk because sensitive, proprietary, or regulated files may be disclosed outside the system boundary during scanning.

Missing User Warnings

Medium

Confidence: 94%
Finding: This path uploads raw in-memory file bytes directly to VirusTotal, again causing third-party disclosure of the full content with no user-facing warning or consent visible in this code. Because this method may be used on transient or generated data, it increases the chance that sensitive material is sent externally without the caller realizing the privacy implications.
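
A minimal sketch of an opt-in gate for the two upload findings above, assuming a hash lookup is an acceptable default and a full upload needs an explicit flag. The prepare_scan function and its allow_upload parameter are hypothetical; the sketch makes no network call and only decides what would leave the machine.

```python
import hashlib


def prepare_scan(data, allow_upload=False):
    """Describe what would be sent to the scanning service for these bytes.

    By default only the SHA-256 digest is exposed; the raw content is included
    only when the caller explicitly opts in.
    """
    digest = hashlib.sha256(data).hexdigest()
    if allow_upload:
        return {"mode": "upload", "sha256": digest, "payload": data}
    return {"mode": "hash_only", "sha256": digest}
```

A digest still reveals that a specific known file is in the library, so a hash-only default reduces exposure without removing the need for a user-facing notice.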

Missing User Warnings

Low

Confidence: 85%
Finding: The script logs the full database path supplied via CLI or environment when connecting, which can disclose filesystem layout and potentially sensitive usernames, mount points, or library locations to stderr logs. While this is a read-only utility and the leak is limited to metadata about the host environment, path disclosure can aid reconnaissance or expose privacy-sensitive information in shared logs.
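
A minimal sketch of path redaction for log messages, assuming that the final path component is enough for troubleshooting. The redact_path helper is hypothetical and is not part of the scanned script.

```python
from pathlib import Path


def redact_path(path):
    """Keep only the final component so logs do not expose directory layout."""
    name = Path(path).name
    return f".../{name}" if name else "<redacted>"
```

For example, redact_path("/home/alice/Calibre Library/metadata.db") yields ".../metadata.db", which identifies the file without recording the username or mount point.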

Missing User Warnings

Medium

Confidence: 92%
Finding: Running `ollama pull` can consume network resources and modify the host by downloading models, and this can be enabled through environment or flags without an in-band warning/confirmation when the action occurs. In automation contexts, that makes the behavior more dangerous because it violates least surprise and may bypass change-control expectations.
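
A minimal sketch of gating the pull behind both an explicit opt-in and an in-band confirmation. The ensure_model function, its parameters, and the confirm callback are hypothetical; only the ['ollama', 'pull', model] command and the subprocess.run arguments come from the finding quoted in this report.

```python
import subprocess


def ensure_model(model, installed, allow_pull=False, confirm=None, run=subprocess.run):
    """Pull a missing model only when opted in and, if a callback is given, confirmed.

    `installed` is a collection of model names already present; `run` is
    injectable so the behavior can be exercised without invoking ollama.
    """
    if model in installed:
        return "present"
    if not allow_pull:
        raise RuntimeError(f"Model {model} is missing and automatic pull is disabled")
    # Ask at the moment the side effect would occur, not only at configuration time.
    if confirm is not None and not confirm(f"Download model {model} with 'ollama pull'?"):
        raise RuntimeError("Model pull declined")
    result = run(["ollama", "pull", model], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"ollama pull failed: {result.stderr.strip()}")
    return "pulled"
```

In unattended runs the confirm callback can be replaced by one that always returns False, so a missing model fails loudly instead of triggering a download.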

VirusTotal

64/64 vendors flagged this skill as clean.
