Release 20260324

Security checks across malware telemetry and agentic risk

Overview

This literature-analysis skill mostly matches its stated purpose, but it needs review because its graph editor exposes unauthenticated file-changing APIs and some LLM features can send PDF or graph content to third parties.

Review before installing. Run the graph server only on trusted machines and networks, preferably after changing it to bind to 127.0.0.1 and removing wildcard CORS. Avoid using private PDFs, unpublished manuscripts, confidential reading lists, or sensitive graph data with LLM/API features unless you are comfortable with that content leaving your machine. Keep backups of graph JSON files before using serve, remove-seed, or remove-paper, and avoid passing Zotero API keys directly on the command line.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (23)

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The handler enables unauthenticated access and sets Access-Control-Allow-Origin to '*' for GET/POST/OPTIONS, while the service includes state-changing endpoints like add, convert, delete, enrich, and save. Combined with no auth checks, any website a user visits could issue cross-origin requests to the server and modify the graph if the service is reachable, making this a real CSRF-like/local service abuse issue.
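One possible mitigation for this finding, sketched under assumptions: replace the wildcard with an origin allowlist and require a shared token on state-changing requests. The names `ALLOWED_ORIGINS`, `cors_origin_header`, and `is_request_allowed` are hypothetical and do not come from the skill's code; the port in the allowlist is also an assumption.

```python
import hmac

# Hypothetical allowlist; the skill's real port and origins are not known here.
ALLOWED_ORIGINS = {"http://127.0.0.1:8000", "http://localhost:8000"}
STATE_CHANGING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def cors_origin_header(origin):
    """Return the Access-Control-Allow-Origin value, or None to omit the header."""
    return origin if origin in ALLOWED_ORIGINS else None


def is_request_allowed(method, origin, presented_token, expected_token):
    """Decide whether a request may proceed.

    Read-only methods pass through (without a CORS header, browsers will not
    expose the response to foreign pages). State-changing methods need an
    allowed origin, or no Origin header at all (non-browser clients), plus a
    token that matches the server's secret.
    """
    if method.upper() not in STATE_CHANGING_METHODS:
        return True
    if origin is not None and origin not in ALLOWED_ORIGINS:
        return False
    if not presented_token:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented_token.encode(), expected_token.encode())
```

A handler would call `is_request_allowed` before dispatching to endpoints such as add or delete, and would send the CORS header only when `cors_origin_header` returns a value.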

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The startup logs tell users the server is available at localhost, but the actual bind address is 0.0.0.0, exposing the service on all network interfaces. Because the API is unauthenticated and can mutate local graph data, this discrepancy can unintentionally expose the service to other machines on the network and increases the attack surface substantially.
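A minimal sketch of keeping the bind address and the startup message in agreement, using only the standard library. `GraphHandler`, `make_server`, and `describe` are hypothetical stand-ins and not the skill's actual server code.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

LOOPBACK_HOST = "127.0.0.1"  # loopback only, instead of 0.0.0.0


class GraphHandler(BaseHTTPRequestHandler):
    """Hypothetical placeholder for the graph editor's request handler."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=0):
    """Create a server bound to loopback; port 0 lets the OS pick a free port."""
    return ThreadingHTTPServer((LOOPBACK_HOST, port), GraphHandler)


def describe(server):
    """Build the startup message from the real bound address, not a constant."""
    host, port = server.server_address[:2]
    return f"http://{host}:{port}"
```

Calling `describe(server)` for the log line means the printed URL always reflects the interface the socket is actually bound to, so the message cannot drift from the behavior.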

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: This export module goes beyond static HTML generation by embedding browser-side logic for live LLM API calls and API-key handling. That materially expands the attack surface of an exported artifact: opening the HTML can trigger sensitive-data handling and outbound requests that users may not expect from an 'export' feature.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The docstring claims API keys are never embedded in the HTML output, but the generated page stores user-entered keys in sessionStorage. While not hardcoded into the file at export time, this still persists credentials in browser-accessible storage where any script running in the page origin can read them, and the misleading claim increases the chance of unsafe use.

Missing User Warnings

Medium

Confidence: 92%
Finding: The README states that uploaded PDFs and graph mutations are saved immediately, but it does not prominently warn users about the privacy and data-retention implications of that persistence. In a research tool that handles local documents and bibliographic data, users may reasonably assume temporary processing; silent persistence increases the risk of exposing sensitive unpublished papers, annotations, or local research artifacts to other local users, backups, or later unintended sharing.

Missing User Warnings

Medium

Confidence: 95%
Finding: The README encourages use of multiple external academic APIs and 20+ LLM providers, but does not clearly warn that queries, titles, abstracts, PDFs, graph contents, or summaries may be transmitted to third parties. For an agent skill used in research workflows, this can leak sensitive topics, unpublished manuscripts, proprietary PDFs, or internal literature maps to external services without informed user consent.

Vague Triggers

Medium

Confidence: 86%
Finding: The invocation description is broad enough to activate on generic research-related requests, which can cause the agent to invoke a powerful skill with Bash, file write, network access, and browser-serving workflows more often than necessary. Over-broad triggering increases the chance of unnecessary external calls, file creation, or side effects in contexts where the user only wanted lightweight advice or discussion.

Missing User Warnings

Medium

Confidence: 95%
Finding: The documentation describes persistent graph mutations, overwriting files, and deletion operations ('serve', 'remove-seed', 'remove-paper') without requiring confirmation or prominently warning that changes are immediately written to disk. In an agent setting, this creates a real risk of unintended destructive file modifications from ambiguous user prompts or autonomous tool use.
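One way to reduce the risk described here is to ask for confirmation and take a timestamped copy of the graph JSON before any mutating command runs. The sketch below is an assumption about how that could look; `backup_graph` and `confirm_and_backup` are hypothetical helpers, not part of the skill.

```python
import shutil
import time
from pathlib import Path


def backup_graph(graph_path):
    """Copy the graph file next to itself with a timestamped .bak suffix."""
    src = Path(graph_path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dst)  # copy2 also preserves file metadata
    return dst


def confirm_and_backup(graph_path, action, ask=input):
    """Ask before a destructive action; return the backup path, or None if declined."""
    answer = ask(f"{action} will modify {graph_path} on disk. Continue? [y/N] ")
    if answer.strip().lower() not in {"y", "yes"}:
        return None
    return backup_graph(graph_path)
```

A command such as remove-paper would proceed only when `confirm_and_backup` returns a path. The `ask` parameter is injectable so that an agent runtime can route the question to the user instead of reading stdin.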

Missing User Warnings

Medium

Confidence: 93%
Finding: The code extracts substantial text from a local PDF and sends it to an LLM service via llm_chat() without any consent gate, disclosure, or data-classification check. Academic PDFs can contain unpublished manuscripts, licensed content, reviewer annotations, or other sensitive material, so silently transmitting text to a third-party model provider creates a real privacy and data-handling risk.

Missing User Warnings

Medium

Confidence: 93%
Finding: Accepting a Zotero API key as a command-line argument can expose the secret through shell history, process listings, job control tools, audit logs, and CI command logs. In this academic-research context, users are likely to run the CLI on shared workstations, remote servers, or notebooks, which makes inadvertent credential disclosure more plausible.
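A common alternative to a command-line argument is to read the key from an environment variable and fall back to a hidden interactive prompt. The sketch below assumes a variable named `ZOTERO_API_KEY`; that name and the `load_zotero_key` helper are hypothetical and not taken from the skill.

```python
import getpass
import os

ENV_VAR = "ZOTERO_API_KEY"  # hypothetical variable name


def load_zotero_key(env=None, prompt=getpass.getpass):
    """Return the API key from the environment, else from a hidden prompt.

    Neither path places the secret in argv, so it does not appear in shell
    history or process listings.
    """
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if key:
        return key
    key = prompt("Zotero API key: ").strip()
    if not key:
        raise ValueError("No Zotero API key provided")
    return key
```

Environment variables are still visible to processes running as the same user, so this narrows the exposure and does not eliminate it.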

Missing User Warnings

Medium

Confidence: 91%
Finding: The LLM summary path transmits titles, authors, years, citation counts, and up to 300 characters of each abstract to an external LLM provider whenever one is configured. In an academic literature tool, these inputs may include unpublished, licensed, institution-specific, or user-curated research data, and this file contains no consent gate, redaction, or provider-scope restriction before exfiltrating that content.

Missing User Warnings

Medium

Confidence: 93%
Finding: This is a real privacy/data-handling vulnerability: the code sends up to 3000 characters of first-page PDF text to an external LLM service for metadata extraction without any consent gate, disclosure, or data-classification check. Academic PDFs can contain unpublished manuscripts, author emails, affiliations, submission notes, or other sensitive content, so this creates an unintended exfiltration path to a third-party model provider.

Missing User Warnings

Medium

Confidence: 95%
Finding: The code extracts up to 12,000 characters of PDF text and sends it to an LLM service via llm_chat() to recover references, but there is no consent gate, redaction, or user-facing disclosure at the call site. Because uploaded PDFs may contain unpublished manuscripts, licensed content, reviewer comments, or other sensitive text, this creates an unintended data exfiltration path to a third-party model provider.

Missing User Warnings

Medium

Confidence: 93%
Finding: The code sends paper metadata and the user's API key to third-party LLM endpoints directly from the browser, but only provides limited UX hints rather than a clear consent flow describing what data leaves the local file. In an academic-literature tool, abstracts, URLs, and metadata may still be confidential or unpublished, so silent or poorly explained transmission creates a real privacy and data-governance risk.

Missing User Warnings

Medium

Confidence: 91%
Finding: The llm_chat function sends arbitrary prompt content to whichever external provider is detected, but the code contains no built-in consent gate, warning, or data-classification check before transmitting user content off-system. In a literature tool, prompts may include unpublished manuscripts, notes, PDFs, or sensitive research data, making silent exfiltration to third-party AI services a real privacy and compliance risk.
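A consent gate of the kind this finding asks for could wrap every outbound call. The sketch below does not assume anything about the real signature of `llm_chat`; instead it takes the sending function as a parameter. `guarded_llm_call`, `send`, and `confirm` are hypothetical names, and the default character limit is illustrative only.

```python
def guarded_llm_call(text, provider, send, confirm, max_chars=3000):
    """Send at most max_chars of text to a provider, but only after consent.

    send:    callable that performs the actual request (for example, a thin
             wrapper around the project's own LLM helper).
    confirm: callable that shows a question and returns the user's answer.
    Returns the provider's reply, or None when the user declines.
    """
    excerpt = text[:max_chars]
    question = f"Send {len(excerpt)} characters to {provider}? [y/N] "
    if confirm(question).strip().lower() not in {"y", "yes"}:
        return None  # nothing leaves the machine
    return send(excerpt)
```

Because the prompt states the provider and the exact amount of text, the user sees what will leave the machine before it does. Defaulting to "no" keeps autonomous or ambiguous invocations from transmitting content silently.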

Missing User Warnings

Medium

Confidence: 93%
Finding: When GROBID is enabled, the code sends the full PDF file to an HTTP service on localhost without any explicit consent, warning, or trust boundary check at the point of transmission. Even though the endpoint is local, localhost services may be containerized, proxied, or reachable by other local users/processes, so sensitive paper contents could be exposed unexpectedly in a research workflow that may handle unpublished or proprietary documents.

Missing User Warnings

Medium

Confidence: 91%
Finding: This code sends bibliographic metadata derived from user-supplied or PDF-extracted references to the CrossRef API, including title and potentially author names, without any indication here of consent, disclosure, or an offline-only option. Even though the data is academic metadata rather than secrets, uploaded PDFs can contain unpublished, proprietary, or sensitive reading lists, so transmitting them to third parties creates a real privacy and data-governance risk.

Missing User Warnings

Medium

Confidence: 91%
Finding: This code transmits reference titles, and optionally publication year, to the OpenAlex API for resolution without any visible notice or permission check in this component. In a research assistant context, users may upload private manuscripts or confidential citation lists, so silent third-party disclosure can leak sensitive research interests or unpublished work.

Unpinned Dependencies

Low

Category: Supply Chain
Content: httpx pymupdf scholarly
Confidence: 96%
Finding: httpx

Unpinned Dependencies

Low

Category: Supply Chain
Content: httpx pymupdf scholarly
Confidence: 95%
Finding: pymupdf

Unpinned Dependencies

Low

Category: Supply Chain
Content: httpx pymupdf scholarly
Confidence: 92%
Finding: scholarly
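The three findings above flag requirement lines that carry no exact version. A check like the following sketch could catch that before release; it assumes a pip-style requirements file, treats only `==` pins as pinned, and the `unpinned` helper is hypothetical.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned(requirements_text):
    """Return requirement lines that lack an exact '==' version pin."""
    result = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            result.append(line)
    return result
```

Run against the flagged content, `unpinned("httpx\npymupdf\nscholarly")` returns all three names. Exact pins do not guarantee integrity by themselves; pip's hash-checking mode is a stricter option.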

Known Vulnerable Dependency: httpx — 2 advisory(ies): CVE-2021-41945 (Improper Input Validation in httpx); CVE-2021-41945 (Encode OSS httpx <=1.0.0.beta0 is affected by improper input validation in `http)

Critical

Category: Supply Chain
Confidence: 88%
Finding: httpx

Known Vulnerable Dependency: pymupdf — 1 advisory(ies): CVE-2026-3029 (PyMuPDF has a path traversal in _main_.py)

Low

Category: Supply Chain
Confidence: 82%
Finding: pymupdf

VirusTotal

67/67 vendors flagged this skill as clean.
