Life Science Database Query

Security checks across malware telemetry and agentic risk

Overview

This looks like a legitimate life-science database skill, but its helper scripts are broader than advertised and can contact arbitrary URLs and write raw responses to arbitrary local paths.

Install only if you are comfortable with a broad networked research helper. Use it only with public, non-sensitive research inputs, and avoid passing Authorization or Cookie headers or private tokens. Do not enable save_raw or set raw_output_path unless you choose a safe temporary location and understand that full API responses will remain on disk.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (111)

Tainted flow: 'raw_text' from requests.post (line 285, network input) → pathlib.Path.write_text (file write)

Medium

Category: Data Flow
Content:
    )
    path = Path(config["raw_output_path"] or "/tmp/opentargets-associated-diseases.json")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(raw_text, encoding="utf-8")
    raw_output_path = str(path)
    if disease_filter and not filtered_rows:
Confidence: 93%
Finding: path.write_text(raw_text, encoding="utf-8")

Tainted flow: 'raw_text' from requests.post (line 109, network input) → pathlib.Path.write_text (file write)

Medium

Category: Data Flow
Content:
    raw_text = json.dumps(data, indent=2)
    path = Path(config["raw_output_path"] or "/tmp/opentargets-graphql.json")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(raw_text, encoding="utf-8")
    raw_output_path = str(path)
    if "errors" in data:
Confidence: 92%
Finding: path.write_text(raw_text, encoding="utf-8")
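
One way to address the two tainted flows above is to resolve the requested path and refuse anything that lands outside a dedicated directory. The following is a minimal sketch, not the skill's code: `RAW_OUTPUT_ROOT` and `resolve_raw_output_path` are hypothetical names, and the root directory is an assumption.

```python
from pathlib import Path
from typing import Optional

# Hypothetical sandbox directory; the scanned scripts define no such constant.
RAW_OUTPUT_ROOT = Path("/tmp/life-science-raw")


def resolve_raw_output_path(requested: Optional[str], default_name: str) -> Path:
    """Return a path inside RAW_OUTPUT_ROOT, or raise ValueError."""
    root = RAW_OUTPUT_ROOT.resolve()
    # Relative inputs are joined onto the root. Absolute inputs replace it and
    # are then rejected by the containment check below, as are "../" escapes.
    candidate = (root / (requested or default_name)).resolve()
    if not candidate.is_relative_to(root):  # requires Python 3.9+
        raise ValueError(f"raw_output_path escapes {root}: {requested!r}")
    return candidate
```

A caller would then write to `resolve_raw_output_path(config["raw_output_path"], "opentargets-graphql.json")` instead of passing the configured value straight to `Path`.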

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The client accepts a user-controlled base_url and also allows absolute URLs in path, so it can send requests to essentially any HTTP(S) destination rather than being constrained to approved life-science data sources. In an agent context this creates an SSRF-style capability and a generic network pivot, which is more dangerous because the skill is presented as a domain-specific database query tool but actually exposes broad outbound request functionality.
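
The finding above could be mitigated by validating the final URL against a host allowlist after joining `base_url` and `path`. This is a hedged sketch: `ALLOWED_HOSTS` and `build_url` are hypothetical, and the listed hosts are illustrative assumptions rather than the set the skill actually requires.

```python
from urllib.parse import urljoin, urlsplit

# Illustrative allowlist; the hosts the skill really needs are an assumption.
ALLOWED_HOSTS = {"api.platform.opentargets.org", "www.ebi.ac.uk"}


def build_url(base_url: str, path: str) -> str:
    """Join base_url and path, rejecting any destination outside ALLOWED_HOSTS."""
    # urljoin lets an absolute URL in `path` override the base, so the check
    # must run on the joined result, not on base_url alone.
    url = urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"destination not allowed: {url}")
    return url
```

Checking the joined URL covers both inputs named in the finding: a caller-chosen `base_url` and an absolute URL smuggled in through `path`.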

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The save_raw feature writes attacker-controlled response content to an attacker-influenced filesystem path, enabling arbitrary file creation or overwrite wherever the process has permissions. Even without code execution, this can expose sensitive data, clobber local files, or plant misleading artifacts on disk; in an agent environment it is unjustified for a database lookup helper.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: This script is a fully generic HTTP client: it accepts arbitrary base URLs, absolute paths, methods, headers, query parameters, and request bodies, then performs the request without any allowlist or scope restriction. In a life-science database skill, that exceeds the stated capability and can be abused for SSRF, access to unintended internal or cloud metadata endpoints, or exfiltration via attacker-chosen headers and destinations.
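
To illustrate the internal and cloud-metadata exposure described above, a request helper could refuse destinations that resolve to non-public addresses. The sketch below uses only the standard library; `is_public_destination` is a hypothetical name, and the check is an assumption about one possible mitigation, not something present in the scanned scripts.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_destination(url: str) -> bool:
    """Return True only if every address the host resolves to is globally routable."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    addresses = {ipaddress.ip_address(info[4][0]) for info in infos}
    return bool(addresses) and all(addr.is_global for addr in addresses)
```

This rejects loopback, private ranges, and link-local addresses such as 169.254.169.254. It does not by itself prevent DNS rebinding, because the name may resolve differently when the request is actually sent.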

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The `save_raw` feature writes attacker-influenced remote response data to a caller-specified filesystem path using `Path(raw_output_path)` with no directory restriction or filename sanitization. That enables arbitrary file write within the process permissions, potentially overwriting application files, planting content in sensitive locations, or filling disk with large responses.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The script accepts a user-controlled raw_output_path and, when save_raw is enabled, writes API response content to that path with no directory restriction, allowlist, or path safety enforcement. In an agent/runtime context, this creates an arbitrary local file write primitive that exceeds the stated purpose of a database-query skill and could overwrite files, plant data in sensitive locations, or interfere with other components depending on filesystem permissions.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: This finding is substantively the same issue: the skill exposes arbitrary file write capability unrelated to its advertised read/query behavior. Even though the written content is JSON from a remote API rather than attacker-crafted shell code, allowing users or upstream agents to choose any path can still enable unauthorized persistence, overwriting of local files, and abuse of trusted filesystem locations.

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The helper allows absolute URLs in `path`, which bypasses the provided `base_url` and turns the skill into a generic outbound HTTP client. In an agent context, that creates SSRF-style capability and policy bypass risk because the tool can contact arbitrary internal or unrelated external services rather than only the intended life-science databases.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: Supporting arbitrary POST requests with caller-controlled headers and bodies gives this script broad API-client functionality far beyond a read-only public database lookup tool. In practice, this can be abused to send authenticated-looking requests, trigger state-changing operations on third-party services, or exfiltrate data to attacker-controlled endpoints when combined with the unrestricted URL handling.
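
A read-oriented helper could narrow the POST capability described above by permitting POST only to endpoints known to require it. This is a minimal sketch under stated assumptions: `POST_ENDPOINTS` and `check_request` are hypothetical, and the single GraphQL URL is an illustrative entry.

```python
# Assumption: lookups need only GET, plus POST to known GraphQL endpoints.
POST_ENDPOINTS = {"https://api.platform.opentargets.org/api/v4/graphql"}


def check_request(method: str, url: str) -> str:
    """Return the normalized method if the (method, url) pair is permitted."""
    normalized = method.upper()
    if normalized == "GET":
        return normalized
    if normalized == "POST" and url in POST_ENDPOINTS:
        return normalized
    raise ValueError(f"{normalized} {url} is not permitted")
```

Any other verb, or a POST to an unlisted URL, raises before a request is built, which removes the state-changing and exfiltration paths the finding describes.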

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The client constructs the destination URL from caller-supplied `base_url` and `path`, and explicitly allows fully qualified `http://`/`https://` paths to override the base. In a skill meant for BioStudies/ArrayExpress lookup, this creates a generic outbound HTTP capability that can be repurposed for SSRF, data exfiltration, or access to unintended services instead of being constrained to the expected public databases.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The code permits caller-controlled POST requests, arbitrary headers, and arbitrary JSON/form bodies through `session.request`, turning a read-oriented database helper into a general-purpose HTTP client. In an agent environment, that broadens the attack surface substantially by enabling authenticated-style header spoofing, unintended state-changing requests, and transmission of sensitive data to attacker-chosen endpoints.

Context-Inappropriate Capability

Low

Confidence: 89%
Finding: `save_raw` and `raw_output_path` let the caller write arbitrary response content to a caller-influenced filesystem path. Even though this is framed as output capture, it adds an unnecessary local file-write primitive to a database query skill, which can enable persistence of sensitive fetched data, overwrite files writable by the process, or create artifacts for later misuse.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: `_build_url` allows `path` to be a fully qualified `http://` or `https://` URL, which bypasses the supplied `base_url` entirely and turns this database-specific helper into a generic outbound HTTP client. In an agent environment, that can enable SSRF-style access to arbitrary external or internal hosts, data exfiltration to attacker-controlled endpoints, or misuse of inherited headers meant only for trusted life-science APIs.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: When `save_raw` is enabled, `_save_raw_output` writes arbitrary network response bodies to a caller-controlled filesystem path, with parent-directory creation. That allows an untrusted caller to persist potentially sensitive or attacker-crafted content anywhere the process can write, which exceeds the stated query-only purpose and can clobber files, stage data for later use, or create local disclosure risks.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The client accepts an arbitrary base_url and also allows an absolute URL in path, which makes it a generic outbound HTTP client rather than a cellxgene-scoped integration. In an agent environment, this enables SSRF-style access to unintended internal or external endpoints, data exfiltration, and bypass of skill-purpose boundaries if an attacker can influence inputs.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Caller-supplied headers are copied directly into the session, allowing arbitrary Authorization, Cookie, or other sensitive headers to be forwarded to any target chosen by the caller. Combined with the generic request capability, this can leak credentials to attacker-controlled hosts or misuse privileged tokens against unrelated services.
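
The header-forwarding risk above could be reduced by dropping credential-bearing headers before they reach the session. A minimal sketch follows; `BLOCKED_HEADERS` and `filter_headers` are hypothetical names, and the blocked set is an assumption that a real deployment would tune.

```python
# Header names treated as sensitive in this sketch; extend as needed.
BLOCKED_HEADERS = {"authorization", "cookie", "proxy-authorization", "x-api-key"}


def filter_headers(headers: dict[str, str]) -> dict[str, str]:
    """Drop credential-bearing headers, comparing names case-insensitively."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in BLOCKED_HEADERS
    }
```

HTTP header names are case-insensitive, so the comparison lowercases each name; `Cookie`, `cookie`, and `COOKIE` are all removed.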

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The code writes remote response content to a caller-controlled local path, creating an arbitrary file write primitive within the permissions of the running process. An attacker could overwrite application files, place sensitive data in readable locations, or fill disk/storage with large responses, especially because the source URL is also caller-controlled.
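
The overwrite and disk-filling concerns above could be limited by refusing to replace existing files and by capping how many bytes are written. The sketch below is illustrative only: `write_capped` and `MAX_RAW_BYTES` are hypothetical, and the 5 MiB limit is an arbitrary assumption.

```python
from pathlib import Path
from typing import Iterable

MAX_RAW_BYTES = 5 * 1024 * 1024  # arbitrary 5 MiB cap chosen for illustration


def write_capped(path: Path, chunks: Iterable[bytes], limit: int = MAX_RAW_BYTES) -> int:
    """Write chunks to a new file, aborting once more than `limit` bytes arrive."""
    written = 0
    # Mode "xb" raises FileExistsError instead of overwriting an existing file.
    with path.open("xb") as handle:
        for chunk in chunks:
            written += len(chunk)
            if written > limit:
                # Close before unlinking so the partial file is removed cleanly.
                handle.close()
                path.unlink()
                raise ValueError(f"response exceeds {limit} bytes")
            handle.write(chunk)
    return written
```

With the `requests` library, the chunks could come from `response.iter_content(chunk_size=8192)` on a request made with `stream=True`, so an oversized body is never fully buffered.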

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The helper accepts an arbitrary base URL and also allows absolute URLs in the path, turning a ChEBI-oriented skill into a general outbound HTTP client. In an agent environment this can be abused for SSRF-style access to unintended internal or sensitive endpoints, or to exfiltrate data to attacker-controlled hosts, which exceeds the documented public-database lookup purpose.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The raw response saver writes attacker-influenced content to an arbitrary filesystem path supplied in input. This can overwrite files accessible to the process, persist sensitive remote data locally, and create a stepping stone for data leakage or operational impact unrelated to the skill's database-query role.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The client builds a URL from user-controlled `base_url` and `path`, and even accepts fully qualified URLs in `path`, enabling arbitrary outbound HTTP requests rather than restricting access to approved life-science data sources. In an agent context, this creates a server-side request capability that can be abused for data exfiltration, unexpected third-party interaction, or access to internal network resources if the runtime has network reachability.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The helper writes raw remote responses to a caller-supplied filesystem path or a predictable `/tmp` location, which exceeds the stated lookup purpose and can persist sensitive or unexpected content locally. Because the path is attacker-controlled, this can overwrite files accessible to the process or create artifacts that later processes may consume, increasing risk beyond transient querying.
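
For the predictable `/tmp` default noted above, one alternative is to let the standard library pick an unpredictable, owner-only file. This is a sketch under the assumption that callers do not need to choose the filename; `save_raw_to_private_temp` is a hypothetical helper.

```python
import os
import tempfile


def save_raw_to_private_temp(raw_text: str, prefix: str = "raw-response-") -> str:
    """Write raw_text to a fresh, unpredictably named file and return its path."""
    # mkstemp creates the file exclusively and makes it readable and writable
    # only by the creating user, so it cannot collide with a pre-planted file.
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(raw_text)
    return path
```

The caller is responsible for deleting the returned file once it is no longer needed, since `mkstemp` does not clean up after itself.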

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The client accepts an arbitrary base URL and also permits absolute URLs in the path, making it a general-purpose HTTP requester rather than a narrowly scoped ontology lookup tool. In an agent context, this can be abused for SSRF, access to unintended internal or metadata endpoints, or exfiltration to attacker-controlled hosts, which is more dangerous than the advertised life-science database role suggests.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: Supporting POST plus arbitrary JSON/form bodies expands this component from passive database retrieval into a generic request primitive capable of sending attacker-controlled payloads to remote services. Even if intended for API flexibility, in an agent skill this materially increases misuse potential, including interacting with non-read-only endpoints or assisting data exfiltration/workflow abuse.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The code can write untrusted remote response content to an arbitrary filesystem path supplied in input. This creates a local file write primitive that could overwrite sensitive files, place attacker-controlled content in predictable locations, or persist data on disk beyond user expectations.

VirusTotal

65/65 vendors flagged this skill as clean.
