OneScience-Skills

Security checks across malware telemetry and agentic risk

Overview

This OneScience skill is mostly purpose-aligned for HPC coding and testing, but it gives agents broad remote-install, SSH-config-reading, file-upload, and SLURM job-submission authority without enough consent and safety boundaries.

Install only if you intend this skill to operate on OneScience projects, remote DCU hosts, SLURM/SCnet jobs, and generated code. Before use, require the agent to ask before reading ~/.ssh/config, confirm the exact SSH host and remote commands, avoid or manually approve rm -rf onescience, review every file before upload to SCnet, and require a final confirmation before any SLURM job submission or file overwrite.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (16)

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The document is framed as a testing workflow, but it also instructs the agent to implement code fixes after analyzing failures. This expands the skill from read-only validation into code modification, increasing the chance of unintended or unauthorized changes to user code under the guise of testing.

Intent-Code Divergence

High

Confidence: 97%
Finding: The file contains contradictory instructions: it says not to modify code under test, then later instructs the agent to perform fixes. This kind of policy conflict is dangerous because an autonomous agent may choose the more permissive path and alter source code without clear authorization or proper guardrails.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The skill explicitly instructs the agent to read the user's local ~/.ssh/config before doing any installation work. That accesses sensitive local connection metadata unrelated to merely selecting an installation domain, and could expose hostnames, usernames, ProxyJump settings, and internal infrastructure details to the agent or downstream logs.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The manifest describes a domain-based installation assistant, but the skill escalates into remote host access and remote command execution. This mismatch increases the chance that users invoke the skill without realizing it will connect to remote systems and run installation commands, creating integrity and operational risk on remote infrastructure.

Vague Triggers

Medium

Confidence: 91%
Finding: The trigger keywords are overly broad, including common terms like 'data' and 'dataset', which can cause the skill to classify ordinary discussion as an actionable data-loading request. In an orchestrator that can chain into code generation or job execution, this raises the risk of unintended skill activation and downstream high-impact actions without sufficiently specific user intent.

Vague Triggers

High

Confidence: 97%
Finding: The job-submission trigger is described too broadly for a high-impact operation: phrases like 'run code' or 'execute task' can appear in benign conversation but may cause the orchestrator to submit SLURM jobs. Because this skill manages runtime submission, an accidental or attacker-induced prompt match could consume compute resources, run unsafe workloads, or affect shared infrastructure.
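
One way to narrow such a trigger is to require wording that names SLURM submission explicitly. The sketch below is illustrative only: the pattern list and the function name `is_explicit_submit_request` are hypothetical and are not taken from the skill, whose real trigger list is not shown in this report.

```python
import re

# Hypothetical explicit phrases (assumption: not the skill's actual trigger list).
EXPLICIT_SUBMIT_PATTERNS = [
    re.compile(r"\bsubmit\b.*\bslurm\b", re.IGNORECASE),
    re.compile(r"\bsbatch\b", re.IGNORECASE),
]


def is_explicit_submit_request(message: str) -> bool:
    """Return True only when the message names SLURM submission explicitly."""
    return any(pattern.search(message) for pattern in EXPLICIT_SUBMIT_PATTERNS)


# Generic phrases such as "run code" no longer match on their own.
print(is_explicit_submit_request("Can you run code for me?"))
print(is_explicit_submit_request("Please submit this job to SLURM"))
```

A stricter match like this reduces accidental activation but does not replace a confirmation step before the job is actually submitted.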

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill prominently supports submitting SLURM jobs and performing installation-related actions, but it does not consistently warn users about operational, cost, or system-side effects before those actions occur. In a research/HPC context, silent execution can lead to unintended resource consumption, policy violations, or modifications on remote systems, making the lack of explicit risk disclosure materially dangerous.

Missing User Warnings

High

Confidence: 96%
Finding: The environment-install example instructs the skill to read SSH configuration and establish a remote DCU connection without any privacy notice, host verification guidance, or explicit consent checkpoint. Accessing SSH-related material and initiating remote installation are high-impact operations that could expose sensitive infrastructure details or change remote systems unexpectedly if mis-triggered or socially engineered.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill instructs the agent to save all generated files into the current `case` directory by default, but it does not require explicit user confirmation, safe path validation, or overwrite checks. In an agentic coding context, this can lead to unintended file creation or clobbering of existing case artifacts, especially if the working directory contains valuable inputs, outputs, or prior experiment state.
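
The missing path validation and overwrite check could look roughly like the following. This is a minimal sketch, not the skill's code: `safe_write` and its parameters are hypothetical names, and it assumes generated files are meant to stay inside a single base directory.

```python
from pathlib import Path


def safe_write(base_dir: Path, relative_name: str, content: str,
               overwrite: bool = False) -> Path:
    """Write content under base_dir, refusing path escapes and silent overwrites."""
    base = base_dir.resolve()
    target = (base / relative_name).resolve()

    # Reject targets that resolve outside base_dir (e.g. "../x" or an absolute path).
    if base not in target.parents:
        raise ValueError(f"{relative_name!r} resolves outside {base}")

    target.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" raises FileExistsError instead of clobbering an existing file.
    mode = "w" if overwrite else "x"
    with open(target, mode, encoding="utf-8") as handle:
        handle.write(content)
    return target
```

With `overwrite` left at its default, an existing case artifact causes `FileExistsError`, which gives the agent a natural point to ask the user before replacing anything.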

Missing User Warnings

Medium

Confidence: 89%
Finding: The guide instructs the agent to save generated files to a default case directory when the user has not specified a path, which can cause implicit filesystem modification without explicit user consent. In an agent skill that automatically generates and writes code, this increases the risk of unintended file creation or overwriting in the workspace, especially if the agent is broadly trusted to act on repository contents.

Natural-Language Policy Violations

Medium

Confidence: 91%
Finding: The workflow hard-codes Chinese as the interaction language ('你是一名基于 OneScience 写代码的工程智能体', in English: "You are an engineering agent that writes code based on OneScience") and all required output templates are Chinese-only, without asking the user for a preferred language. This can cause misunderstandings, unsafe assumptions, or failed confirmation in a coding workflow where users must validate specifications before code generation, increasing the chance of incorrect or insecure implementations due to language mismatch.

Missing User Warnings

Medium

Confidence: 91%
Finding: The workflow directs the agent to upload the model, test scripts, and 'other necessary files' to a remote SCnet platform without any requirement to warn about secrets, proprietary code, datasets, or credentials. In this context, the broad file collection and transfer behavior creates a real risk of sensitive data exfiltration to an external system.
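
A pre-upload screen is one possible mitigation. The sketch below splits a candidate file list into files that look safe and files that need manual review. The pattern list and the name `partition_upload_candidates` are hypothetical and purely illustrative; a name-based filter like this will miss secrets embedded inside ordinary source files.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Illustrative patterns only (assumption: not a complete list of sensitive files).
SENSITIVE_PATTERNS = [".env", "*.pem", "*.key", "id_rsa*", "id_ed25519*", "credentials*"]


def partition_upload_candidates(paths):
    """Split paths into (allowed, flagged) based on file name and directory."""
    allowed, flagged = [], []
    for path in paths:
        pure = PurePosixPath(path)
        in_ssh_dir = ".ssh" in pure.parts
        name_matches = any(fnmatchcase(pure.name, pat) for pat in SENSITIVE_PATTERNS)
        if in_ssh_dir or name_matches:
            flagged.append(path)
        else:
            allowed.append(path)
    return allowed, flagged
```

Flagged files would then be shown to the user for explicit approval rather than being uploaded as part of "other necessary files".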

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill requires reading the user's local SSH config without a clear privacy warning or explicit informed consent. SSH config often contains sensitive environment details, so silently accessing it exceeds user expectations for an installer and can leak internal topology or account information.

Missing User Warnings

High

Confidence: 99%
Finding: The skill includes an unconditional destructive command, rm -rf onescience, on the remote host without warning, confirmation, or path validation. If executed in an unexpected directory or against valuable user data, it can delete existing work and cause irreversible data loss on the remote system.
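
A guarded variant would validate the target and require confirmation before the command is ever built. The following is a sketch under stated assumptions: `build_guarded_remove` is a hypothetical helper, the remote host is assumed to use POSIX paths, and the function only returns a command string; it does not connect to any host or delete anything.

```python
import posixpath
import shlex


def build_guarded_remove(install_root: str, dirname: str, confirmed: bool) -> str:
    """Return an rm command for install_root/dirname only if all checks pass."""
    if not confirmed:
        raise PermissionError("removal was not confirmed by the user")
    if not posixpath.isabs(install_root):
        raise ValueError("install_root must be an absolute path")

    root = posixpath.normpath(install_root)
    if root.strip("/") == "":
        raise ValueError("refusing to operate directly under the filesystem root")

    target = posixpath.normpath(posixpath.join(root, dirname))
    # The target must be a direct child of root: no "..", no nesting, no absolute dirname.
    if posixpath.dirname(target) != root:
        raise ValueError(f"{dirname!r} is not a direct child of {root}")

    return "rm -rf -- " + shlex.quote(target)


print(build_guarded_remove("/home/user/work", "onescience", confirmed=True))
```

Using an absolute, validated path removes the dependency on whatever directory the remote shell happens to be in when the command runs.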

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill explicitly states it will automatically generate and submit SLURM jobs, which is a system-affecting action that can consume shared cluster resources, trigger code execution, and create operational cost or disruption if done without an explicit user confirmation step. The surrounding content emphasizes automation and fixed-template submission, but does not require a clear consent gate before submission, making accidental or unintended job dispatch plausible.
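
A consent gate of the kind this finding calls for can be sketched as follows. The names `plan_submission` and `confirm` are hypothetical; the sketch assumes submission happens through the standard `sbatch` command and deliberately returns the command instead of executing it.

```python
from typing import Callable, List, Optional


def plan_submission(script_path: str,
                    confirm: Callable[[str], bool]) -> Optional[List[str]]:
    """Return the sbatch command only if the user confirms; otherwise None."""
    command = ["sbatch", script_path]
    prompt = f"About to run: {' '.join(command)}. Proceed?"
    if not confirm(prompt):
        return None
    return command


# Example with a stand-in callback; a real agent would ask the user here.
print(plan_submission("job.sh", confirm=lambda prompt: False))
```

Keeping the confirmation callback separate from command construction makes it straightforward to show the user the exact command before anything reaches the scheduler.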

Ssd 3

Medium

Confidence: 97%
Finding: Reading ~/.ssh/config to discover remote hosts exposes sensitive local configuration data beyond what is necessary for a domain-selection installer. In this skill context, the behavior is more dangerous because it is mandated as the first step and paired with automatic remote execution, increasing both privacy and operational risk.

VirusTotal

67/67 vendors flagged this skill as clean.
