Model Migrate Flagos

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real vLLM model-migration skill, but it grants broad GPU and remote-server control without enough safeguards.

Install only in a dedicated development or GPU test environment. Do not run it on shared or production machines unless you first remove or gate the blanket GPU kill commands, restrict remote SSH targets, and avoid root or passwordless SSH where possible. Serve or benchmark only model directories you trust: the scripts pass `--trust-remote-code`, so loading a model can execute code shipped with that model.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (17)

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The README explicitly instructs users to manage a remote ground-truth server over SSH, Docker, and Conda. That extends the skill from local code migration into remote system access and process control, which increases attack surface and could lead to unintended changes on external machines if followed blindly.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The troubleshooting section recommends killing all GPU compute processes with `kill -9` based on `nvidia-smi` output, which can terminate unrelated workloads on a shared system. In a migration skill, this exceeds the minimum scope needed for model porting and creates a denial-of-service risk against other users' jobs or critical services.
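One way to narrow the blast radius is to filter the PID list before anything is signalled. The following is a minimal sketch, not the skill's own code: it assumes a Linux host (process ownership is read from `/proc`), assumes the PID list comes from `nvidia-smi --query-compute-apps=pid --format=csv,noheader`, and uses hypothetical helper names (`parse_compute_pids`, `select_terminable`). It only selects candidates and never sends a signal itself.

```python
import os
from typing import Callable, Iterable, List, Set


def parse_compute_pids(nvidia_smi_output: str) -> List[int]:
    """Parse PIDs from `nvidia-smi --query-compute-apps=pid --format=csv,noheader`."""
    pids = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line.isdigit():  # ignore blank lines and placeholders such as "[N/A]"
            pids.append(int(line))
    return pids


def proc_owner_uid(pid: int) -> int:
    """Return the UID that owns a process (Linux only: owner of /proc/<pid>)."""
    return os.stat(f"/proc/{pid}").st_uid


def select_terminable(
    pids: Iterable[int],
    my_uid: int,
    started_by_task: Set[int],
    owner_of: Callable[[int], int] = proc_owner_uid,
) -> List[int]:
    """Keep only PIDs that this task launched AND that the current user owns."""
    selected = []
    for pid in pids:
        if pid not in started_by_task:
            continue  # never touch processes the migration task did not start
        try:
            if owner_of(pid) == my_uid:
                selected.append(pid)
        except OSError:
            continue  # process already exited or cannot be inspected
    return selected
```

In this sketch the caller is expected to record the PIDs it launches (for example, the vLLM server it started) and pass them as `started_by_task`; anything else holding GPU memory is left alone.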

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The skill explicitly instructs the agent to identify GPU usage and forcibly kill occupying processes with `kill -9`, regardless of ownership or relation to the current task. This is dangerous because it enables destructive interference with unrelated workloads, causing denial of service, data loss, and disruption of other users' jobs in a shared environment.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The rule provides a recovery path that reinstalls vLLM into the environment outside the plugin directory, which changes shared dependencies and the runtime state beyond the declared project boundary. Even if framed as restoration, this grants the skill authority to mutate the installed environment and can break reproducibility or other projects relying on that installation.

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The document states that fixes should remain inside the plugin directory, but later permits modifying or reinstalling the installed vLLM environment when certain failures occur. This contradiction weakens safety boundaries and may cause the agent to take broader system-modifying actions than the user expects.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The procedure explicitly instructs killing all GPU compute-app PIDs via `kill -9` based only on available memory, which can terminate unrelated users' jobs or critical services on the host. For a model-migration skill, this is an unnecessary destructive host-management action that exceeds the stated purpose and can cause denial of service, data loss, or disruption of shared infrastructure.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The skill adds remote SSH-based server management, including starting services on another machine, as part of a migration workflow. This expands the skill's capabilities beyond local code migration into remote administration, increasing the chance of misuse, accidental access to the wrong host, or execution against sensitive infrastructure without adequate scoping or consent controls.
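A simple scoping control is an explicit allowlist of remote hosts that is checked before any SSH command is built. The sketch below is illustrative only: the function name `ssh_argv` and the host names are hypothetical, and the allowlist is assumed to be maintained by the operator rather than by the agent.

```python
from typing import Iterable, List


def ssh_argv(host: str, remote_command: str, allowed_hosts: Iterable[str]) -> List[str]:
    """Build an ssh argument list only for hosts the operator has approved."""
    if host.startswith("-"):
        # A leading dash would be parsed by ssh as an option, not a host name.
        raise ValueError(f"refusing suspicious host value: {host!r}")
    if host not in set(allowed_hosts):
        raise PermissionError(f"host {host!r} is not on the operator allowlist")
    # StrictHostKeyChecking=yes makes ssh refuse hosts with unknown or changed keys.
    return ["ssh", "-o", "StrictHostKeyChecking=yes", host, remote_command]
```

Returning an argument list rather than a shell string lets the caller pass it directly to `subprocess.run` without invoking a local shell.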

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The script passes `--trust-remote-code`, which allows model-supplied code to execute during loading. In a migration/benchmark workflow, the model name is provided externally and the script offers no provenance checks, sandboxing, or warning, so benchmarking a malicious or compromised model repository could lead to arbitrary code execution on the host.
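A provenance gate could make the flag opt-in per model rather than unconditional. The following is a rough sketch under stated assumptions: the trusted root directories are supplied by the operator, the helper names (`is_trusted`, `build_serve_args`) are hypothetical, and the command shape (`vllm serve <model>`) is a simplification of whatever the skill's script actually runs.

```python
from pathlib import Path
from typing import List, Sequence


def is_trusted(model_path: str, trusted_roots: Sequence[str]) -> bool:
    """True if model_path resolves to a trusted root or a location inside one."""
    resolved = Path(model_path).resolve()
    for root in trusted_roots:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False


def build_serve_args(model_path: str, trusted_roots: Sequence[str]) -> List[str]:
    """Add --trust-remote-code only for models under an operator-approved root."""
    args = ["vllm", "serve", model_path]
    if is_trusted(model_path, trusted_roots):
        args.append("--trust-remote-code")
    return args
```

Resolving the path first means that a value such as `/models/trusted/../evil` is judged by where it actually points, not by its prefix. Models outside the approved roots are still served, but without permission to run repository code, so a model that needs custom code fails to load instead of executing it silently.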

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The script starts vLLM with `--trust-remote-code`, which allows model repository code to be imported and executed during model loading. Because the model path is user-selected and this helper is meant for migration/verification workflows, a malicious or compromised model artifact can run arbitrary code on the host with the permissions of the operator.

Missing User Warnings

Medium

Confidence: 95%
Finding: The README provides concrete SSH setup and remote management steps without a clear safety warning that these actions grant persistent access and may start/stop services on another machine. In an agent-skill context, omission of that warning is risky because users may authorize remote actions without understanding their scope.

Missing User Warnings

Medium

Confidence: 97%
Finding: The documented GPU process termination command is destructive and is presented as an automatic fix path without an explicit warning, ownership check, or confirmation step. That makes accidental disruption much more likely, especially if an agent follows the procedure literally on a multi-tenant or production machine.
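A confirmation step combined with graceful escalation would address most of this. The sketch below is an assumption about how such a guard might look, not code from the skill: `stop_process` and `pid_alive` are hypothetical names, the `confirm` callback stands in for whatever prompt the agent shows the user, and SIGTERM is tried before SIGKILL so the target has a chance to clean up.

```python
import os
import signal
import time
from typing import Callable


def pid_alive(pid: int) -> bool:
    """Check whether a PID exists by sending signal 0 (no signal is delivered)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def stop_process(
    pid: int,
    confirm: Callable[[int], bool],
    grace_seconds: float = 10.0,
    kill: Callable[[int, int], None] = os.kill,
    alive: Callable[[int], bool] = pid_alive,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Ask first, send SIGTERM, and use SIGKILL only if the process lingers."""
    if not confirm(pid):
        return "skipped"
    kill(pid, signal.SIGTERM)
    for _ in range(int(grace_seconds / 0.5)):
        if not alive(pid):
            return "terminated"
        sleep(0.5)
    if not alive(pid):
        return "terminated"
    kill(pid, signal.SIGKILL)
    return "killed"
```

The `kill`, `alive`, and `sleep` parameters exist only so the logic can be exercised without touching real processes; in normal use the defaults apply.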

Missing User Warnings

High

Confidence: 99%
Finding: The instructions normalize forceful termination of GPU-using processes without warning about destructive consequences or requiring confirmation. In a shared compute environment, this can abruptly kill unrelated training or inference jobs, leading to service interruption, corrupted outputs, and loss of unsaved work.

Missing User Warnings

High

Confidence: 99%
Finding: The force-kill instruction lacks any explicit warning that it may terminate other workloads using the GPU, especially on shared machines. Because it uses a broad query of compute processes and unconditional `kill -9`, it can abruptly interrupt unrelated jobs and bypass safer operational safeguards.

Missing User Warnings

Medium

Confidence: 91%
Finding: The procedure references a concrete SSH private-key path and directs remote commands without providing adequate privacy and security warnings about credential use, host trust, and command execution scope. In a skill context, normalizing direct use of local private keys and remote shell access can lead to accidental disclosure, misuse of credentials, or unsafe execution on unintended systems.

Missing User Warnings

Medium

Confidence: 95%
Finding: The shell script silently enables remote model code execution without any user-facing disclosure or confirmation. That makes the behavior surprising and unsafe in an automation skill, increasing the chance an operator runs untrusted code under the mistaken assumption that this is only a local throughput benchmark.

Missing User Warnings

Medium

Confidence: 95%
Finding: The script constructs remote shell commands from configuration values and a model argument, then executes them over SSH inside nested bash/docker/conda contexts with insufficient quoting and escaping. An attacker who can modify the config file or supply a crafted model name could inject arbitrary shell syntax, leading to command execution on the remote host or in the container.
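Quoting each nesting layer separately is the usual remedy for this class of bug. The sketch below uses Python's `shlex.join` (Python 3.8+) to illustrate the idea; it is not the skill's script. The container name, Conda environment name, and the exact command shape are hypothetical, and the remote login shell is assumed to be POSIX-compatible.

```python
import shlex
from typing import List


def build_remote_command(container: str, conda_env: str, model: str) -> str:
    """Quote once per shell layer: the inner `bash -lc` string, then the outer command."""
    # Layer 2: the string that `bash -lc` will parse inside the container.
    inner = shlex.join(["conda", "run", "-n", conda_env, "vllm", "serve", model])
    # Layer 1: the string that the remote login shell will parse.
    return shlex.join(["docker", "exec", container, "bash", "-lc", inner])


def build_ssh_argv(host: str, container: str, conda_env: str, model: str) -> List[str]:
    """Return an argv list for subprocess.run, so no local shell is involved."""
    return ["ssh", host, build_remote_command(container, conda_env, model)]
```

With this construction, a model argument such as `demo'; rm -rf / #` reaches `vllm serve` as a single literal argument instead of being interpreted as shell syntax at either layer.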

Missing User Warnings

Medium

Confidence: 91%
Finding: The helper script enables remote code execution behavior implicitly, without any dedicated warning, confirmation, or guardrail beyond a generic startup comment. In a migration skill context, users may run this as a routine verification step and unknowingly execute untrusted model-side code, increasing the chance of accidental compromise.

VirusTotal

66/66 vendors flagged this skill as clean.
