Safe Long-Run Mode GPT5.4

Security checks across malware telemetry and agentic risk

Overview

This is a Markdown-only workflow skill for managing long GPT-5.4-centered tasks, with no hidden code or automatic access to systems.

Install this only if you want a GPT-5.4-centered long-run workflow. Before using it on sensitive work, decide where checkpoint files may be written, keep credentials and private data out of saved notes, limit what subagents receive, and require explicit approval before any account-impacting, public, or external-service write.
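Two of the precautions above, keeping credentials out of saved notes and requiring approval before sensitive writes, can be sketched in a few lines. This is a minimal illustration and not part of the skill, which ships no code. The names `SECRET_PATTERNS`, `APPROVAL_REQUIRED`, `redact`, and `may_proceed` are hypothetical. The regex is an assumption that catches only simple `key=value` credentials.

```python
import re

# Hypothetical patterns (assumption): a real setup would need a broader, tested set.
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*\S+"),
]

# Hypothetical labels (assumption) for the action kinds the guidance says need approval.
APPROVAL_REQUIRED = {"account_impacting", "public", "external_service_write"}


def redact(note: str) -> str:
    """Replace the value of anything that looks like a `key=value` credential."""
    for pattern in SECRET_PATTERNS:
        note = pattern.sub(lambda m: f"{m.group(1)}=[REDACTED]", note)
    return note


def may_proceed(action_kind: str, approved: bool) -> bool:
    """Allow an action unless its kind requires approval that was not given."""
    return action_kind not in APPROVAL_REQUIRED or approved
```

Calling `redact` on a note before it is written to a checkpoint file, and checking `may_proceed` before any write, keeps both decisions explicit rather than leaving them to the workflow's defaults.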

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (3)

Vague Triggers

Medium

Confidence: 88%
Finding: The description is broad enough to match many ordinary coding, research, documentation, Azure, and multi-agent tasks, which can cause the skill to be invoked outside a narrowly defined operating context. Overbroad activation increases the chance that higher-risk or policy-sensitive tasks are routed into a generic long-run workflow without explicit suitability checks or user confirmation.

Vague Triggers

Medium

Confidence: 90%
Finding: Saying to use the skill whenever GPT-5.4 is the default operating model makes activation depend on environment configuration rather than task risk or appropriateness. That ambiguity can lead to unsafe over-application, especially for tasks needing stronger review, narrower authorization, or model-specific safeguards.

Natural-Language Policy Violations

Medium

Confidence: 86%
Finding: The skill sets a blanket default to a specific model for broad categories of work and only allows deviation in limited cases, which can override normal task-specific judgment. In practice, this may push sensitive or complex tasks into a cheaper workflow that lacks the review depth, policy checks, or quality threshold those tasks require.

VirusTotal

63 of 63 vendors reported this skill as clean.
