Codex Skill

Security checks across malware telemetry and agentic risk

Overview

This skill is a coherent Codex automation guide, but it normalizes unattended no-sandbox agents that can change code, push branches, open PRs, post comments, and run for long periods.

Install only if you deliberately want a hands-off Codex agent workflow. Use it in a disposable container or isolated worktree, avoid approval/sandbox bypass on sensitive machines, review diffs before commit or push, and require explicit confirmation before PR creation, comments, notifications, cleanup, or any action using repository credentials.
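As a rough illustration of the "explicit confirmation" recommendation above, the sketch below gates high-impact commands behind a yes/no prompt. The prefix list, `needs_confirmation`, and `run_gated` are hypothetical names chosen for this example; they are not part of the skill or of Codex, and the list would need adjusting for a real workflow.

```python
import shlex

# Hypothetical list of command prefixes treated as high-impact.
# Assumption: the agent's actions are issued as shell-style command strings.
GATED_PREFIXES = [
    ("git", "commit"),
    ("git", "push"),
    ("gh", "pr", "create"),
    ("gh", "pr", "comment"),
]


def needs_confirmation(command: str) -> bool:
    """Return True if the command starts with one of the gated prefixes."""
    tokens = tuple(shlex.split(command))
    return any(tokens[: len(prefix)] == prefix for prefix in GATED_PREFIXES)


def run_gated(command: str, confirm=input) -> bool:
    """Return True if the command is cleared to run, False if the operator declines.

    This only makes the decision; actually executing the command is left to the caller.
    """
    if not needs_confirmation(command):
        return True
    answer = confirm(f"Run high-impact command {command!r}? [y/N] ")
    return answer.strip().lower() == "y"
```

A wrapper would call `run_gated("git push origin my-branch")` before executing the command and skip it when the result is `False`. Prefix matching is deliberately simple and will miss aliases or commands hidden inside scripts, so it complements, rather than replaces, running in an isolated environment.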

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (6)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The manifest frames the skill as a simple Codex implementation helper, but the body expands into an autonomous workflow that performs branch creation, dependency installation, git push, PR creation, review orchestration, notifications, retries, and cleanup. This scope mismatch is dangerous because it can cause the skill to be invoked for ordinary coding requests while actually enabling broad unattended repository and system actions.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The skill introduces cross-model review and external tool invocation beyond the stated purpose of using Codex to implement tasks. That broadens the trust boundary by sending code diffs to other tools and models, which may violate least privilege, increase data exposure, and trigger unintended external actions not implied by the skill description.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The documentation normalizes `--dangerously-bypass-approvals-and-sandbox` as safe or recommended in containers/VMs, despite the flag explicitly disabling key safeguards. This is dangerous because it encourages operators to run untrusted or high-impact automation with unrestricted system and network access under a false sense of safety.
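One way an operator might act on this finding is to check an invocation for the bypass flag before launching it. The sketch below is a minimal, hypothetical pre-flight check: `check_invocation` and the `in_disposable_env` parameter are assumptions made for illustration, and the operator is assumed to determine by their own means whether the environment is disposable.

```python
# The flag name is taken from the finding above; everything else is illustrative.
BYPASS_FLAG = "--dangerously-bypass-approvals-and-sandbox"


def check_invocation(argv: list[str], in_disposable_env: bool) -> str:
    """Classify an agent invocation as 'ok', 'warn', or 'block'.

    'warn' rather than 'ok' is returned inside a disposable environment because
    the flag still disables safeguards there; isolation only limits the damage.
    """
    if BYPASS_FLAG not in argv:
        return "ok"
    if in_disposable_env:
        return "warn"
    return "block"
```

For example, `check_invocation(["codex", BYPASS_FLAG, "fix the tests"], in_disposable_env=False)` yields `"block"`, while the same arguments with `in_disposable_env=True` yield `"warn"`.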

Vague Triggers

Medium

Confidence: 91%
Finding: The activation condition is broad enough to match many normal coding requests involving Codex, GPT-5, or implementation work. Because the skill contains powerful autonomous behaviors, overbroad triggering increases the chance that it is selected in situations where the user did not intend unattended edits, git operations, or external actions.

Missing User Warnings

High

Confidence: 97%
Finding: The manifest explicitly promotes hands-off, non-interactive automation without approval prompts for code-changing tasks, but it does not prominently warn about the risks of unattended edits and repository actions. In context, this can lead to silent file changes, branch operations, and downstream publishing actions without a clear informed-consent boundary.

Missing User Warnings

High

Confidence: 99%
Finding: These examples recommend running Codex in interactive automation mode with `--dangerously-bypass-approvals-and-sandbox`, then instruct the agent to commit, push, create a PR, and send notifications. This combines unrestricted execution, repository mutation, and network-capable actions, making accidental or malicious prompt effects far more damaging than a normal coding helper.

VirusTotal

64/64 vendors flagged this skill as clean.
