Python Code Tester

Security checks across malware telemetry and agentic risk

Overview

This is a coherent Python testing skill, but it can install packages, run generated tests, and direct changes to real project files without clear approval gates.

Install only in a disposable or version-controlled workspace. Review generated tests before running them, avoid running package installation on your main Python environment, and require a manual diff review before applying any suggested source-code changes.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
Findings (15)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
}
    
    try:
        proc = subprocess.run(
            ["python", "-m", "pytest", str(test_file), "-v", "--tb=short"],
            capture_output=True,
            text=True,
Confidence
96% confidence
Finding
proc = subprocess.run( ["python", "-m", "pytest", str(test_file), "-v", "--tb=short"], capture_output=True, text=True, timeout=60 )

subprocess module call

Medium
Category
Dangerous Code Execution
Content
req_file = SCRIPTS_DIR / "requirements.txt"
    if req_file.exists():
        try:
            subprocess.run(["pip", "install", "-r", str(req_file)], check=True)
            return True
        except subprocess.CalledProcessError:
            return False
Confidence
98% confidence
Finding
subprocess.run(["pip", "install", "-r", str(req_file)], check=True)
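
One possible way to narrow this finding is to gate the install behind an explicit approval and to refuse to run outside a virtual environment. The sketch below is an illustration, not the skill's code: `in_virtualenv`, `build_install_command`, `install_requirements`, and the `confirm` callback are hypothetical names. It also calls pip through `sys.executable` so the install targets the running interpreter rather than whichever `pip` is first on `PATH`.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def build_install_command(req_file: Path) -> list[str]:
    # Use the running interpreter's pip instead of a bare "pip" from PATH.
    return [sys.executable, "-m", "pip", "install", "-r", str(req_file)]


def install_requirements(req_file: Path, confirm: Callable[[str], bool]) -> bool:
    """Install only inside a virtual environment and only after approval."""
    if not in_virtualenv():
        return False  # never touch the main Python environment
    command = build_install_command(req_file)
    if not confirm("Run: " + " ".join(command) + " ?"):
        return False  # the user declined; nothing was installed
    try:
        subprocess.run(command, check=True)
    except subprocess.CalledProcessError:
        return False
    return True
```

A caller could pass `confirm=lambda prompt: input(prompt + " [y/N] ").strip().lower() == "y"` for an interactive approval step.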

Tainted flow: 'test_file' from pathlib.Path.read_text (line 554, file read) → subprocess.run (code execution)

Medium
Category
Data Flow
Content
}
    
    try:
        proc = subprocess.run(
            ["python", "-m", "pytest", str(test_file), "-v", "--tb=short"],
            capture_output=True,
            text=True,
Confidence
94% confidence
Finding
proc = subprocess.run( ["python", "-m", "pytest", str(test_file), "-v", "--tb=short"], capture_output=True, text=True, timeout=60 )
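
A tainted path reaching `subprocess.run` can be constrained by checking that it resolves inside an expected directory before pytest is invoked. The following is a minimal sketch under that assumption; `is_within`, `run_tests`, and `allowed_root` are hypothetical names that do not appear in the scanned skill.

```python
import subprocess
import sys
from pathlib import Path


def is_within(path: Path, root: Path) -> bool:
    """Return True if `path` resolves to a location inside `root`."""
    try:
        # resolve() collapses ".." segments and follows symlinks first.
        path.resolve().relative_to(root.resolve())
    except ValueError:
        return False
    return True


def run_tests(test_file: Path, allowed_root: Path) -> subprocess.CompletedProcess:
    """Run pytest on `test_file` only if it lives under `allowed_root`."""
    if not is_within(test_file, allowed_root):
        raise ValueError(f"refusing to run tests outside {allowed_root}: {test_file}")
    return subprocess.run(
        [sys.executable, "-m", "pytest", str(test_file), "-v", "--tb=short"],
        capture_output=True,
        text=True,
        timeout=60,
    )
```

This check limits where the executed file may live; it does not make the contents of a generated test file trustworthy, so reviewing the tests before running them still applies.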

Lp3

Medium
Category
MCP Least Privilege
Confidence
94% confidence
Finding
The skill directs the agent to read project code, write files under multiple directories, access the network to download datasets, and run tests, yet it declares no permissions or user-facing capability boundaries. This creates a dangerous mismatch: a caller may invoke the skill without understanding it can modify repository contents, execute shell-like actions, or exfiltrate data through network access.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
The skill includes an automatic package installation capability that is broader than a typical code-testing function requires. In practice this permits environment mutation and potential arbitrary code execution during dependency installation, which materially raises the skill's risk in this context.

Vague Triggers

Medium
Confidence
87% confidence
Finding
The skill is activated by broadly phrased user testing requests without clear scope limits, exclusions, or approval gates. Because the workflow includes code search, artifact creation, execution, and code modification, overly broad triggering increases the chance the skill performs risky actions in unintended contexts.

Missing User Warnings

High
Confidence
97% confidence
Finding
The skill explicitly instructs the agent to save generated artifacts, create release directories, and update actual project files, but provides no explicit warning or consent checkpoint before modifying repository data. In a testing context this is especially risky because a benign request can escalate into persistent changes to source code and test assets, potentially corrupting the repo or introducing unauthorized changes.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The skill instructs downloading external datasets from the network without a user-facing warning, approval step, or restrictions on allowed sources. This can expose sensitive metadata, introduce untrusted content into the environment, and create supply-chain or privacy risks during routine testing flows.
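
A source restriction of the kind this finding asks for could look like the sketch below, which accepts only HTTPS URLs whose host is on an explicit allowlist. The host `datasets.example.org` and the names `ALLOWED_HOSTS` and `is_allowed_source` are placeholders for illustration; the skill itself declares no approved sources.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the sources a user has approved.
ALLOWED_HOSTS = frozenset({"datasets.example.org"})


def is_allowed_source(url: str, allowed_hosts: frozenset[str] = ALLOWED_HOSTS) -> bool:
    """Accept only https URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    # Exact hostname match, so "datasets.example.org.evil.test" is rejected.
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

A download helper would call `is_allowed_source(url)` first and ask the user before fetching anything that fails the check.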

Missing User Warnings

Medium
Confidence
97% confidence
Finding
Automatically installing dependencies without explicit user confirmation is unsafe because it silently changes the execution environment and may run untrusted package code. This is especially risky for an agent skill, where users may not expect package management side effects from a testing action.

Missing User Warnings

High
Confidence
88% confidence
Finding
The function can overwrite project source files and create backup files without explicit confirmation. Although it is not invoked in the current main flow, the capability exists and could cause destructive or unintended code changes if wired in later or called by other code, especially because the replacement boundaries come from loosely parsed code metadata.
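
A manual diff review before any overwrite can be sketched as follows. This is an assumption about how such a gate might be built, not the skill's implementation: `apply_with_review` and its `confirm` callback are hypothetical, and the diff is produced with the standard library's `difflib.unified_diff`.

```python
import difflib
from pathlib import Path
from typing import Callable


def apply_with_review(path: Path, new_text: str, confirm: Callable[[str], bool]) -> bool:
    """Show a unified diff and write `new_text` only if the reviewer approves."""
    old_text = path.read_text(encoding="utf-8")
    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=f"{path} (current)",
            tofile=f"{path} (proposed)",
        )
    )
    if not diff:
        return False  # nothing to change
    if not confirm(diff):
        return False  # reviewer declined; the file is left untouched
    path.write_text(new_text, encoding="utf-8")
    return True
```

Passing the diff text to `confirm` keeps the decision with the user; in a version-controlled workspace the change can also be reverted afterwards.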

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Code functional testing Skill dependencies

# Test framework
pytest>=7.0.0

# Data processing
numpy>=1.21.0
Confidence
92% confidence
Finding
pytest>=7.0.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
pytest>=7.0.0

# Data processing
numpy>=1.21.0
pandas>=1.3.0
Confidence
95% confidence
Finding
numpy>=1.21.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Data processing
numpy>=1.21.0
pandas>=1.3.0
Confidence
93% confidence
Finding
pandas>=1.3.0
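
The three unpinned-dependency findings share one pattern: a `>=` lower bound where an exact `==` pin would make installs reproducible. A rough check for that pattern is sketched below; `find_unpinned` is a hypothetical helper, and the regular expression is a simplification that does not handle environment markers, direct URLs, or hash options.

```python
import re

# Matches "name==version" or "name===version", optionally with extras,
# and rejects wildcard versions such as "2.*".
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*={2,3}\s*[^\s=*]+$")


def find_unpinned(lines):
    """Return requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Applied to the lines quoted in these findings, the check reports `pytest>=7.0.0`, `numpy>=1.21.0`, and `pandas>=1.3.0`; a line such as `pandas==1.3.0` (an illustrative version, not a recommendation) would pass.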

Known Vulnerable Dependency: pytest (1 advisory): CVE-2025-71176 (pytest has vulnerable tmpdir handling)

Low
Category
Supply Chain
Confidence
67% confidence
Finding
pytest

Known Vulnerable Dependency: numpy (10 advisories): CVE-2014-1859 (Numpy arbitrary file write via symlink attack); CVE-2021-41495 (NumPy NULL Pointer Dereference); CVE-2021-33430 (NumPy Buffer Overflow (Disputed)) +7 more

Critical
Category
Supply Chain
Confidence
86% confidence
Finding
numpy

VirusTotal

All 66 of 66 vendors reported this skill as clean.