Python Code Tester

Security checks across malware telemetry and agentic risk

Overview

This is a coherent Python testing skill, but it can install packages, run generated tests, and direct changes to real project files without clear approval gates.

Install only in a disposable or version-controlled workspace. Review generated tests before running them, avoid installing packages into your main Python environment, and require a manual diff review before applying any suggested source-code changes.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import

Findings (15)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: } try: proc = subprocess.run( ["python", "-m", "pytest", str(test_file), "-v", "--tb=short"], capture_output=True, text=True,
Confidence: 96%
Finding: proc = subprocess.run( ["python", "-m", "pytest", str(test_file), "-v", "--tb=short"], capture_output=True, text=True, timeout=60 )

subprocess module call

Medium

Category: Dangerous Code Execution
Content: req_file = SCRIPTS_DIR / "requirements.txt" if req_file.exists(): try: subprocess.run(["pip", "install", "-r", str(req_file)], check=True) return True except subprocess.CalledProcessError: return False
Confidence: 98%
Finding: subprocess.run(["pip", "install", "-r", str(req_file)], check=True)

Tainted flow: 'test_file' from pathlib.Path.read_text (line 554, file read) → subprocess.run (code execution)

Medium

Category: Data Flow
Content: } try: proc = subprocess.run( ["python", "-m", "pytest", str(test_file), "-v", "--tb=short"], capture_output=True, text=True,
Confidence: 94%
Finding: proc = subprocess.run( ["python", "-m", "pytest", str(test_file), "-v", "--tb=short"], capture_output=True, text=True, timeout=60 )
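
One way to narrow this flow is to confine the test path to the workspace before it reaches subprocess.run. The following is a minimal sketch, not the skill's actual code: the helper names resolve_test_file, build_pytest_command, and run_tests are hypothetical, and it assumes Python 3.9+ for Path.is_relative_to. It also swaps the bare "python" for sys.executable so the tests run under the same interpreter as the caller.

```python
import subprocess
import sys
from pathlib import Path


def resolve_test_file(test_file: str, workspace: Path) -> Path:
    """Return the resolved test path, or raise if it escapes the workspace."""
    root = workspace.resolve()
    candidate = (root / test_file).resolve()
    # Rejects "../outside.py" and absolute paths that point elsewhere.
    if not candidate.is_relative_to(root):
        raise ValueError(f"{test_file!r} is outside the workspace")
    if candidate.suffix != ".py":
        raise ValueError(f"{test_file!r} is not a Python file")
    return candidate


def build_pytest_command(test_path: Path) -> list[str]:
    """Build the argument list; no shell is involved."""
    return [sys.executable, "-m", "pytest", str(test_path), "-v", "--tb=short"]


def run_tests(test_file: str, workspace: Path) -> subprocess.CompletedProcess:
    """Validate the path, then run pytest with the same 60 s timeout as the finding."""
    path = resolve_test_file(test_file, workspace)
    return subprocess.run(
        build_pytest_command(path),
        capture_output=True,
        text=True,
        timeout=60,
        cwd=workspace,
    )
```

Path confinement does not make the generated tests themselves trustworthy; it only limits which file is handed to pytest.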

Lp3

Medium

Category: MCP Least Privilege
Confidence: 94%
Finding: The skill directs the agent to read project code, write files under multiple directories, access the network to download datasets, and run tests, yet it declares no permissions or user-facing capability boundaries. This creates a dangerous mismatch: a caller may invoke the skill without understanding it can modify repository contents, execute shell-like actions, or exfiltrate data through network access.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The skill includes an automatic package installation capability that is broader and more dangerous than a typical code-testing function requires. In practice this permits environment mutation and potential arbitrary code execution from dependency installation, making the skill materially more dangerous in this context.

Vague Triggers

Medium

Confidence: 87%
Finding: The skill is activated by broadly phrased user testing requests without clear scope limits, exclusions, or approval gates. Because the workflow includes code search, artifact creation, execution, and code modification, overly broad triggering increases the chance the skill performs risky actions in unintended contexts.

Missing User Warnings

High

Confidence: 97%
Finding: The skill explicitly instructs the agent to save generated artifacts, create release directories, and update actual project files, but provides no explicit warning or consent checkpoint before modifying repository data. In a testing context this is especially risky because a benign request can escalate into persistent changes to source code and test assets, potentially corrupting the repo or introducing unauthorized changes.

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill instructs downloading external datasets from the network without a user-facing warning, approval step, or restrictions on allowed sources. This can expose sensitive metadata, introduce untrusted content into the environment, and create supply-chain or privacy risks during routine testing flows.
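
A restriction on allowed sources could look roughly like the sketch below. The ALLOWED_HOSTS set and the is_allowed_source helper are hypothetical; the scanned skill defines no such list, and example.org is a placeholder host.

```python
from urllib.parse import urlparse

# Hypothetical allowlist for illustration only.
ALLOWED_HOSTS = frozenset({"example.org"})


def is_allowed_source(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Accept only HTTPS URLs whose exact hostname is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in allowed_hosts
```

Matching the exact hostname, rather than a substring, keeps look-alike hosts such as example.org.evil.test off the list.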

Missing User Warnings

Medium

Confidence: 97%
Finding: Automatically installing dependencies without explicit user confirmation is unsafe because it silently changes the execution environment and may run untrusted package code. This is especially risky for an agent skill, where users may not expect package management side effects from a testing action.
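
A confirmation checkpoint around the install step might be sketched as follows. This is an assumption about how such a gate could be wired in, not the skill's code: install_requirements and its confirm callback are hypothetical names. The callback receives the exact command and the file contents, so the user sees what would be installed before anything runs.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable


def install_requirements(req_file: Path, confirm: Callable[[str], bool]) -> bool:
    """Install from req_file only after confirm() returns True."""
    if not req_file.exists():
        return False
    # sys.executable targets the active interpreter (ideally a virtualenv).
    command = [sys.executable, "-m", "pip", "install", "-r", str(req_file)]
    prompt = (
        "About to run: " + " ".join(command) + "\n"
        + req_file.read_text(encoding="utf-8")
    )
    if not confirm(prompt):
        return False  # user declined; environment is left untouched
    try:
        subprocess.run(command, check=True)
    except subprocess.CalledProcessError:
        return False
    return True


def ask_on_terminal(prompt: str) -> bool:
    """Simple interactive confirm callback; anything but 'y' declines."""
    print(prompt)
    return input("Proceed? [y/N] ").strip().lower() == "y"
```

Calling install_requirements(path, ask_on_terminal) keeps the default answer at "no", so an unattended run installs nothing.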

Missing User Warnings

High

Confidence: 88%
Finding: The function can overwrite project source files and create backup files without explicit confirmation. Although it is not invoked in the current main flow, the capability exists and could cause destructive or unintended code changes if wired in later or called by other code, especially because the replacement boundaries come from loosely parsed code metadata.
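
A manual diff review before any overwrite could be approximated with the standard-library difflib module. The sketch below is illustrative only; preview_diff and apply_change are hypothetical names, and the confirm callback stands in for whatever approval step the caller provides.

```python
import difflib
from pathlib import Path
from typing import Callable


def preview_diff(original: str, proposed: str, name: str) -> str:
    """Return a unified diff of the proposed change, or '' if nothing differs."""
    lines = difflib.unified_diff(
        original.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(lines)


def apply_change(path: Path, proposed: str, confirm: Callable[[str], bool]) -> bool:
    """Write the proposed text only if it differs and the diff is approved."""
    original = path.read_text(encoding="utf-8")
    diff = preview_diff(original, proposed, path.name)
    if not diff:
        return False  # nothing to change
    if not confirm(diff):
        return False  # reviewer rejected the diff; file is untouched
    path.write_text(proposed, encoding="utf-8")
    return True
```

In a version-controlled workspace, the same review can also be done with the repository's own diff tooling after the fact.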

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Code functional testing Skill dependencies # Test framework pytest>=7.0.0 # Data processing numpy>=1.21.0
Confidence: 92%
Finding: pytest>=7.0.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: pytest>=7.0.0 # Data processing numpy>=1.21.0 pandas>=1.3.0
Confidence: 95%
Finding: numpy>=1.21.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Data processing numpy>=1.21.0 pandas>=1.3.0
Confidence: 93%
Finding: pandas>=1.3.0
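
The three findings above flag lower-bound specifiers (>=) rather than exact pins (==). A small check for this could be sketched as below. It is deliberately simple and is not a full requirements parser: find_unpinned and the PINNED pattern are hypothetical, and lines with environment markers or hashes would need a real parser.

```python
import re

# Matches "name==version" with optional extras; wildcard versions such as
# "==1.*" are treated as unpinned. Not a complete PEP 508 parser.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s*]+$")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

For example, find_unpinned("pytest>=7.0.0\nnumpy>=1.21.0\npandas>=1.3.0\n") returns all three lines, matching the findings reported here.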

Known Vulnerable Dependency: pytest (1 advisory): CVE-2025-71176 (pytest has vulnerable tmpdir handling)

Low

Category: Supply Chain
Confidence: 67%
Finding: pytest

Known Vulnerable Dependency: numpy (10 advisories): CVE-2014-1859 (NumPy arbitrary file write via symlink attack); CVE-2021-41495 (NumPy NULL Pointer Dereference); CVE-2021-33430 (NumPy Buffer Overflow (Disputed)); and 7 more

Critical

Category: Supply Chain
Confidence: 86%
Finding: numpy

VirusTotal

66/66 vendors flagged this skill as clean.
