deterministic-calc-skill

Security checks across malware telemetry and agentic risk

Overview

This calculator skill includes useful math helpers, but it also gives the agent broad local code, shell, and file access that needs manual review before installation.

Treat this as a high-authority local execution helper, not just a calculator. Install it only if you intend to let the agent run local Python and shell commands and perform arbitrary file reads and writes. Prefer safe_eval() for untrusted math, and isolate or disable calculate(), run_python(), run_shell(), and write_file() unless you have a real sandbox and an explicit approval workflow.
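For orientation, a restricted evaluator of the kind recommended for untrusted math could look like the sketch below. It is an illustration only: `safe_eval_sketch` and its operator allowlist are hypothetical and are not taken from the skill's own safe_eval(), which may behave differently. Exponentiation is left out on purpose so that a short input cannot request a very large computation.

```python
import ast
import operator
from typing import Union

# Hypothetical allowlist; the skill's real safe_eval() may accept more or less.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval_sketch(expression: str) -> Union[int, float]:
    """Evaluate +, -, *, /, % over numeric literals and reject everything else."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Only plain int/float literals; bool, str, names, and calls are rejected.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    # ast.parse only builds a syntax tree; nothing from the input is executed.
    return _eval(ast.parse(expression, mode="eval"))
```

Because the walker only handles node types it explicitly lists, an input such as `__import__('os').system('id')` is rejected with a ValueError instead of being executed.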

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (15)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: dict: {"success": bool, "stdout": str, "stderr": str, "exit_code": int} """ try: result = subprocess.run( ["python3", "-c", code], capture_output=True, text=True,
Confidence: 98%
Finding: result = subprocess.run( ["python3", "-c", code], capture_output=True, text=True, timeout=timeout )

subprocess module call

Medium

Category: Dangerous Code Execution
Content: dict: {"success": bool, "stdout": str, "stderr": str, "exit_code": int} """ try: result = subprocess.run( command, shell=shell, capture_output=True,
Confidence: 100%
Finding: result = subprocess.run( command, shell=shell, capture_output=True, text=True, timeout=timeout )
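One common way to narrow a call like the one flagged above is to avoid `shell=True` entirely and check the executable against an allowlist before running anything. The sketch below is an assumption-laden illustration, not the skill's code: `run_allowlisted` and `ALLOWED_EXECUTABLES` are hypothetical names, and the return shape simply mirrors the dict shown in the finding.

```python
import shlex
import subprocess

# Hypothetical allowlist; a real deployment would choose its own entries.
ALLOWED_EXECUTABLES = {"echo", "date"}


def _refused(reason: str) -> dict:
    return {"success": False, "stdout": "", "stderr": reason, "exit_code": -1}


def run_allowlisted(command: str, timeout: float = 5.0) -> dict:
    """Run a command without a shell, and only if its executable is allowlisted."""
    try:
        argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    except ValueError as exc:
        return _refused(f"could not parse command: {exc}")
    if not argv or argv[0] not in ALLOWED_EXECUTABLES:
        return _refused("command not allowed")
    try:
        # Passing a list with the default shell=False means ';', '|', and '&&'
        # reach the program as plain arguments instead of being interpreted.
        result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError) as exc:
        return _refused(f"execution failed: {exc}")
    return {
        "success": result.returncode == 0,
        "stdout": result.stdout,
        "stderr": result.stderr,
        "exit_code": result.returncode,
    }
```

An allowlist keyed on the executable name is a coarse control: it does not restrict arguments, so it complements a sandbox rather than replacing one.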

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The README markets the package as a deterministic calculation skill, but it documents capabilities for arbitrary Python execution, shell command execution, and file access. This mismatch can cause integrators or agents to trust and enable a much broader tool than intended, creating command-execution and data-access risk if untrusted input reaches those functions.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: Arbitrary shell execution is highly dangerous in a skill framed as a calculator because consumers may not expect operating-system command execution. If attacker-controlled or model-generated input is passed to run_shell, it can execute destructive commands, exfiltrate data, or alter the host environment.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: Arbitrary Python execution exceeds the stated deterministic-calculation purpose and enables full code execution on the host. In an agent context, model-produced or user-supplied code could read secrets, modify files, make network requests, or chain into broader compromise.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The skill is presented as a deterministic calculator, but it exposes a generic `run_shell(command)` capability that can execute arbitrary system commands. This greatly expands the attack surface beyond the stated purpose and could enable filesystem access, data exfiltration, environment discovery, or destructive command execution if an agent passes through untrusted input.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: Exposing `run_python(code)` allows arbitrary code execution, which is not necessary for a deterministic calculator and undermines the safety boundary implied by the skill's purpose. An attacker or prompt-injected workflow could use this to read local files, access secrets, make network calls, or execute arbitrary logic under the agent's privileges.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The module presents itself as a deterministic calculator, but it also exposes arbitrary Python execution, shell execution, and file operations. This mismatch can mislead operators and downstream agents into granting trust or invoking capabilities far beyond the stated purpose, increasing the chance of unsafe use.

Intent-Code Divergence

High

Confidence: 100%
Finding: `calculate` appears to be a safe math helper, but when `safe_eval` rejects input it falls back to `run_python(f"print({expression})")`. That converts malformed or malicious expressions into executable Python, so an input that looks like a calculation request can trigger arbitrary code execution.
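The safer alternative to the fallback described in this finding is to fail closed: when the strict evaluator rejects an input, return an error and stop. The sketch below illustrates that shape under stated assumptions. `calculate_strict` is a hypothetical name, the evaluator is passed in as a parameter, and the built-in `float` stands in for a real restricted evaluator purely to keep the example self-contained.

```python
from typing import Callable, Union

Number = Union[int, float]


def calculate_strict(expression: str, evaluator: Callable[[str], Number]) -> dict:
    """Evaluate with a strict evaluator and never fall back to code execution."""
    try:
        return {"success": True, "result": evaluator(expression)}
    except (ValueError, SyntaxError, ZeroDivisionError, OverflowError) as exc:
        # Fail closed: a rejected input is reported, not handed to an interpreter.
        return {"success": False, "error": f"rejected: {exc}"}


# Stand-in evaluator for the demo: float() accepts only a single numeric literal.
print(calculate_strict("1.5", float))
print(calculate_strict("__import__('os').system('id')", float))
```

The second call returns a failure dict because `float` raises ValueError on that string, and no branch in the function turns rejected text into executable code.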

Context-Inappropriate Capability

Critical

Confidence: 100%
Finding: Arbitrary shell execution is unrelated to deterministic calculation and materially expands the attack surface. In a skill ecosystem, such hidden general-purpose execution capability can be abused for remote command execution, persistence, credential theft, and destructive system changes.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: Arbitrary Python execution is broader than the claimed calculator purpose and effectively gives callers a local code runner. That enables reading secrets, spawning processes, modifying the environment, and bypassing the protections of the safe evaluator.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The read/write helpers provide filesystem access that is not necessary for a calculator skill. While file access alone is not always critical, in an agent setting it can expose secrets, overwrite application files, or facilitate follow-on attacks when combined with other capabilities.
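If file helpers are kept at all, one way to limit their reach is to confine every path to a single working directory. The following is a minimal sketch of that check. `resolve_inside` and the `workspace` directory name are hypothetical and do not come from the skill.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Return the resolved path only if it stays inside base_dir."""
    base = Path(base_dir).resolve()
    # resolve() collapses '..' segments and follows existing symlinks, so the
    # comparison below is made on the real target location.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise PermissionError(f"path escapes {base}: {user_path!r}")
    return candidate


# Hypothetical usage: 'workspace' is an assumed directory name.
print(resolve_inside("workspace", "notes/result.txt").name)
```

A read or write helper would call this first and operate only on the returned path. An absolute `user_path` replaces the base when joined, so it fails the containment check unless it already points inside `base_dir`.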

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill exposes shell execution without any built-in warning, confirmation, or policy gate. That makes accidental or prompt-induced dangerous command execution much more likely, particularly when used by an autonomous or semi-autonomous agent.
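A policy gate of the kind this finding calls for can be as small as a wrapper that consults an approval callback before the tool runs. The sketch below shows one possible shape. `require_approval` and `fake_run_shell` are hypothetical names, and the lambda policy is a placeholder for whatever interactive prompt or rule engine an integrator would actually use.

```python
import functools
from typing import Callable


def require_approval(approve: Callable[[str], bool]):
    """Decorator: call the wrapped tool only if approve(command) returns True."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(command: str, *args, **kwargs):
            if not approve(command):
                raise PermissionError(f"not approved: {command!r}")
            return func(command, *args, **kwargs)

        return wrapper

    return decorator


# Placeholder policy and tool for illustration; nothing is executed here.
@require_approval(lambda command: command.startswith("echo "))
def fake_run_shell(command: str) -> str:
    return f"would run: {command}"


print(fake_run_shell("echo hello"))
```

Keeping the policy outside the tool function means the same gate can wrap Python execution or file writes without changing those functions.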

Missing User Warnings

Medium

Confidence: 86%
Finding: Python code execution is made available without any confirmation or clear indication of its risk. In agent workflows, this increases the likelihood that untrusted input is executed directly, causing code execution and host compromise.

Missing User Warnings

Medium

Confidence: 91%
Finding: The script will automatically initialize a Git repository, stage all files, and create a commit if .git is absent, which modifies repository state without an explicit confirmation prompt. In a skill/publish helper context this can unexpectedly capture sensitive or irrelevant files into version control and create side effects the user did not clearly consent to.
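The side effect described here can be made opt-in by separating the decision from the execution. The sketch below only plans the Git commands and refuses to do so without an explicit flag. It is a hypothetical illustration: `git_bootstrap_plan`, the `confirmed` parameter, and the commit message are assumptions, not the script's actual behavior.

```python
from pathlib import Path
from typing import List


def git_bootstrap_plan(repo_dir: str, confirmed: bool = False) -> List[List[str]]:
    """Return the git commands to run, requiring explicit consent to initialize."""
    if (Path(repo_dir) / ".git").exists():
        return []  # already a repository, nothing to initialize
    if not confirmed:
        raise PermissionError(
            "no .git directory found; pass confirmed=True to initialize and commit"
        )
    # The plan is returned rather than executed, so a caller can display it first.
    return [
        ["git", "init"],
        ["git", "add", "--all"],
        ["git", "commit", "-m", "Initial commit"],
    ]
```

Showing the returned plan to the user before running it also gives them a chance to notice that `git add --all` would stage every file in the directory, including any secrets.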

VirusTotal

55/55 vendors reported this skill as clean.
