deterministic-calc-skill

Security checks across malware telemetry and agentic risk

Overview

This calculator skill includes useful math helpers, but it also gives the agent broad local code, shell, and file access that needs manual review before installation.

Treat this as a high-authority local execution helper, not just a calculator. Install it only if you intend to let the agent run local Python and shell commands and read or write arbitrary files. Prefer safe_eval() for untrusted math, and isolate or disable calculate(), run_python(), run_shell(), and write_file() unless you have a real sandbox and an explicit approval workflow.
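For reviewers unsure what a restricted math evaluator looks like, here is a minimal sketch of one common approach: parse the expression with Python's ast module and evaluate only allowlisted node types. The name safe_eval_sketch and the operator allowlist are assumptions for illustration; the skill's own safe_eval() is not shown in this report and may work differently.

```python
import ast
import operator

# Hypothetical allowlist; the skill's real safe_eval() may permit other operators.
# Exponentiation is left out on purpose because huge powers can exhaust CPU and memory.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval_sketch(expression: str):
    """Evaluate +, -, *, /, //, % over int/float literals; reject everything else."""
    tree = ast.parse(expression, mode="eval")  # raises SyntaxError on malformed input

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # type(...) check (not isinstance) so that bool literals are rejected too
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, subscripts, etc. all end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(tree)
```

Because function calls and names are never evaluated, an input such as `__import__('os').getcwd()` is rejected with ValueError instead of being executed.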

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Findings (15)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
dict: {"success": bool, "stdout": str, "stderr": str, "exit_code": int}
    """
    try:
        result = subprocess.run(
            ["python3", "-c", code],
            capture_output=True,
            text=True,
Confidence
98% confidence
Finding
result = subprocess.run( ["python3", "-c", code], capture_output=True, text=True, timeout=timeout )

subprocess module call

Medium
Category
Dangerous Code Execution
Content
dict: {"success": bool, "stdout": str, "stderr": str, "exit_code": int}
    """
    try:
        result = subprocess.run(
            command,
            shell=shell,
            capture_output=True,
Confidence
100% confidence
Finding
result = subprocess.run( command, shell=shell, capture_output=True, text=True, timeout=timeout )
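One possible way to narrow a call like the one flagged above is to drop the shell entirely and accept only an argument list whose executable is on an explicit allowlist. The sketch below is illustrative only: run_allowlisted and its allowed parameter are hypothetical names, and the returned dict simply mirrors the shape shown in the excerpt's docstring.

```python
import subprocess
from typing import Collection, Sequence


def run_allowlisted(
    argv: Sequence[str],
    allowed: Collection[str],
    timeout: float = 5.0,
) -> dict:
    """Run argv without a shell, but only if argv[0] is in the allowlist."""
    if not argv or argv[0] not in allowed:
        # Refuse before anything is spawned.
        return {
            "success": False,
            "stdout": "",
            "stderr": "executable not allowlisted",
            "exit_code": -1,
        }
    try:
        result = subprocess.run(
            list(argv),
            shell=False,          # no shell parsing, so no ; | && $(...) expansion
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"success": False, "stdout": "", "stderr": "timed out", "exit_code": -1}
    return {
        "success": result.returncode == 0,
        "stdout": result.stdout,
        "stderr": result.stderr,
        "exit_code": result.returncode,
    }
```

An allowlist only helps if its entries are narrow tools. Listing a shell or a general-purpose interpreter would bring back arbitrary execution.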

Description-Behavior Mismatch

High
Confidence
96% confidence
Finding
The README markets the package as a deterministic calculation skill, but it documents capabilities for arbitrary Python execution, shell command execution, and file access. This mismatch can cause integrators or agents to trust and enable a much broader tool than intended, creating command-execution and data-access risk if untrusted input reaches those functions.

Context-Inappropriate Capability

High
Confidence
98% confidence
Finding
Arbitrary shell execution is highly dangerous in a skill framed as a calculator because consumers may not expect operating-system command execution. If attacker-controlled or model-generated input is passed to run_shell, it can execute destructive commands, exfiltrate data, or alter the host environment.

Context-Inappropriate Capability

High
Confidence
97% confidence
Finding
Arbitrary Python execution exceeds the stated deterministic-calculation purpose and enables full code execution on the host. In an agent context, model-produced or user-supplied code could read secrets, modify files, make network requests, or chain into broader compromise.

Context-Inappropriate Capability

High
Confidence
98% confidence
Finding
The skill is presented as a deterministic calculator, but it exposes a generic `run_shell(command)` capability that can execute arbitrary system commands. This greatly expands the attack surface beyond the stated purpose and could enable filesystem access, data exfiltration, environment discovery, or destructive command execution if an agent passes through untrusted input.

Context-Inappropriate Capability

High
Confidence
97% confidence
Finding
Exposing `run_python(code)` allows arbitrary code execution, which is not necessary for a deterministic calculator and undermines the safety boundary implied by the skill's purpose. An attacker or prompt-injected workflow could use this to read local files, access secrets, make network calls, or execute arbitrary logic under the agent's privileges.

Description-Behavior Mismatch

High
Confidence
96% confidence
Finding
The module presents itself as a deterministic calculator, but it also exposes arbitrary Python execution, shell execution, and file operations. This mismatch can mislead operators and downstream agents into granting trust or invoking capabilities far beyond the stated purpose, increasing the chance of unsafe use.

Intent-Code Divergence

High
Confidence
100% confidence
Finding
`calculate` appears to be a safe math helper, but when `safe_eval` rejects input it falls back to `run_python(f"print({expression})")`. That converts malformed or malicious expressions into executable Python, so an input that looks like a calculation request can trigger arbitrary code execution.
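To make the fallback risk concrete, the sketch below reproduces the string interpolation quoted in the finding and contrasts it with a fail-closed wrapper that returns an error instead of executing rejected input. The names build_fallback_source and calculate_fail_closed, the evaluator parameter, and the return shape are all hypothetical; this is not the skill's code.

```python
from typing import Callable


def build_fallback_source(expression: str) -> str:
    """Reproduce the source string the finding says is passed to run_python()."""
    return f"print({expression})"


def calculate_fail_closed(expression: str, evaluator: Callable[[str], float]) -> dict:
    """Return an error when the evaluator rejects input; never execute it."""
    try:
        return {"success": True, "result": evaluator(expression)}
    except (ValueError, SyntaxError, TypeError, ZeroDivisionError) as exc:
        # Fail closed: a rejection is the final answer, not a trigger for a fallback.
        return {"success": False, "error": f"rejected: {exc}"}


# The interpolated fallback turns a rejected "expression" into runnable Python:
payload = "__import__('os').system('id')"
print(build_fallback_source(payload))
```

The printed string is `print(__import__('os').system('id'))`, which is exactly what a Python interpreter would then run. With the fail-closed variant, the same payload only produces an error dict.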

Context-Inappropriate Capability

Critical
Confidence
100% confidence
Finding
Arbitrary shell execution is unrelated to deterministic calculation and materially expands the attack surface. In a skill ecosystem, such hidden general-purpose execution capability can be abused for remote command execution, persistence, credential theft, and destructive system changes.

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
Arbitrary Python execution is broader than the claimed calculator purpose and effectively gives callers a local code runner. That enables reading secrets, spawning processes, modifying the environment, and bypassing the protections of the safe evaluator.

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
The read/write helpers provide filesystem access that is not necessary for a calculator skill. While file access alone is not always critical, in an agent setting it can expose secrets, overwrite application files, or facilitate follow-on attacks when combined with other capabilities.
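If file helpers are kept at all, one mitigation is to confine them to a single working directory. The following is a minimal sketch under that assumption; resolve_inside and base_dir are hypothetical names, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, refusing anything that escapes it."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    # An absolute user_path replaces base in the join and is then rejected below.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise PermissionError(f"path escapes {base}: {user_path}")
    return candidate
```

A read or write helper would call resolve_inside() first and operate only on the returned path, so inputs such as `../outside.txt` or an absolute path to a secrets file raise PermissionError.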

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The skill exposes shell execution without any built-in warning, confirmation, or policy gate. That makes accidental or prompt-induced dangerous command execution much more likely, particularly when used by an autonomous or semi-autonomous agent.
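A policy gate of the kind this finding calls for could be as small as a decorator that consults an approval callback before the wrapped function runs. This is a sketch only: require_approval and the approve callback are hypothetical, and a real deployment would need to decide who approves and how the decision is recorded.

```python
import functools
from typing import Callable


def require_approval(approve: Callable[[str], bool]):
    """Wrap a command-running function so it runs only after approve(command) is True."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(command: str, *args, **kwargs):
            if not approve(command):
                # Denied commands never reach the wrapped function.
                raise PermissionError(f"command not approved: {command!r}")
            return func(command, *args, **kwargs)

        return wrapper

    return decorator
```

For interactive use, approve could prompt a human operator. For unattended agents, a deny-by-default callback is the safer starting point.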

Missing User Warnings

Medium
Confidence
86% confidence
Finding
Python code execution is made available without any confirmation or clear indication of its risk. In agent workflows, this increases the likelihood that untrusted input is executed directly, causing code execution and host compromise.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The script will automatically initialize a Git repository, stage all files, and create a commit if .git is absent, which modifies repository state without an explicit confirmation prompt. In a skill/publish helper context this can unexpectedly capture sensitive or irrelevant files into version control and create side effects the user did not clearly consent to.

VirusTotal

55/55 vendors flagged this skill as clean.
