Morgana Mordred Security Sandbox

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed educational security lab, but it can run real local commands and code, and its safety boundaries are weaker than the packaging implies.

Install only if you intentionally want a hands-on defensive security lab. Run it in a disposable container, VM, or throwaway workspace with no sensitive files mounted, and do not grant it broad access to your normal home directory. Treat the vaccine files as educational examples, not production-ready security patches, and use the skill only on the included mock systems or targets you are clearly authorized to test.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (17)

eval() call detected

High

Category: Dangerous Code Execution
Content:
    # This is FLAWED - should never use eval with user input
    try:
        # Simulating safe execution (but it's NOT safe)
        result = eval(code)
        return {"safe": True, "result": result}
    except Exception as e:
        return {"safe": False, "error": str(e)}
Confidence: 99%
Finding: result = eval(code)
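
One common way to avoid this pattern, when the input only needs to be a Python literal, is `ast.literal_eval`. The sketch below is an illustration and not the skill's code; the function name `evaluate_literal` and the `"ok"` key are assumptions chosen so the result does not claim to be "safe".

```python
import ast


def evaluate_literal(code: str) -> dict:
    """Parse a Python literal without executing arbitrary code.

    Hypothetical replacement for the flagged eval() call; the name and
    return shape are assumptions modelled on the excerpt above.
    """
    try:
        # literal_eval accepts only literals (numbers, strings, tuples,
        # lists, dicts, sets, booleans, None). Names and calls are rejected.
        result = ast.literal_eval(code)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError) as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "result": result}
```

This only covers the case where literals are enough. It is not a general sandbox for running untrusted programs.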

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    # Deliberately FLAWED - shell injection possible
    try:
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=5)
        return {"safe": True, "output": result.stdout, "error": result.stderr}
    except Exception as e:
        return {"safe": False, "error": str(e)}
Confidence: 99%
Finding: result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=5)
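
A minimal sketch of a less risky shape for this call, assuming the lab only needs to launch a small set of known programs: split the command into an argument list, check the program against an allowlist, and run it without a shell. `ALLOWED_PROGRAMS` and its contents are hypothetical, and the `"ok"` key replaces `"safe"` so the result reports only the exit status.

```python
import shlex
import subprocess
import sys

# Hypothetical allowlist; a real lab would choose its own programs.
ALLOWED_PROGRAMS = {sys.executable}


def run_command(cmd: str, timeout: float = 5.0) -> dict:
    """Run an allowlisted program without invoking a shell."""
    try:
        argv = shlex.split(cmd)
    except ValueError as exc:  # for example, unbalanced quotes
        return {"ok": False, "output": "", "error": str(exc)}
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        return {"ok": False, "output": "", "error": "program not allowed"}
    try:
        # shell=False: metacharacters such as ';' or '|' are passed to the
        # program as plain arguments and never interpreted by a shell.
        result = subprocess.run(
            argv, shell=False, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return {"ok": False, "output": "", "error": str(exc)}
    return {
        "ok": result.returncode == 0,
        "output": result.stdout,
        "error": result.stderr,
    }
```

An allowlisted interpreter can still run arbitrary code through its own arguments, so this narrows the injection path without making the call safe for untrusted input.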

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill declares powerful execution tools (`terminal`, `filesystem`) and includes commands for running Python scripts, cloning repositories, and invoking shell loops, yet no explicit permission model is documented in the metadata. For an agent skill centered on penetration testing and vulnerable systems, this capability mismatch increases the chance an agent is granted broad file/system access without adequate guardrails, enabling unintended code execution, file modification, or expansion into networked actions if the runner or scripts permit it.

Intent-Code Divergence

Low

Confidence: 99%
Finding: The function returns sensitive administrative data regardless of the provided token, meaning any caller can read protected information without authentication. In an auth-related skill, this is a direct access control failure that exposes secrets immediately and trivially.
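
For contrast, a token check that actually gates the data might look like the sketch below. The names `require_token` and `get_admin_data`, and passing the expected token in as a parameter, are assumptions for illustration; the report does not show the flagged function's signature.

```python
import hmac


def require_token(presented: str, expected: str) -> None:
    """Raise PermissionError unless the presented token matches."""
    # compare_digest runs in time independent of where the inputs differ,
    # which avoids leaking the token through timing.
    if not hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8")):
        raise PermissionError("invalid token")


def get_admin_data(token: str, expected_token: str, admin_data: dict) -> dict:
    """Return a copy of the admin data only after the token check passes."""
    require_token(token, expected_token)
    return dict(admin_data)
```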

Intent-Code Divergence

Low

Confidence: 99%
Finding: The function allows any caller to change the password of any listed user with no authentication, authorization, or session binding. This enables straightforward account takeover, including resetting privileged accounts such as admin.

Intent-Code Divergence

High

Confidence: 94%
Finding: The module is described as a sandbox, but the implementation and comments explicitly acknowledge that it permits arbitrary execution and does not isolate anything. This mismatch is security-relevant because downstream users or agents may trust the interface and expose it to untrusted input under false assumptions of safety.

Intent-Code Divergence

High

Confidence: 97%
Finding: execute_code returns "safe": True on successful eval() execution even though the function is explicitly unsafe and unrestricted. This misleading status can cause calling systems to treat dangerous execution as approved or sanitized, increasing the chance that untrusted code is accepted and propagated.

Intent-Code Divergence

High

Confidence: 97%
Finding: run_command marks shell-based command execution as "safe": True despite the function's own documentation stating it is vulnerable. This creates dangerous trust signaling for callers, who may rely on the returned flag to permit or automate execution of untrusted commands.

Intent-Code Divergence

High

Confidence: 98%
Finding: `filter_fields` bypasses all sensitivity checks when `allowed_fields` is provided and returns any requested key directly from the input record. That means a caller can explicitly request `ssn`, `credit_card`, `api_key`, or other secrets and receive them unredacted, defeating the stated purpose of the sanitizer and creating a direct information disclosure path.
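
A sketch of how the sensitivity check could stay in force even when `allowed_fields` is supplied. The `SENSITIVE_FIELDS` set is an assumption built from the field names mentioned in the finding, and the signature is a guess at the original's shape.

```python
# Assumed deny-list, taken from the field names the finding mentions.
SENSITIVE_FIELDS = frozenset({"ssn", "credit_card", "api_key"})


def filter_fields(record: dict, allowed_fields=None) -> dict:
    """Return the requested fields, always dropping sensitive ones."""
    keys = record.keys() if allowed_fields is None else allowed_fields
    # The deny-list applies on both paths, so an explicit request for
    # "ssn" is dropped instead of bypassing the check.
    return {
        key: record[key]
        for key in keys
        if key in record and key not in SENSITIVE_FIELDS
    }
```

A deny-list still leaks any sensitive field nobody thought to list, so a fixed set of public fields is often the stricter choice.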

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The file presents itself as a patch that exposes only safe public fields, but the `allow_sensitive=True` option re-enables return of sensitive and non-public data. In a security control, this kind of built-in bypass is dangerous because downstream developers may trust the documentation and instantiate it insecurely, leading to accidental leakage of secrets or regulated personal data.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The file presents itself as a security patch while still shipping an intentionally vulnerable authentication function in the same module. This is dangerous because developers may import or call the unsafe path by mistake, reintroducing SQL injection and authentication bypass despite trusting the file as a remediation artifact.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The class is described as secure, but it exposes an unsafe authentication API that performs SQL query construction with f-strings. This mismatch increases the chance of accidental use by downstream code, enabling SQL injection and possible login bypass against the embedded database.
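
The usual remedy for query construction with f-strings is a parameterized query. The sketch below assumes the embedded database is SQLite and invents a `users` table with salted password hashes; neither the schema nor the function names come from the scanned file.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS users "
    "(username TEXT PRIMARY KEY, salt BLOB, password_hash BLOB)"
)


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def authenticate(conn: sqlite3.Connection, username: str, password: str) -> bool:
    """Check credentials using a bound parameter instead of string formatting."""
    # The "?" placeholder sends the username as data, so quotes or SQL
    # keywords in it cannot change the structure of the query.
    row = conn.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?",
        (username,),
    ).fetchone()
    if row is None:
        return False
    salt, stored_hash = row
    return hmac.compare_digest(hash_password(password, salt), stored_hash)
```

With this shape, a username such as `' OR '1'='1` is looked up literally and matches no row.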

Intent-Code Divergence

High

Confidence: 99%
Finding: The sandbox explicitly whitelists Python's built-in open(). Comments claim file access is limited to read-only, but no wrapper enforces mode restrictions. Any executed code that bypasses or avoids the simple substring check can use open to read or modify local files, defeating the sandbox's stated isolation guarantees.
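
A wrapper of the kind the finding says is missing could look like the sketch below. `read_only_open` and `READ_ONLY_MODES` are hypothetical names. The wrapper enforces only the mode restriction; it does not limit which paths can be read, and it does not stop code that reaches the real `open` through introspection.

```python
import builtins

# Modes that can only read; anything else ("w", "a", "x", "r+", ...) is refused.
READ_ONLY_MODES = frozenset({"r", "rt", "rb"})


def read_only_open(file, mode="r", **kwargs):
    """Open a file for reading and refuse every other mode."""
    if mode not in READ_ONLY_MODES:
        raise PermissionError(f"mode {mode!r} is not allowed; read-only access only")
    return builtins.open(file, mode, **kwargs)
```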

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The code defines FORBIDDEN_MODULES but never actually uses it during execution, and instead blocks imports with brittle substring matching such as 'import ' and 'from '. Attackers can often evade text filters via encoding tricks, string construction, object graph traversal, or other Python introspection paths, so the claimed import protection is not reliably enforced.
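
To illustrate the difference between substring matching and a structural check, the sketch below walks the parsed syntax tree and reports imports of forbidden modules. The contents of `FORBIDDEN_MODULES` are assumed, since the report does not list them, and `find_forbidden_imports` is a hypothetical helper. It detects only `import` and `from ... import` statements; calls such as `__import__` or `importlib.import_module`, and introspection tricks, still get through.

```python
import ast

# Assumed contents; the scanned file's actual list is not shown in the report.
FORBIDDEN_MODULES = frozenset({"os", "subprocess", "socket"})


def find_forbidden_imports(source: str) -> set:
    """Return the forbidden top-level modules imported by the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have module=None.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "os.path" counts as "os"
            if root in FORBIDDEN_MODULES:
                found.add(root)
    return found
```

Unlike a text filter, this does not flag a string literal that merely contains the words `import os`.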

Missing User Warnings

Medium

Confidence: 79%
Finding: The runner executes local Python files without any interactive warning, confirmation, or trust verification, which can lead operators to run untrusted or modified test code unintentionally. In this skill's context, the executed files are explicitly security-testing programs, so compromise of the `systems` directory could directly result in arbitrary code execution under the runner's privileges.
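
One possible form of trust verification is to compare each file's SHA-256 digest against a pinned manifest before running it. This is a sketch under assumptions: the manifest format (file name mapped to a hex digest) and the helper names are invented here, and nothing in the report says the runner has such a manifest.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_trusted(path: Path, manifest: dict) -> bool:
    """Return True only if the file's digest matches its pinned entry.

    `manifest` is a hypothetical mapping of file name to expected digest.
    """
    expected = manifest.get(path.name)
    if expected is None:
        return False  # files missing from the manifest are never trusted
    return hmac.compare_digest(expected, sha256_of(path))
```

A runner could call `is_trusted` and refuse to execute, or ask for confirmation, when it returns `False`.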

Missing User Warnings

High

Confidence: 99%
Finding: The main block prints the results of get_user_info("alexandre") and get_secret("api_key"), which directly exposes PII and credentials to any output consumer. In this file's context, the helper functions have no authentication or authorization, so the print statements demonstrate and operationalize immediate disclosure of highly sensitive data, making the skill more dangerous rather than less.

Missing User Warnings

Medium

Confidence: 96%
Finding: The file hard-codes a session token and sensitive admin data, then exposes that data through insecure functions. Hard-coded secrets are easily leaked through source access, logs, or reuse, and in this case the surrounding access-control flaws make exploitation even easier.
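
A common alternative to a hard-coded token is to read it from the environment at run time and fail if it is absent. The variable name `LAB_SESSION_TOKEN` and the function name are assumptions for illustration.

```python
import os


def load_session_token(env_var: str = "LAB_SESSION_TOKEN") -> str:
    """Read the session token from the environment instead of source code.

    The default variable name is hypothetical.
    """
    token = os.environ.get(env_var)
    if not token:
        # Failing loudly avoids silently falling back to a built-in default.
        raise RuntimeError(f"environment variable {env_var} is not set")
    return token
```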

VirusTotal

VirusTotal findings are pending for this skill version.
