Backtest Poller

Security checks across malware telemetry and agentic risk

Overview

The skill mostly does what it says, but its background daemon includes an unsafe notification command path that could run local shell commands from crafted backtest text.

Review this skill before installing it with real QuantConnect credentials. Use a limited token if possible. Choose drawdown thresholds carefully, because early-stop permanently deletes backtests. Until the notification code is fixed, avoid quotes and other shell metacharacters in backtest names. Disable auto-diagnosis unless you trust the imported forensics module.

SkillSpector

By NVIDIA

Vulnerability Patterns

Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (8)

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content:
    try:
        title = "Backtest Poller"
        message = f"Backtest {bt.name} finished ({bt.status})"
        os.system(
            f"""osascript -e 'display notification "{message}" with title "{title}"'"""
        )
        logger.info(f"Notification sent: {message}")
Confidence: 98%
Finding: os.system( f"""osascript -e 'display notification "{message}" with title "{title}"'""" )
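
One way to close this path is to skip the shell entirely and hand the notification text to osascript as arguments instead of interpolating it into AppleScript source. The following is a minimal sketch, assuming macOS with osascript on the PATH and Python 3.9 or later. The helper names build_notification_argv and notify are hypothetical and are not part of the skill.

```python
import subprocess

# AppleScript run handler: the message and title arrive through argv,
# so backtest text is never parsed as AppleScript or shell source.
_SCRIPT_LINES = (
    "on run argv",
    "display notification (item 1 of argv) with title (item 2 of argv)",
    "end run",
)


def build_notification_argv(message: str, title: str) -> list[str]:
    """Build an argument vector for osascript (hypothetical helper)."""
    argv = ["osascript"]
    for line in _SCRIPT_LINES:
        argv += ["-e", line]
    # Untrusted text is appended as plain arguments, one list element each.
    argv += [message, title]
    return argv


def notify(message: str, title: str = "Backtest Poller") -> bool:
    """Send a notification without a shell; return True on success."""
    try:
        result = subprocess.run(
            build_notification_argv(message, title),
            check=False,
            capture_output=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # osascript is missing (non-macOS host) or did not return in time.
        return False
    return result.returncode == 0
```

Because subprocess.run receives a list and no shell is involved, quotes or semicolons in a backtest name stay inside a single argument rather than terminating the command string.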

Dynamic import via __import__()

Medium

Category: Dangerous Code Execution
Content:
        "forensics",
    ]:
        try:
            mod = __import__(module_path, fromlist=["OrderForensics"])
            OrderForensics = getattr(mod, "OrderForensics", None)
            if OrderForensics:
                break
Confidence: 81%
Finding: mod = __import__(module_path, fromlist=["OrderForensics"])
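
A narrower version of this lookup could restrict imports to an explicit allow-list. The sketch below is illustrative only: the allow-list contents and the function name load_order_forensics are assumptions, and only the module name "forensics" and the class name OrderForensics come from the excerpt above.

```python
import importlib

# Hypothetical allow-list: only these exact module names may be imported.
ALLOWED_FORENSICS_MODULES = frozenset({"forensics"})


def load_order_forensics(candidates):
    """Return the first OrderForensics class found, or None if absent."""
    for module_path in candidates:
        if module_path not in ALLOWED_FORENSICS_MODULES:
            raise ValueError(f"module not allow-listed: {module_path!r}")
        try:
            mod = importlib.import_module(module_path)
        except ImportError:
            # Module is not installed; try the next candidate.
            continue
        cls = getattr(mod, "OrderForensics", None)
        if cls is not None:
            return cls
    return None
```

An allow-list limits which names are imported, not which file on the Python path satisfies them, so the location of the forensics module still has to be trusted.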

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill declares required environment variables and binaries in metadata but does not declare explicit permissions despite clearly describing capabilities that read secrets from the environment, write persistent state/results, access the network, and launch shell/background processes. This is dangerous because users and any permission-gating system may underestimate the effective trust boundary of the skill, especially since it runs as a nohup daemon and can continue operating after the terminal disconnects.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 94%
Finding: The documented behavior goes beyond passive monitoring: it can submit jobs, delete backtests irreversibly, run a background daemon, invoke macOS notifications via osascript, and dynamically execute diagnosis logic from the Python path. A description-behavior mismatch is security-relevant because operators may approve a 'monitor' skill without realizing it has destructive API actions and code-execution/extensibility surfaces, increasing the chance of unintended data loss or abuse.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: Executing host shell commands is broader capability than needed and increases attack surface, especially when combined with interpolated backtest data. In this specific file, the shell use is not merely unnecessary design-wise; it compounds into a command-injection path in a background monitoring daemon.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The client exposes a destructive `delete_backtest` capability even though the skill is framed primarily as a monitoring/polling daemon. In an agent setting, adding unnecessary mutation/abort functionality increases the blast radius: if the skill is misused, prompted incorrectly, or compromised, it can terminate user backtests rather than merely observe them.
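
One possible mitigation is to make deletion opt-in, so the daemon observes by default. The sketch below assumes a client object with a delete_backtest(project_id, backtest_id) method. That signature, the guarded_delete wrapper, and the allow_delete flag are all hypothetical; the report names only delete_backtest itself.

```python
def guarded_delete(client, project_id, backtest_id, *, allow_delete=False):
    """Delete a backtest only when the operator has explicitly opted in.

    Returns True if the delete call was issued, False if it was skipped.
    """
    if not allow_delete:
        # Observe-only mode: report the decision, leave the backtest intact.
        return False
    # Assumed signature; the real client's method may differ.
    client.delete_backtest(project_id, backtest_id)
    return True
```

With a default of allow_delete=False, a misconfigured or prompted caller cannot remove a backtest unless the operator has turned the flag on.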

Unpinned Dependencies

Low

Category: Supply Chain
Content:
    requests>=2.28.0
    python-dotenv>=1.0.0
Confidence: 95%
Finding: requests>=2.28.0

Unpinned Dependencies

Low

Category: Supply Chain
Content:
    requests>=2.28.0
    python-dotenv>=1.0.0
Confidence: 94%
Finding: python-dotenv>=1.0.0
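
Both findings concern requirement lines that set a lower bound instead of an exact version. As a rough illustration, a check like the following could flag such lines in a requirements file. The function name find_unpinned is hypothetical, and the test is deliberately simple: a line counts as pinned only if it contains "==".

```python
def find_unpinned(requirement_lines):
    """Return requirement specifiers that lack an exact '==' pin."""
    unpinned = []
    for raw in requirement_lines:
        # Drop trailing comments and surrounding whitespace.
        spec = raw.split("#", 1)[0].strip()
        if not spec:
            continue
        if "==" not in spec:
            unpinned.append(spec)
    return unpinned
```

For the two lines quoted in the findings, find_unpinned returns both specifiers. Rewriting them with exact versions (for example requests==2.28.0, shown only to illustrate the syntax, not as a recommended version) would return an empty list.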

VirusTotal

62 of 62 vendors reported this skill as clean.
