Kalshalyst

Security checks across malware telemetry and agentic risk

Overview

This looks like a real Kalshi trading tool, but it needs Review because it can place and cancel live orders even though it is partly described as a scanner.

Install only if you intentionally want a full prediction-market trading stack, not just read-only market analysis. Start in dry-run, keep auto_trader_config disabled until you review risk limits, protect Kalshi and Anthropic credentials, inspect any sibling kalshi-command-center and prompt-lab code, and avoid Slack/webhook exports unless you are comfortable sharing trade details externally.
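The "start in dry-run, keep auto-trading disabled" advice can be made concrete with a small gate that every order path must pass. This is a minimal sketch only: `TradingSafetyConfig`, its field names, and the per-order cap are hypothetical and are not taken from the skill's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TradingSafetyConfig:
    # Hypothetical settings; the skill's real config keys may differ.
    dry_run: bool = True               # default to simulation only
    auto_trader_enabled: bool = False  # mirrors keeping auto-trading disabled
    max_order_cost_usd: float = 5.0    # illustrative per-order cap


def may_place_live_order(cfg: TradingSafetyConfig, order_cost_usd: float) -> bool:
    """Return True only when every safety condition allows a live order."""
    if cfg.dry_run or not cfg.auto_trader_enabled:
        return False
    return 0 < order_cost_usd <= cfg.max_order_cost_usd
```

With the defaults, no live order is allowed; an operator has to flip both flags deliberately, and even then the cost cap still applies.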

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (16)

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The README materially expands the skill from a scanner/alerting component into an autonomous trading and order-management system, including live execution and stale-order cancellation. This kind of scope mismatch is dangerous because users, orchestrators, or reviewers may grant permissions and deploy the skill under the assumption that it only analyzes markets, when the documented behavior includes placing and canceling real financial orders.

Intent-Code Divergence

Medium

Confidence: 88%
Finding: The README presents the skill as self-executing and 'all hands-off,' which conflicts with a manifest-level framing that it only feeds another component for execution. This can mislead operators into enabling the skill in contexts with weaker controls or different trust assumptions, increasing the chance of unintended autonomous financial actions.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The bug-fix protocol instructs the agent to edit source files, modify installed copies, and stage git commits, which is unrelated to the stated scanning function and creates a pathway for persistent code modification. In an agent setting, this increases the risk of unauthorized tampering, persistence, and supply-chain style changes under the guise of maintenance.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: This file implements autonomous order placement and stale-order cancellation, which materially exceeds the declared skill role of scanning, classifying, and alerting on market opportunities. That capability expansion is dangerous because a user expecting analysis-only behavior could unknowingly grant a component live trading authority over funds.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The script reads a Slack webhook secret from environment/config and transmits trading-related messages externally, but that capability is not justified by the scanner-focused description. In a financial trading context, silent outbound notifications can expose sensitive positions, tickers, and activity metadata to third parties without the user's clear awareness.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The code imports and relies on a sibling execution skill, effectively inheriting trading capabilities while the manifest frames this component as feeding edge data to that separate system. This hidden cross-skill coupling undermines least privilege and can cause users to trust an analysis skill that actually executes or enables live trades.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The file’s documented purpose and features materially expand from a market-scanning/edge-alerting skill into portfolio monitoring, trade logging, trade resolution, and export of execution data. In an agent-skill setting, this scope creep is security-relevant because it increases access to sensitive brokerage/account data and enables persistence of trading history beyond what users would reasonably expect from a scanner.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: This code queries live Kalshi portfolio state, including balances, positions, and open orders, which goes beyond producing scanner edge data. In the context of a prediction-market scanner, accessing execution/account state broadens the blast radius to sensitive financial information and creates an unnecessary trust boundary crossing.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The code directly retrieves cash balance, positions, and orders from a live trading account even though the skill is described as a scanner. That mismatch makes the behavior more dangerous because users may supply credentials expecting analysis-only behavior, while the skill silently gains visibility into sensitive financial state.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The user-facing market scope text claims the hockey model 'beats market baseline,' while the module header explicitly says corrected evaluation found hockey is effectively a wash. In a trading skill, this kind of performance misrepresentation can cause users or downstream agents to over-trust a model and allocate capital based on false assumptions, making it a genuine security/integrity issue rather than a harmless documentation mistake.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The premium short scope text repeats the unsupported claim that the hockey model beats the market baseline, contradicting the file's own corrected evaluation notes. Because this skill is explicitly used for prediction-market scanning and feeds downstream trading decisions, misleading efficacy claims can directly influence automated or human trading behavior.

Missing User Warnings

Medium

Confidence: 91%
Finding: This document gives highly actionable betting and bankroll-sizing guidance, including formulas, examples, risk settings, and tuning recommendations, but does not present a clear user-facing warning about financial loss, model error, or the speculative nature of prediction markets. In the context of a trading skill that converts model estimates into position sizes, omission of risk disclosures can encourage overreliance on the system and lead users to commit capital without understanding downside, drawdowns, calibration error, or regulatory considerations.

Missing User Warnings

Medium

Confidence: 90%
Finding: Slack notifications include trading activity such as side, ticker, contracts, and cost, yet this file provides no user-facing warning or confirmation before transmitting that information off-host. In a trading environment, those details can reveal strategy, positions, and timing to external services, creating privacy and operational risk.
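One way to address this finding is to require explicit opt-in before anything leaves the host and to withhold trade details by default. The sketch below is illustrative, not the skill's code: the function names, the `opted_in` flag, and the trade dictionary keys are assumptions. The only external convention relied on is that Slack incoming webhooks accept a JSON body with a `text` field.

```python
import json
import urllib.request
from typing import Optional


def build_slack_payload(trade: dict, include_details: bool = False) -> dict:
    """Build a message body, omitting trade details unless explicitly allowed."""
    if include_details:
        text = "Trade: {side} {contracts} x {ticker} (cost {cost})".format(**trade)
    else:
        text = "A trade event occurred (details withheld)."
    return {"text": text}


def notify_slack(trade: dict, webhook_url: Optional[str], opted_in: bool,
                 include_details: bool = False) -> bool:
    """Send only when the operator opted in and a webhook is configured."""
    if not opted_in or not webhook_url:
        return False  # nothing is transmitted off-host
    body = json.dumps(build_slack_payload(trade, include_details)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200
```

Keeping payload construction separate from sending makes it easy to log or review exactly what would be shared before enabling the webhook.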

Missing User Warnings

Medium

Confidence: 93%
Finding: Portfolio snapshots containing cash balances, positions, and timestamps are automatically appended to a local JSONL file without an explicit user warning at the point of collection. Persistent storage of sensitive financial data increases exposure to local compromise, accidental disclosure, backups/sync leakage, and secondary use by other tools or users on the host.
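If snapshots must be persisted, exposure can at least be narrowed by creating the file with owner-only permissions. The following is a hedged sketch, not the skill's implementation; `append_snapshot` and the snapshot fields are hypothetical. Note that the mode passed to `os.open` applies only when the file is first created, and it is largely ignored on Windows.

```python
import json
import os


def append_snapshot(path: str, snapshot: dict) -> None:
    """Append one JSON line to a file created with owner-only permissions."""
    # Mode 0o600 applies only at creation time; an existing file keeps its mode.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    with os.fdopen(fd, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(snapshot) + "\n")
```

A call such as `append_snapshot("snapshots.jsonl", {"cash": 100.0, "positions": []})` would append one line per snapshot; the file name here is only an example.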

Natural-Language Policy Violations

Medium

Confidence: 93%
Finding: The description makes an unsupported quantitative-performance claim about the hockey model. In the context of a market-trading skill, unsupported claims are especially risky because they can distort operator trust, system routing decisions, and position sizing, even if no code execution bug is present.

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The premium short description is misleading because it asserts the hockey model beats baseline despite the corrected results indicating no demonstrated edge. In this domain, concise summary text is likely to be surfaced in dashboards or alerts, so even a short misleading string can materially bias user behavior and create financial risk.

VirusTotal

65/65 vendors flagged this skill as clean.
