Fraud Filter

Security checks across malware telemetry and agentic risk

Overview

This skill's behavior matches its stated fraud-filtering purpose, but it governs payment decisions and report submission through broad local and remote mechanisms that should be reviewed before installation.

Install only if you are comfortable with a skill that can monitor and sometimes block payment-like tool calls. Keep the dashboard stopped when not needed, review queued reports before enabling network participation or flushing, avoid custom trust/report URLs, and prefer a version with authenticated localhost write APIs, URL allowlisting, signed or validated remote feeds, consistent sync_hotlist behavior, and clearer reporting disclosures.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (14)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 88%
Finding: The skill advertises shell, network, and environment-dependent behavior but does not declare permissions. That creates a transparency and policy-enforcement gap: operators may approve the skill without realizing it can make network requests, run local commands, and access environment-derived context, increasing the chance of unintended data exposure or execution in sensitive environments.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The documented purpose is narrowly framed as endpoint reputation checks and local queuing, but the skill also exposes a local dashboard/API, performs remote sync behavior, supports optional report submission, and manages configuration. This mismatch is dangerous because users and review systems may grant trust based on a narrower threat model than the skill actually requires, especially when the dashboard/API could expose trust data and queued reports locally.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The documentation is internally inconsistent: it says reports are only queued locally and network reporting is opt-in, but elsewhere says failures are reported back to the network automatically. For a tool handling transaction outcome data, ambiguity about whether data leaves the machine can cause unintentional disclosure and prevent informed consent by users.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The hook automatically queues a failure report whenever a payment-related tool returns an empty, null, or error-like response, with no visible consent check in this file. Even if queueReport is only local, this still creates security and privacy risk because sensitive payment endpoint metadata is collected by default despite the stated behavior that network reporting is opt-in, and queued data may later be transmitted or processed unexpectedly.
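
One way to close this gap is an explicit consent check in front of the local queue. The following is a minimal sketch, not the skill's actual code: `ReportQueue`, `collection_enabled`, and `queue_failure` are hypothetical names, and the default of "off" is an assumption about what opt-in behavior would look like.

```python
from dataclasses import dataclass, field


@dataclass
class ReportQueue:
    """Hypothetical local queue that collects nothing unless the operator opted in."""

    collection_enabled: bool = False  # assumption: collection is off by default
    items: list = field(default_factory=list)

    def queue_failure(self, endpoint: str, outcome: str) -> bool:
        """Queue a failure report only when collection was explicitly enabled.

        Returns True if the report was queued, False if it was dropped.
        """
        if not self.collection_enabled:
            return False
        self.items.append({"endpoint": endpoint, "outcome": outcome})
        return True
```

With this shape, a hook that sees an empty or error-like payment response can still call `queue_failure`, but nothing is stored until the operator sets `collection_enabled`.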

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The documentation explicitly states that anonymous outcome signals are sent to a remote endpoint, which conflicts with the stated product behavior that reporting is opt-in and locally queued by default. This kind of documentation mismatch is security-relevant because implementers or reviewers may enable network telemetry by default, causing unintended data transmission and weakening user consent expectations.

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The script documentation states it downloads from the configured CDN, but the implementation also accepts an arbitrary caller-supplied --url and writes the fetched JSON into the local trust database. In this skill's context, that can let a local caller replace the reputation dataset with attacker-controlled content, undermining payment endpoint trust decisions and potentially causing false approvals or denials.
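
URL allowlisting is one mitigation for a caller-supplied `--url`. The sketch below is illustrative only: the report does not name the configured CDN host, so `ALLOWED_FEED_HOSTS` and `cdn.example.com` are placeholders, and `is_allowed_feed_url` is a hypothetical helper rather than part of the skill.

```python
from urllib.parse import urlsplit

# Placeholder allowlist; the real CDN host is not named in this report.
ALLOWED_FEED_HOSTS = {"cdn.example.com"}


def is_allowed_feed_url(url: str) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example, a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https":
        return False
    # Reject embedded credentials such as https://trusted-host@evil.test/
    if parts.username is not None or parts.password is not None:
        return False
    return parts.hostname in ALLOWED_FEED_HOSTS
```

A sync script could call this check before fetching and refuse to write to the local trust database when it returns False.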

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The code introduces hourly remote fetching of a server-controlled hotlist, which materially changes behavior from a local-only, opt-in reporting/trust model to one that depends on external network-delivered policy. In a payment-filtering context, that means a remote service or any party able to influence it can silently alter which endpoints are blocked, creating both privacy and availability concerns and undermining operator expectations set by the skill description.
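
A signed or validated feed would limit how silently a remote party can change blocking policy. The sketch below assumes a feed body accompanied by a hex HMAC-SHA256 signature and a JSON object with a `blocked` list of strings; the report does not describe the real hotlist format, so the field name, the signature scheme, and `verify_and_parse_hotlist` are all assumptions.

```python
import hashlib
import hmac
import json


def verify_and_parse_hotlist(body: bytes, signature_hex: str, key: bytes) -> list:
    """Verify an HMAC-SHA256 signature, then validate the (assumed) feed shape."""
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected, signature_hex):
        raise ValueError("hotlist signature mismatch")
    data = json.loads(body)
    entries = data.get("blocked") if isinstance(data, dict) else None
    if not isinstance(entries, list) or not all(isinstance(e, str) for e in entries):
        raise ValueError("unexpected hotlist shape")
    return entries
```

HMAC requires a shared secret on every installation, so a production design would more likely use an asymmetric signature; the point here is only that the feed is rejected before it reaches the local trust data.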

Intent-Code Divergence

Low

Confidence: 78%
Finding: The comments downplay side effects by suggesting the cached file is simply read directly, while this module actively performs network synchronization and local writes. That mismatch is dangerous because reviewers and operators may underestimate outbound connectivity and remote policy influence, reducing informed consent and making security review less effective.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The plugin performs outbound network synchronization by default at startup and hourly thereafter, which conflicts with the stated behavior that network reporting is opt-in. Even if only a hotlist is downloaded, undisclosed default network activity expands the trust boundary, can surprise users in restricted environments, and undermines informed consent about when the skill contacts external services.

Missing User Warnings

Medium

Confidence: 87%
Finding: The hook performs automatic reporting logic after payment-tool execution without any user-facing warning or confirmation in the hook itself. In the context of a payment-related skill, silently collecting and queueing endpoint URLs associated with failed payment attempts is more sensitive than in a generic utility, because it can expose merchant, wallet, or transaction-target metadata and create non-obvious telemetry.

Missing User Warnings

Medium

Confidence: 90%
Finding: The documentation describes sending endpoint hashes and a stable reporter hash to a remote service without an explicit privacy warning, consent explanation, or discussion of re-identification risk. Even when hashed, these values can still function as persistent pseudonymous identifiers and may reveal behavioral patterns about installations and payment usage.
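
The re-identification concern can be illustrated with a short sketch. The report does not say how the skill derives its reporter hash, so both functions below are hypothetical: the first shows why an unsalted, stable hash links every report from one installation, and the second shows one possible alternative that rotates the identifier per reporting period.

```python
import hashlib
import hmac


def stable_reporter_hash(install_id: str) -> str:
    """Same input always yields the same digest, so all reports are linkable."""
    return hashlib.sha256(install_id.encode("utf-8")).hexdigest()


def rotating_reporter_hash(install_secret: bytes, period: str) -> str:
    """Keyed hash over a period label (e.g. an ISO week), so the identifier
    changes each period and cannot be linked across periods without the secret."""
    return hmac.new(install_secret, period.encode("utf-8"), hashlib.sha256).hexdigest()
```

A rotating identifier still allows deduplication within a period while reducing the long-term behavioral profile a remote service can build.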

Missing User Warnings

Medium

Confidence: 92%
Finding: The report submission code sends user-supplied endpoint URL, outcome, and amount to `/api/reports` immediately when the user clicks "Queue Report," but the UI does not present a clear just-in-time disclosure of exactly what data will be stored or later transmitted. In a payment-reputation skill, endpoint and transaction metadata can be sensitive, so collecting and forwarding it without an explicit send-time warning or confirmation creates a privacy and data-sharing risk.

Missing User Warnings

Medium

Confidence: 88%
Finding: The plugin queues reports containing full payment endpoint URLs, which may reveal sensitive business relationships, tenant identifiers, or transaction-specific paths. In this skill's context, payment-related endpoints are especially sensitive metadata, and collecting them without clear disclosure or consent creates a privacy and data-handling risk, particularly if the queued reports are later transmitted when reporting is enabled.
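
Reducing each URL to its origin before queueing is one way to limit this exposure. The sketch below is an assumption about what such minimization could look like, not the plugin's behavior; `redact_endpoint` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def redact_endpoint(url: str) -> str:
    """Keep only scheme, host, and explicit port; drop path, query,
    fragment, and any embedded credentials."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = f":{parts.port}" if parts.port else ""
    return f"{parts.scheme}://{host}{port}"
```

For example, `redact_endpoint("https://pay.example.com/tenants/acme/charge?token=abc")` yields `https://pay.example.com`, which still identifies the payment host for reputation purposes without the tenant path or token.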

External Transmission

Medium

Category: Data Exfiltration
Content: ### Anonymous outcome signal Sent to `POST https://api.fraud-filter.com/reports`: ```json {
Confidence: 88%
Finding: https://api.fraud-filter.com/

VirusTotal

65/65 vendors flagged this skill as clean.
