Payclaw Badge Pub

Security checks across malware telemetry and agentic risk

Overview

The skill is mostly an identity-badge integration, but its commerce/security claims and telemetry behavior do not consistently match the code and docs.

Review before installing. Use this only if you are comfortable giving PayClaw an API key and sharing merchant identity and trip outcome data. Treat the badge-presentation reporting and automatic check-back claims cautiously unless the publisher fixes the wiring or clarifies that they are not active in this version. Pin the npm package version you reviewed.
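As a rough aid for the pinning advice, the sketch below flags dependency specs that are not an exact version. It is a simplified check under stated assumptions: the regular expression accepts only `major.minor.patch` with optional prerelease or build suffixes and is not a full semver parser, and the helper names `isPinned` and `unpinnedDependencies` are hypothetical.

```typescript
// Simplified pattern: exact "major.minor.patch", optionally followed by a
// prerelease and/or build suffix. Ranges (^, ~, >=) and tags do not match.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// True when the spec names one exact version rather than a range or tag.
function isPinned(spec: string): boolean {
  return EXACT_VERSION.test(spec.trim());
}

// Returns the names of dependencies whose spec is not an exact version.
function unpinnedDependencies(deps: Record<string, string>): string[] {
  return Object.entries(deps)
    .filter(([, spec]) => !isPinned(spec))
    .map(([name]) => name);
}
```

Passing the `dependencies` object from a `package.json` to `unpinnedDependencies` lists the entries that could resolve to a version other than the one reviewed.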

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (7)

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The README describes behavior that goes beyond a transient identity declaration: it persists a Consent Key for future use and includes a tool to report when a badge was presented to a merchant. This creates a mismatch between the apparent scope of the skill and its documented capabilities, which can undermine informed user consent, expand data retention, and introduce tracking or authorization persistence risks if users expect a one-time credential exchange.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The tool explicitly claims to record badge presentation with PayClaw, but the implementation only checks for an API key and returns a success message without performing any network call or persistence. This creates a deceptive security control: downstream agents or users may rely on a badge-reporting workflow that never actually occurs, leading to false assurance, audit gaps, and unsafe merchant interactions.
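The pattern this finding describes can be illustrated with a short sketch. It is not the skill's source: the handler names, the `Transport` type, and the record shape are all hypothetical, chosen only to contrast a handler that checks for an API key and reports success with one that actually hands the record to a transport first.

```typescript
// Hypothetical names throughout; this is not PayClaw's code.
interface PresentationRecord {
  merchant: string;
  token: string;
}

// Stand-in for whatever would carry the record to a remote service.
type Transport = (record: PresentationRecord) => void;

interface ToolResult {
  ok: boolean;
  message: string;
}

type Handler = (
  apiKey: string | undefined,
  record: PresentationRecord,
  send: Transport,
) => ToolResult;

// Pattern described in the finding: key check, success message, nothing sent.
const reportBadgePresentedStub: Handler = (apiKey, record, _send) => {
  if (!apiKey) {
    return { ok: false, message: "API key missing" };
  }
  return { ok: true, message: `Badge presentation recorded for ${record.merchant}` };
};

// What the tool description implies: the record reaches the transport
// before success is claimed.
const reportBadgePresentedWired: Handler = (apiKey, record, send) => {
  if (!apiKey) {
    return { ok: false, message: "API key missing" };
  }
  send(record);
  return { ok: true, message: `Badge presentation recorded for ${record.merchant}` };
};

// Counts how many times a handler invokes the transport for one call.
function countSends(handler: Handler): number {
  let calls = 0;
  handler("test-key", { merchant: "example-merchant", token: "t-1" }, () => {
    calls += 1;
  });
  return calls;
}
```

Both handlers return the same success message, so the difference is visible only by observing the transport. A reviewer checking a fixed version could use the same approach: inject or intercept the network layer and confirm a call is made.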

Intent-Code Divergence

High

Confidence: 98%
Finding: The documentation promises that PayClaw will check back after badge presentation, but no code triggers any follow-up workflow, scheduling, callback registration, or remote API request. In a security-sensitive identity/assertion product, this mismatch is dangerous because it can mislead agents and merchants into believing there is post-presentation verification or monitoring when none exists.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: This code performs post-presentation surveillance of the agent by asking follow-up questions about merchant behavior and then reporting inferred outcomes to the PayClaw API. That materially exceeds the stated purpose of merely presenting a credential, creating an undisclosed telemetry channel about agent activity and outcomes.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The skill uses MCP sampling to interrogate the agent after badge presentation, which is a monitoring capability rather than a credential-presentation function. Because the sampled answer is then used to classify and report success or denial, this creates covert behavioral monitoring that an integrator may not expect from the described skill.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The tool is described as an identity-only badge, but its formatted output tells the agent/user that spend is available and explicitly instructs calling a payment function. That creates capability confusion and can mislead an agent into attempting or presenting payment actions beyond the expected trust boundary of this skill, which is especially risky in ecommerce flows where users and merchants may infer broader authorization than intended.

Missing User Warnings

Medium

Confidence: 96%
Finding: The code transmits merchant identifiers, verification tokens, outcome classifications, and a truncated detail field derived from the agent response to an external API. Sending interaction details off-platform without a clear warning or consent path creates privacy and data-governance risk, especially because the detail may contain more information than intended.
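One way to narrow this exposure is to send only the structured fields and to include free text only when the user has agreed to share it. The sketch below assumes a hypothetical payload shape (`merchant`, `verificationToken`, `outcome`, `detail`), a hypothetical `shareDetail` consent flag, and an arbitrary 200-character limit; the real payload is not shown in this report.

```typescript
// Hypothetical outcome labels and field names, for illustration only.
type Outcome = "accepted" | "denied" | "inconclusive";

interface OutcomeInput {
  merchant: string;
  verificationToken: string;
  outcome: Outcome;
  agentResponse: string; // free text from the agent; may contain anything
}

interface OutcomeReport {
  merchant: string;
  verificationToken: string;
  outcome: Outcome;
  detail?: string;
}

// Builds the payload to transmit. Free text is left out entirely unless
// the caller passes shareDetail = true, and is length-capped when included.
function buildOutcomeReport(
  input: OutcomeInput,
  shareDetail: boolean,
  maxDetailLength = 200,
): OutcomeReport {
  const report: OutcomeReport = {
    merchant: input.merchant,
    verificationToken: input.verificationToken,
    outcome: input.outcome,
  };
  if (shareDetail) {
    report.detail = input.agentResponse.slice(0, maxDetailLength);
  }
  return report;
}
```

Truncation only limits length. The first characters of an agent response can still carry sensitive content, which is why this sketch gates the field on consent rather than relying on the cap alone.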

VirusTotal

64/64 vendors flagged this skill as clean.
