Web UI Test

Security checks across malware telemetry and agentic risk

Overview

This skill is mostly a browser UI testing helper, but it also guides agents through sensitive login, token, admin, deployment, and database-change workflows without enough guardrails.

Install only if you intend to let an agent drive visible browser sessions and assist with SSO/login testing. Treat this as a Review item: do not let it handle real passwords, 2FA codes, generated tokens, deployment commands, or database updates unless you have explicitly approved the exact action, verified the target environment, and have a rollback path.
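The three conditions above (explicit approval of the exact action, a verified target environment, and a rollback path) can be expressed as a simple gate. The following is a minimal sketch, not part of the skill: `ActionRequest`, `SENSITIVE_KINDS`, and `is_allowed` are hypothetical names introduced only for illustration, and the action categories are taken from the list in the paragraph above.

```python
from dataclasses import dataclass

# Hypothetical labels for the sensitive actions named in the review text.
SENSITIVE_KINDS = {"password", "2fa_code", "token", "deploy", "db_update"}


@dataclass(frozen=True)
class ActionRequest:
    kind: str                       # what the agent wants to do
    target_env: str                 # e.g. "staging" or "production"
    approved_by_user: bool = False  # user approved this exact action
    env_verified: bool = False      # target environment was confirmed
    rollback_plan: str = ""         # how to undo the change


def is_allowed(req: ActionRequest) -> bool:
    """Allow non-sensitive actions; gate sensitive ones on all three checks."""
    if req.kind not in SENSITIVE_KINDS:
        return True
    has_rollback = bool(req.rollback_plan.strip())
    return req.approved_by_user and req.env_verified and has_rollback


if __name__ == "__main__":
    print(is_allowed(ActionRequest("screenshot", "staging")))
    print(is_allowed(ActionRequest("db_update", "production", True, True, "")))
```

In this sketch a database update with approval and a verified environment is still refused when no rollback plan is given, which mirrors the "unless you have ... a rollback path" condition.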

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (8)

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The skill is presented as a UI testing/verification tool, but it explicitly expands into operational account-administration tasks such as PAT issuance, OAuth approval, 2FA-related flows, and destructive UI actions. That broadens the authority of the skill into sensitive security workflows where an agent could steer users through privileged actions or normalize risky browser-mediated secret handling.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: This verification skill includes remediation steps that can change live infrastructure state, including deployment actions and direct database updates. Embedding write-capable operational fixes inside a testing guide increases the chance that an agent or user executes high-impact changes during what should be a read-only verification workflow, causing integrity or availability issues.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The file directs use of ansible deploy commands, SSH, Docker exec, and SQL UPDATE statements even though the skill is presented as a UI-testing tool. This broadens the skill's authority from observation to privileged infrastructure manipulation, which is dangerous if invoked automatically or by a low-context operator.
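One way to keep a testing workflow from drifting into infrastructure changes is to screen proposed commands for the categories named in this finding before anything runs. The following is a rough sketch under stated assumptions: `WRITE_PATTERNS` and `needs_review` are hypothetical names, and the patterns cover only the four examples from the finding (ansible, SSH, Docker exec, SQL UPDATE). A deny-list like this is easy to bypass, so an allowlist of known read-only commands would be the stronger design.

```python
import re

# Patterns for the privileged command families named in the finding.
# Illustrative only; this list is not exhaustive.
WRITE_PATTERNS = [
    re.compile(r"\bansible(-playbook)?\b", re.IGNORECASE),
    re.compile(r"\bssh\b", re.IGNORECASE),
    re.compile(r"\bdocker\s+exec\b", re.IGNORECASE),
    re.compile(r"\bUPDATE\b.+\bSET\b", re.IGNORECASE | re.DOTALL),
]


def needs_review(command: str) -> bool:
    """Return True if the command matches any privileged pattern."""
    return any(pattern.search(command) for pattern in WRITE_PATTERNS)


if __name__ == "__main__":
    for cmd in ("ansible-playbook deploy.yml", "open the login page"):
        print(cmd, "->", needs_review(cmd))
```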

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The document's stated purpose is SSO verification, but it later instructs users to apply fixes and alter database configuration. This mismatch can mislead agents and operators into performing privileged changes under the guise of testing, weakening safety boundaries and increasing the risk of unintended modifications.

Vague Triggers

Medium

Confidence: 85%
Finding: The trigger phrases are broad, generic terms like 'UI check', 'browser test', and 'screen verify', which can cause the skill to activate in situations the user did not intend. Because the skill can open browsers, inspect pages, and guide sensitive auth flows, accidental invocation increases the chance of over-privileged actions or unintended browsing automation.
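The difference between a broad keyword trigger and an explicit invocation can be illustrated with a short comparison. This is a sketch only: the three trigger phrases come from the finding, while the `/web-ui-test` command prefix and both function names are hypothetical and are not taken from the skill itself.

```python
# Trigger phrases quoted in the finding.
BROAD_TRIGGERS = ("ui check", "browser test", "screen verify")

# Hypothetical explicit command; the skill's real invocation may differ.
EXPLICIT_COMMAND = "/web-ui-test"


def broad_match(message: str) -> bool:
    """Substring matching: fires on any message containing a trigger phrase."""
    lowered = message.lower()
    return any(trigger in lowered for trigger in BROAD_TRIGGERS)


def explicit_match(message: str) -> bool:
    """Fires only when the first word is exactly the explicit command."""
    first_word = message.strip().lower().split(maxsplit=1)[:1]
    return first_word == [EXPLICIT_COMMAND]


if __name__ == "__main__":
    casual = "Can you do a quick UI check of my resume layout?"
    print(broad_match(casual), explicit_match(casual))
```

With the casual message above, the substring matcher activates while the explicit matcher does not, which is the accidental-invocation risk the finding describes.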

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill directs users into authentication-sensitive browser flows such as sign-in, PAT creation, and permission granting without clear safeguards about credential exposure, secret leakage, session privacy, or what the agent can observe. In a browser-automation context, this can lead users to enter credentials or reveal generated tokens while believing the interaction is routine testing.

Missing User Warnings

Medium

Confidence: 88%
Finding: The instructions tell the operator to enter admin credentials into the login flow and even include a concrete admin username, but provide no warning or secret-handling guidance. In an agent context, this can encourage plaintext secret use, logging of credentials, reuse of high-privilege accounts, or accidental exposure through transcripts and snapshots.
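As one illustration of secret-handling guidance, transcript or log lines can be passed through a redaction step before they are stored. This is a minimal sketch, assuming secrets appear in a `key=value` or `key: value` form. The field names and the `redact` helper are hypothetical, and pattern-based redaction will miss secrets in other shapes, so not capturing them in the first place remains the safer option.

```python
import re

# Matches "password=...", "token: ...", "otp=...", "2fa_code: ..." and similar.
# The field names are illustrative assumptions, not taken from the skill.
SECRET_FIELD = re.compile(
    r"\b(password|token|otp|2fa[_ ]?code)\b(\s*[:=]\s*)(\S+)",
    re.IGNORECASE,
)


def redact(line: str) -> str:
    """Replace the value of any recognized secret field with a placeholder."""
    return SECRET_FIELD.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", line)


if __name__ == "__main__":
    print(redact("password=hunter2 user=alice"))
    print(redact("2fa_code: 123456"))
```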

Missing User Warnings

Medium

Confidence: 95%
Finding: The remediation section includes commands that can deploy changes and directly update the Authentik database, yet it lacks warnings about destructive effects, backup requirements, environment targeting, or rollback planning. In practice, an operator or agent could apply the wrong change to the wrong host and corrupt authentication configuration or disrupt access.

VirusTotal

60/60 vendors flagged this skill as clean.
