playwright-pro

Security checks across malware telemetry and agentic risk

Overview

This Playwright testing skill is mostly coherent, but it gives agents broad test-running, repository-changing, third-party upload, and data-mutation abilities without enough user-facing guardrails.

Install only if you are comfortable with an agent running Playwright commands, editing test and CI files, and using optional third-party services. Before using it, configure it for isolated test/staging environments, use disposable test accounts and least-privilege tokens, keep auth state such as `.auth/user.json` out of source control and CI artifacts, and require manual review before Slack/TestRail/BrowserStack uploads or any delete/update/account/role-changing tests.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (27)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 94%
Finding: The skill advertises capabilities that involve environment variables and outbound network access, but it does not declare permissions or clearly scope when those capabilities are used. In an agent setting, this can lead to unexpected secret access or remote data transmission, especially because the skill references API-key-based integrations with TestRail and BrowserStack.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The template is presented as an empty-state/search test, but it includes a real destructive deletion flow that clicks a delete button and confirms the action. In a reusable agent skill, this broadens behavior beyond the stated purpose and can cause unintended data loss if generated tests are run against shared, staging, or production-like environments with real data.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill explicitly offers syncing with TestRail and BrowserStack, which implies sending test artifacts, metadata, or project information to external services, yet it provides no warning at the point of use about third-party transmission. In a testing context, those artifacts may contain URLs, test names, traces, screenshots, or other sensitive internal information, making silent exfiltration to SaaS platforms a meaningful privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 92%
Finding: The documentation demonstrates saving authenticated browser session state to `.auth/user.json` and reusing it, but does not warn that this file contains sensitive session material and must be protected from source control, artifact upload, and sharing. In a testing toolkit skill, users are likely to copy this pattern directly into real CI/CD environments, which increases the chance of credential or session-token exposure if the file is mishandled.
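
One way to check the source-control half of this is to scan `.gitignore` for the auth-state directory before committing. The sketch below is a rough, conservative check rather than a full gitignore parser; the helper name `ignoresAuthState` is hypothetical, and it assumes the state lives under `.auth/` as in the documented example.

```typescript
// Hypothetical helper: true when a .gitignore body excludes the whole auth-state directory.
// Conservative by design: a rule for a single file such as ".auth/user.json" does not count.
function ignoresAuthState(gitignore: string, dir: string = ".auth"): boolean {
  return gitignore
    .split(/\r?\n/)
    .map((line) => line.trim())
    // Skip blank lines and comments.
    .filter((line) => line !== "" && !line.startsWith("#"))
    .some((line) => {
      // Normalize "/.auth", ".auth/", ".auth/*", and ".auth/**" down to ".auth".
      const normalized = line.replace(/^\//, "").replace(/\/\*{0,2}$/, "");
      return normalized === dir;
    });
}
```

A check like this only covers source control; CI artifact uploads need their own exclusion for the same directory.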

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill directs users to run Playwright tests on BrowserStack, a third-party cloud service, but does not warn that test traffic, credentials in requests, logs, screenshots, videos, and other artifacts may be transmitted off-premises. In testing contexts, those artifacts often contain sensitive application data or internal environments, so omission of a clear disclosure can lead to unintentional data exposure and compliance issues.

Vague Triggers

Medium

Confidence: 92%
Finding: The skill’s trigger phrases include broad, everyday language such as "what needs testing," which can match many routine conversations beyond the intended narrow use case. This can cause unintended invocation of the coverage skill, leading the agent to perform irrelevant repository scanning or recommend actions the user did not explicitly request.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill instructs the agent to execute multiple shell commands against a user-supplied test file path without any safety gating, confirmation, or disclosure. In an agentic environment, this can lead to unintended command execution, resource-intensive runs, or misuse if placeholders like `<file>` are populated from untrusted input or if commands are executed automatically in sensitive repositories or CI-like environments.
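
A minimal sketch of the kind of gating this finding asks for: validate the path before it is substituted for `<file>`, and pass it as a separate argument rather than interpolating it into a shell string. The function names and the accepted file-name pattern are assumptions for illustration, not part of the skill.

```typescript
// Hypothetical validator: accepts only relative spec/test files made of plain path characters.
function isSafeTestPath(candidate: string): boolean {
  const allowed = /^[\w.\/-]+\.(spec|test)\.(ts|js)$/;
  return (
    allowed.test(candidate) &&
    !candidate.startsWith("/") && // no absolute paths
    !candidate.startsWith("-") && // cannot be mistaken for a CLI flag
    !candidate.split("/").includes("..") // no directory traversal
  );
}

// Builds an argument vector for `npx`, to be run without a shell.
function buildPlaywrightArgs(file: string): string[] {
  if (!isSafeTestPath(file)) {
    throw new Error(`Refusing to run unexpected test path: ${file}`);
  }
  return ["playwright", "test", file];
}
```

The returned array is meant to be handed to a process spawner that takes arguments as a list, so shell metacharacters in the path are never interpreted.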

Missing User Warnings

Low

Confidence: 89%
Finding: The skill explicitly instructs the agent to run `npx playwright test <generated-file>` without requiring confirmation or warning that this may drive a real browser against whatever `baseURL`, authenticated state, or connected environment the local project is configured to use. In a testing skill, execution is expected, but absent guardrails this can still trigger unintended actions against local dev, staging, or even production-like targets if the configuration is unsafe.
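
A guardrail of the kind described here could be as small as a host allowlist checked before any test run. The sketch below assumes the project exposes its `baseURL` as a string; the allowlist entries (including `staging.example.test`) and the function name are hypothetical.

```typescript
// Hypothetical allowlist: only hosts that are known to be disposable test targets.
const ALLOWED_HOSTS = new Set<string>(["localhost", "127.0.0.1", "staging.example.test"]);

// Throws unless the configured baseURL points at an allowlisted host.
function assertSafeBaseURL(baseURL: string): URL {
  const url = new URL(baseURL);
  if (!ALLOWED_HOSTS.has(url.hostname)) {
    throw new Error(`baseURL host "${url.hostname}" is not an approved test target`);
  }
  return url;
}
```

Failing closed on an unknown host keeps a misconfigured project from silently driving a browser against a production-like environment.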

Vague Triggers

Medium

Confidence: 90%
Finding: The trigger phrases are broad enough that the skill may activate for generic requests like "testing setup" or "add test infrastructure" that do not specifically ask for Playwright. That can cause the agent to take Playwright-specific actions and modify a repository unexpectedly, increasing the chance of mis-scoped automation and unintended changes.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill instructs the agent to install packages, generate config, create tests, modify CI, update `.gitignore`, and edit `package.json`, but does not require an explicit warning or confirmation before changing repository files. In an agent setting, this is risky because a loosely invoked skill could make broad, persistent project changes without the user clearly understanding the scope.

Vague Triggers

Medium

Confidence: 94%
Finding: The trigger phrases include broad terms such as "migrate tests," "convert tests," and "replace selenium," which can match common user requests beyond a narrowly scoped migration task. This can cause unintended invocation of the skill, leading the agent to perform repository scanning, setup, and test-conversion actions in contexts where the user did not explicitly request this workflow.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill automatically posts test summaries and failed test details to Slack when a webhook is present, without requiring explicit user confirmation or warning about external data transmission. Test failures often include stack traces, file paths, environment details, or sensitive application data, so this can unintentionally exfiltrate internal information to an external service.
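
To illustrate what explicit confirmation plus redaction might look like, the sketch below builds a Slack message only when the caller has confirmed, and scrubs obvious sensitive fragments first. The function names and the redaction patterns are assumptions; real failure output would likely need broader rules.

```typescript
// Hypothetical redactor: masks bearer tokens, URLs, and email addresses in a failure message.
function redact(message: string): string {
  return message
    .replace(/Bearer\s+[A-Za-z0-9._~+\/=-]+/g, "Bearer [REDACTED]")
    .replace(/https?:\/\/[^\s"')]+/g, "[URL]")
    .replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[EMAIL]");
}

// Returns a webhook body of the form { text }, or null when the user has not confirmed the upload.
function buildSlackPayload(
  passed: number,
  failed: number,
  failures: string[],
  confirmed: boolean
): { text: string } | null {
  if (!confirmed) return null; // nothing leaves the machine without explicit opt-in
  const lines = [`Playwright: ${passed} passed, ${failed} failed`, ...failures.map(redact)];
  return { text: lines.join("\n") };
}
```

Returning `null` instead of posting makes the presence of a webhook alone insufficient to trigger transmission.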

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill instructs the agent to push Playwright results and failure messages to TestRail, which is an external service, without requiring explicit user confirmation or warning that execution metadata and possibly sensitive error details will be transmitted. Test failures often contain stack traces, URLs, usernames, internal identifiers, or fragments of application data, so silent upload can cause unintended data disclosure.

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill allows local Playwright test content to overwrite remote TestRail case data via `testrail_update_case` without warning the user that authoritative test-management records will be modified. This can cause accidental corruption of shared QA artifacts, propagate incorrect steps or expectations, and affect team workflows if updates are made automatically from stale or misparsed local tests.

Missing User Warnings

Medium

Confidence: 90%
Finding: The template instructs users to supply real API tokens, refresh tokens, API keys, and session cookies and then sends them to configured endpoints, but it provides no warning about using test-only credentials, secure secret storage, or the risk of hitting production systems. In a reusable testing skill, this can lead to accidental exposure of sensitive credentials in source control, CI logs, shared fixtures, or live-environment requests.

Missing User Warnings

Medium

Confidence: 90%
Finding: The template includes update and delete GraphQL mutation examples that can modify or remove real data, but it does not warn users to run them only against test environments or disposable fixtures. In a production-grade testing skill, this omission can lead to accidental destructive execution against staging or production systems, especially because placeholders like `existingEntityId` and `deletableEntityId` make the operations look ready-to-use.

Missing User Warnings

Medium

Confidence: 91%
Finding: This template explicitly includes POST, PUT, PATCH, and DELETE operations against a user-supplied API base URL and entity endpoint, but provides no warning that running it can modify or permanently remove real data. In a testing toolkit, users may paste production endpoints or real IDs, so the omission materially increases the chance of accidental destructive actions.
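
One possible mitigation is to make mutating requests opt-in, so that read-only checks run by default and write operations need a deliberate switch. In the sketch below, the environment variable name `ALLOW_MUTATING_TESTS` and the function name are hypothetical.

```typescript
type HttpMethod = "GET" | "POST" | "PUT" | "PATCH" | "DELETE";

// Methods that can create, change, or remove data.
const MUTATING_METHODS = new Set<HttpMethod>(["POST", "PUT", "PATCH", "DELETE"]);

// Read-only methods always run; mutating ones run only when the operator has opted in.
function mayRunMethod(method: HttpMethod, env: Record<string, string | undefined>): boolean {
  if (!MUTATING_METHODS.has(method)) return true;
  return env["ALLOW_MUTATING_TESTS"] === "1";
}
```

A test file could consult this with the process environment and skip the write-path cases when it returns false.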

Missing User Warnings

Medium

Confidence: 88%
Finding: The template requires a bearer token and sends authenticated requests, yet it gives no guidance on secure credential handling, least-privilege tokens, or avoiding production secrets. This is risky because users may hardcode sensitive tokens in test files, logs, CI variables, or run the template against privileged environments.

Missing User Warnings

Medium

Confidence: 88%
Finding: The template explicitly requires real login credentials and references inspection of a live session cookie, but provides no warning or guidance about using test-only accounts, secrets management, or avoiding production data. In a reusable agent skill, this can normalize unsafe handling of authentication secrets and session artifacts, increasing the chance users paste real credentials into prompts, source files, logs, or CI systems.

Missing User Warnings

Medium

Confidence: 90%
Finding: The template explicitly instructs users to provide highly sensitive MFA materials, including a TOTP secret, backup code, username, and password, but gives no warning about secure handling, storage, redaction, or test-environment isolation. In a production-grade testing skill, this increases the chance that real secrets are copied into source files, logs, CI variables, screenshots, or shared test artifacts, which could enable account takeover if exposed.
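
As a sketch of safer handling, secrets such as a TOTP seed can be read from the environment at run time and never written into the test file. The helper name `requireSecret` and the variable name `TOTP_SECRET` are hypothetical; the error message deliberately names the variable but never echoes a value.

```typescript
// Hypothetical helper: fetches a secret from an environment map and fails loudly when it is absent.
function requireSecret(name: string, env: Record<string, string | undefined>): string {
  const value = env[name];
  if (!value) {
    // Only the variable name appears in the message, never secret material.
    throw new Error(`Missing ${name}: supply it via the environment, not in source files`);
  }
  return value;
}
```

Called as `requireSecret("TOTP_SECRET", process.env)`, this keeps the seed out of repository history, though it still has to be kept out of logs, traces, and screenshots separately.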

Missing User Warnings

Medium

Confidence: 91%
Finding: The template includes an actual destructive delete flow against application resources with no warning, isolation guidance, or recommendation to use test-only fixtures. In a testing skill, users may paste this into environments connected to shared or non-ephemeral data, causing unintended data loss during routine test execution or CI runs.

Missing User Warnings

Medium

Confidence: 88%
Finding: The role-elevation example changes a user's permissions to admin without any warning that it mutates authorization state and could persist beyond the test. In real environments this can accidentally grant elevated privileges to real accounts, skew later tests, or create a temporary privilege-escalation condition if cleanup is missing or the test aborts mid-run.
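
The cleanup concern can be sketched with a `try`/`finally` wrapper that restores the original role even when the test body throws. The `setRole` callback stands in for whatever API call the application uses and is an assumption, as is the helper name.

```typescript
// Hypothetical wrapper: elevates a role for the duration of `body`, then always restores it.
async function withTemporaryRole<T>(
  setRole: (role: string) => Promise<void>,
  originalRole: string,
  elevatedRole: string,
  body: () => Promise<T>
): Promise<T> {
  await setRole(elevatedRole);
  try {
    return await body();
  } finally {
    // Runs on success and on failure, so the elevation does not outlive the test body.
    await setRole(originalRole);
  }
}
```

This does not help if the process is killed mid-run, so a disposable test account remains the safer target for role changes.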

Missing User Warnings

Medium

Confidence: 95%
Finding: The template explicitly instructs use of a valid account username and password in Playwright tests, but provides no warning or guidance about using dedicated test accounts, secrets management, or avoiding production credentials. In a testing toolkit, this can lead users to hardcode or paste real credentials into test files, CI logs, screenshots, or repository history, increasing the risk of credential leakage and unauthorized access.

Missing User Warnings

Medium

Confidence: 91%
Finding: This template automates live registration flows and explicitly uses unique emails, which can create persistent accounts and potentially trigger outbound verification emails without warning the operator. In a production-grade browser automation skill, this is risky because users may run the template against real environments, causing unintended account creation, mailbox spam, polluted test data, and possible rate-limit or compliance issues.
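
One way to limit the mailbox side effects is to generate the unique addresses under the reserved `.test` top-level domain, which is never delegated to a real mail host. The helper name, the `pw-` prefix, and the `example.test` domain are assumptions for illustration.

```typescript
// Hypothetical generator: unique per run and per index, always under a reserved, non-routable domain.
function disposableEmail(runId: string, index: number): string {
  return `pw-${runId}-${index}@example.test`;
}
```

The application may still attempt to send verification mail to these addresses, and the accounts themselves persist unless the test environment is reset.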

Missing User Warnings

Medium

Confidence: 91%
Finding: The markdown describes harmless empty-state behavior, but the included test silently performs a delete-and-confirm sequence without warning the user that the template may mutate application data. This mismatch increases the chance that operators will trust and run the template in contexts where destructive actions are unsafe, leading to accidental deletion of records.

VirusTotal

65 of 65 vendors reported this skill as clean.