Playwright Scraper CN

Security checks across malware telemetry and agentic risk

Overview

This is mostly a disclosed Playwright scraping skill, but it also includes under-documented Xianyu/Goofish login automation with a hardcoded phone number and SMS-code triggering behavior.

Review carefully before installing. Use only on sites you are authorized to access and scrape, remove or ignore the Xianyu scripts unless you specifically control that account workflow, delete the hardcoded phone number, avoid SMS-login automation, and treat saved screenshots or HTML as potentially sensitive data. Prefer an isolated environment and regenerate dependencies from trusted HTTPS sources.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (19)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 80%
Finding: The skill documentation describes scripts that use network access and environment variables, but the skill does not declare corresponding permissions. This weakens transparency and reviewability, making it easier for users or platforms to underestimate what the skill can access and do at runtime.
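A reviewer can approximate this check by comparing what a script appears to use against what the skill declares. The sketch below is a rough heuristic, not the scanner's actual logic: the module list, the capability names `network` and `env`, and the `declared` set are all assumptions made for illustration.

```python
import ast

# Hypothetical, non-exhaustive list of top-level modules that imply network access.
NETWORK_MODULES = {"socket", "urllib", "http", "requests", "httpx", "aiohttp", "playwright"}


def detect_capabilities(source: str) -> set[str]:
    """Return the coarse capabilities ('network', 'env') a script appears to use."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots = {alias.name.split(".")[0] for alias in node.names}
        elif isinstance(node, ast.ImportFrom):
            roots = {(node.module or "").split(".")[0]}
        else:
            roots = set()
        if roots & NETWORK_MODULES:
            found.add("network")
        # Matches os.environ[...], os.environ.get(...), and os.getenv(...).
        if isinstance(node, ast.Attribute) and node.attr in {"environ", "getenv"}:
            if isinstance(node.value, ast.Name) and node.value.id == "os":
                found.add("env")
    return found


def undeclared(source: str, declared: set[str]) -> set[str]:
    """Capabilities the script appears to use that are missing from the declaration."""
    return detect_capabilities(source) - declared
```

A static pass like this misses dynamic imports and aliased modules, so it can only raise questions for a human reviewer, not clear a skill.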

Tp4

High

Category: MCP Tool Poisoning
Confidence: 92%
Finding: A description-behavior mismatch is security-relevant because it can conceal materially different capabilities from users and reviewers. If the underlying skill includes login automation, phone-number entry, SMS-code initiation, or targeted support for third-party account workflows, that goes well beyond generic scraping and increases the risk of credential abuse, unauthorized access attempts, and deceptive deployment.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: This script contains deliberate anti-detection behavior: it disables automation indicators, spoofs browser traits, and explicitly waits through Cloudflare challenge pages. In a scraping skill, that materially increases the capability to bypass site defenses and access content under conditions the site is attempting to restrict, which is risky even if framed as reliability or compatibility work.
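When reviewing the scripts by hand, a plain string search can surface candidate lines quickly. The following is a minimal sketch under stated assumptions: the marker strings are common examples of automation-hiding settings and are not confirmed to appear in this skill, and the function name is hypothetical.

```python
# Illustrative, non-exhaustive markers (assumptions, not taken from the skill's code).
EVASION_MARKERS = (
    "--disable-blink-features=AutomationControlled",
    "navigator.webdriver",
    "--disable-web-security",
)


def find_evasion_markers(source: str) -> list[str]:
    """Return the markers from EVASION_MARKERS that occur in the given source text."""
    return [marker for marker in EVASION_MARKERS if marker in source]
```

A hit only shows where to look; whether the usage is evasive or benign still requires reading the surrounding code.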

Description-Behavior Mismatch

High

Confidence: 97%
Finding: This script automates a real account login flow, initiates SMS-based authentication, and includes anti-detection logic to mask browser automation. That materially exceeds a normal scraping helper and creates a capability for unauthorized account access attempts or misuse of a real user's identity, especially because the skill is presented as a scraping tool rather than an authenticated account-management tool.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: A hardcoded personal phone number is embedded directly in the script and then used for account authentication. This exposes sensitive personal data, ties the automation to a specific real account, and enables accidental or intentional triggering of login flows against that person's account without any runtime consent or secret handling protections.
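One way to remove the embedded value is to read the identifier from the environment and refuse to proceed without an explicit opt-in. This is a sketch only: the variable names `SCRAPER_LOGIN_CONSENT` and `SCRAPER_LOGIN_PHONE` and the function name are hypothetical, and nothing here implies the login flow itself should be kept.

```python
import os
from collections.abc import Mapping
from typing import Optional


def load_login_identifier(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the operator-supplied identifier, or raise if it was not explicitly provided."""
    env = os.environ if environ is None else environ
    # Hypothetical opt-in flag: the operator must enable the login flow on purpose.
    if env.get("SCRAPER_LOGIN_CONSENT") != "yes":
        raise PermissionError("login flow is not explicitly enabled")
    # Hypothetical variable name: the identifier never lives in the source file.
    value = env.get("SCRAPER_LOGIN_PHONE", "").strip()
    if not value:
        raise ValueError("SCRAPER_LOGIN_PHONE is not set")
    return value
```

Failing closed means a fresh install cannot trigger an authentication attempt against anyone's account by accident.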

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: This script is explicitly tailored to evade bot detection on Xianyu/Goofish rather than act as a neutral scraper. It disables browser security features, spoofs automation fingerprints, and mimics human behavior, which materially increases its usefulness for unauthorized scraping, account abuse, or bypassing platform protections.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The manual-login workflow facilitates using a real user session inside an anti-detection crawler, enabling scraping or actions behind authentication barriers. In this context, that makes the tool more dangerous because it can harvest authenticated content or operate with a human-provided account while avoiding detection.

Missing User Warnings

Medium

Confidence: 89%
Finding: The README explicitly promotes stealth scraping and anti-bot evasion techniques such as hiding automation and simulating human behavior, but it does not warn users about terms-of-service violations, legal restrictions, or account/IP blocking risks. In a scraping skill, this omission materially increases the chance that users will deploy the capability in unsafe or noncompliant ways, especially because the feature is presented as recommended and successful against protected sites.

Missing User Warnings

Medium

Confidence: 84%
Finding: The README instructs users to save screenshots and full HTML from scraped pages without warning that captured content may include personal data, session-specific data, or other sensitive information. Because this is documentation for a scraping tool, encouraging indiscriminate artifact capture can lead to unintended collection, retention, and exposure of regulated or confidential data.

Missing User Warnings

Medium

Confidence: 90%
Finding: The README explicitly promotes stealth scraping and anti-bot evasion techniques, including bypass-oriented behavior, but does not warn users about legal, policy, privacy, or site-integrity risks. In a scraping skill, normalizing anti-bot evasion without safeguards increases the chance of misuse against protected services and encourages operation outside expected authorization boundaries.

Missing User Warnings

Medium

Confidence: 86%
Finding: The documentation encourages saving screenshots and full HTML locally but does not warn that these artifacts can contain sensitive data such as session state rendered into pages, personal information, or proprietary content. This can lead to unintended local data retention, leakage through logs or shared directories, and mishandling of scraped content.

Missing User Warnings

Medium

Confidence: 84%
Finding: The documentation advertises screenshot and HTML saving but does not warn that scraped content may be written to local disk. This can lead to accidental storage of sensitive page data, session state, personal information, or proprietary content in files that persist beyond the scraping session.
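If artifacts must be written at all, persistence can be limited by keeping them in a private, short-lived directory. The sketch below assumes a POSIX-style permission model; the helper names `private_artifact_dir` and `save_artifact` are hypothetical and not part of the skill.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def private_artifact_dir(prefix: str = "scrape-artifacts-"):
    """Yield a directory accessible only to the current user; delete it on exit."""
    path = Path(tempfile.mkdtemp(prefix=prefix))  # mkdtemp restricts access to the creating user
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


def save_artifact(directory: Path, name: str, data: bytes) -> Path:
    """Write data to a new owner-only file, refusing to overwrite an existing one."""
    target = directory / name
    # O_EXCL fails if the file exists; mode 0o600 limits access to the owner on POSIX.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return target
```

Used as `with private_artifact_dir() as d: save_artifact(d, "page.html", html_bytes)`, the saved files disappear when the block exits, so nothing outlives the scraping session unless the caller copies it out deliberately.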

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill promotes stealth techniques specifically intended to evade anti-bot controls without warning users about compliance, account, or access risks. In context, this makes the skill more dangerous because it normalizes bypass behavior that can violate site terms, trigger account bans, or facilitate unauthorized scraping at scale.

Missing User Warnings

Medium

Confidence: 91%
Finding: The documentation explicitly promotes stealth scraping of anti-bot protected sites, saving screenshots, and saving full HTML without any warning about privacy, terms-of-service, or sensitive-data capture. In a scraping skill, this materially increases the likelihood of misuse because users are guided toward bypassing defenses and collecting potentially sensitive page content and visual data by default.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script contains a real phone number and logs it after filling the input, which unnecessarily propagates sensitive personal data into console output and potentially into centralized logs. In a shared agent or CI environment, this increases privacy risk and can expose identifiers to operators or other systems beyond the intended user.
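Where a log line genuinely needs to confirm that a field was filled, the value can be masked first. This is a minimal sketch; `mask_identifier` is a hypothetical helper, and the sample value is a placeholder rather than a real number.

```python
def mask_identifier(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if visible <= 0 or len(value) <= visible:
        # Too short to reveal anything safely; mask the whole value.
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Example with a placeholder value: logs "filled phone field: *******1234".
print("filled phone field:", mask_identifier("00000001234"))
```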

Missing User Warnings

Medium

Confidence: 90%
Finding: The script extracts page text and prints it to logs after beginning a login flow, which may capture account-related prompts, personally identifiable information, or other sensitive site content. In the context of an automated skill, indiscriminate logging of page contents can leak user/session data to logs, transcripts, or downstream systems.
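If page text has to be logged for debugging, a redaction pass can reduce what leaks. The sketch below is deliberately crude and is an assumption about what counts as sensitive: it only replaces long digit runs, and the seven-digit threshold and the `redact_digits` name are illustrative choices.

```python
import re

# Assumption: runs of seven or more digits are treated as potential identifiers.
_LONG_DIGIT_RUN = re.compile(r"\d{7,}")


def redact_digits(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace long digit runs in free text before it is written to a log."""
    return _LONG_DIGIT_RUN.sub(placeholder, text)
```

Pattern-based redaction will not catch names, addresses, or message content, so the safer default remains not logging page text from authenticated flows.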

Vague Triggers

Medium

Confidence: 83%
Finding: The trigger list is broad enough to activate on generic scraping-related requests without any embedded limitation to authorized, consent-based, or policy-compliant use. In a skill explicitly designed for web scraping, this increases the chance the agent is invoked for questionable collection tasks against third-party sites, expanding misuse risk.
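One possible guard against over-broad activation is an explicit allowlist checked before any navigation. This is a sketch under assumptions: `is_authorized` is a hypothetical helper, and the allowlist would have to be supplied by the operator for sites they are actually permitted to scrape.

```python
from urllib.parse import urlsplit


def is_authorized(url: str, allowed_hosts: set[str]) -> bool:
    """Return True only for http(s) URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in {"http", "https"}:
        return False
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

Exact-match hostnames are used on purpose: suffix or substring matching would let `example.com.attacker.test` pass a check meant for `example.com`.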

Natural-Language Policy Violations

High

Confidence: 96%
Finding: The description and feature list explicitly advertise bypassing anti-bot protections, which is a strong signal the skill is intended to circumvent access controls rather than operate within normal site policies. In the context of an automation/scraping skill, this meaningfully elevates abuse potential, including unauthorized harvesting, ToS evasion, and automated access against protected targets.

Known Vulnerable Dependency: playwright==1.40.0, 1 advisory: CVE-2025-59288 (Playwright downloads and installs browsers without verifying the authenticity of)

High

Category: Supply Chain
Confidence: 96%
Finding: playwright==1.40.0
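A pin like this can be caught before install by comparing exact pins against a list of versions with known advisories. The following is a minimal sketch: `flagged_pins` is a hypothetical helper, the advisory mapping is supplied by the caller, and only simple `name==version` lines are handled.

```python
def flagged_pins(requirements: str, advisories: dict[str, set[str]]) -> list[str]:
    """Return 'name==version' pins whose version appears in the caller-supplied advisories."""
    hits = []
    for raw in requirements.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue  # unpinned or otherwise unsupported specifier
        name, version = (part.strip() for part in line.split("==", 1))
        if version in advisories.get(name.lower(), set()):
            hits.append(f"{name}=={version}")
    return hits


# The mapping below mirrors the pin reported in this finding.
print(flagged_pins("playwright==1.40.0\n", {"playwright": {"1.40.0"}}))
```

For real projects a maintained advisory database is a better source than a hand-written mapping; this sketch only shows the shape of the comparison.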

VirusTotal

66/66 vendors reported this skill as clean.