Web Scraping Tool Selection Strategy

Security checks across malware telemetry and agentic risk

Overview

This is not malware, but it guides agents to use logged-in browser sessions and anti-scraping bypasses without enough consent, privacy, or scope safeguards.

Install only if you are comfortable with a skill that may steer your agent to use active browser logins for scraping. Use dedicated or test browser profiles, confirm you are authorized to collect the target data, avoid private or personal data unless explicitly permitted, and review platform rules before running the suggested commands.
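As an illustration of the precautions above, here is a minimal sketch of a pre-flight check an operator might run before any scraping command. It is not part of the scanned skill. The allowlist `AUTHORIZED_HOSTS` and the helpers `is_in_scope` and `make_throwaway_profile` are hypothetical names introduced only for this example, and the sketch assumes authorization has already been confirmed out of band for every host placed in the allowlist.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlsplit

# Hypothetical allowlist: hosts the operator has confirmed they may collect from.
AUTHORIZED_HOSTS = {"example.com"}


def is_in_scope(url: str, authorized_hosts=AUTHORIZED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is explicitly allowlisted."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme in ("http", "https") and host in authorized_hosts


def make_throwaway_profile() -> Path:
    """Create an empty directory to serve as a dedicated browser profile.

    A fresh directory holds no cookies or saved logins, so a browser started
    against it cannot reuse the operator's everyday sessions.
    """
    return Path(tempfile.mkdtemp(prefix="scrape-profile-"))


if __name__ == "__main__":
    target = "https://example.com/public/listing"
    if is_in_scope(target):
        profile_dir = make_throwaway_profile()
        print(f"In scope; use dedicated profile at {profile_dir}")
    else:
        print("Out of scope; do not run the suggested commands against this URL")
```

The resulting directory could then be handed to whatever browser tool is in use as its profile location (for example, Playwright's `launch_persistent_context` takes a `user_data_dir` argument). An exact-match allowlist is deliberately strict: subdomains and other schemes are rejected unless listed explicitly.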

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (12)

Missing User Warnings

Severity: Medium
Confidence: 94%
Finding
The skill explicitly recommends reusing an authenticated browser session to access private or post-login data, but provides no warning about consent, privacy, account boundaries, or terms-of-service implications. In the context of a scraping-oriented skill, this materially increases the chance of unauthorized collection of personal or restricted data from an existing logged-in session.

Missing User Warnings

Severity: Medium
Confidence: 96%
Finding
The skill states that the browser tool can reuse logged-in Chrome state and bypass anti-scraping protections to obtain dynamic or login-visible data, without any legal, policy, or account-risk warning. That combination encourages evasion of platform safeguards and can facilitate unauthorized access or account enforcement actions.

Vague Triggers

Severity: Medium
Confidence: 91%
Finding
The eval set strongly steers the agent toward assisting with scraping of named commercial platforms while explicitly rewarding bypass-oriented behavior such as reusing logged-in browser state to get around anti-bot controls. Because the prompts are broad, high-confidence triggers with no exclusions for authorization, terms-of-service, privacy, or lawful-use boundaries, they can normalize and operationalize abusive collection workflows rather than merely discussing tooling at a high level.

Missing User Warnings

Severity: Medium
Confidence: 92%
Finding
The guidance explicitly promotes accessing logged-in pages and obtaining private account-visible data, but provides no privacy, authorization, or account-consent guardrails. In a web-scraping skill, this materially increases the chance that operators will collect personal or restricted data from active sessions without proper legal basis or user awareness.

Missing User Warnings

Severity: Medium
Confidence: 91%
Finding
The document explicitly recommends scraping user data from social platforms but provides no privacy, consent, or account-safety warning. In a scraping-focused skill, that omission increases the chance that users will collect personal or account-scoped data without understanding legal, privacy, or platform-policy implications.

Missing User Warnings

Severity: High
Confidence: 96%
Finding
The recommendation to use Playwright for login-state data and Chrome tab reuse implicitly encourages operating on authenticated browser sessions without any warning about exposing cookies, tokens, personal data, or privileged account actions. This is especially risky because reusing an existing logged-in browser context can access sensitive information far beyond public web content.

Missing User Warnings

Severity: Medium
Confidence: 89%
Finding
The text states that Playwright can be used to bypass anti-scraping defenses but omits any warning about compliance, platform restrictions, or operational risk. In this context, the omission can normalize evasive behavior and facilitate collection against site protections without user awareness of legal or policy consequences.

Ssd 3

Severity: High
Confidence: 97%
Finding
This instruction directly endorses using a reused authenticated browser context to scrape private or login-only data. In a scraping skill, that is especially dangerous because it operationalizes access to restricted data through an existing session, increasing the risk of privacy violations, unauthorized collection, and misuse of a user's credentials or account state.

Ssd 3

Severity: High
Confidence: 97%
Finding
The guidance encourages leveraging an already authenticated Chrome session to access login-visible data and frames this as a way to bypass anti-scraping defenses. That is a strong indicator of unsafe design because it lowers barriers to extracting restricted content using stored session trust rather than explicit re-authenticated authorization.

Ssd 3

Severity: Medium
Confidence: 93%
Finding
The pitfall section reinforces that users should prioritize reusing logged-in browser tabs to obtain post-login data, which normalizes unsafe authenticated scraping behavior. Even though phrased as operational advice, it still promotes collection of restricted data without guardrails for consent, necessity, or policy compliance.

Ssd 3

Severity: Medium
Confidence: 90%
Finding
Listing a pre-logged-in target-site account as a prerequisite bakes session reuse into the workflow and implicitly prepares the operator to access authenticated content. This is risky because it operationalizes scraping from live accounts without clarifying ownership, authorization, or privacy boundaries.

Ssd 3

Severity: High
Confidence: 97%
Finding
The document encourages reusing authenticated browser state to bypass login flows and anti-bot controls across multiple platforms, enabling access to private account-visible data through an already logged-in session. In the context of a scraping/tool-selection skill, this is more dangerous because it operationalizes session piggybacking at scale and normalizes collection from authenticated contexts that may exceed user authorization or violate platform protections.

VirusTotal

65/65 vendors flagged this skill as clean.
