Browser Collector

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed browser-scraping toolkit, but it combines stealth, CAPTCHA solving, public proxy rotation, batch collection, and persistent session cookies in ways that need careful review before use.

Install only if you intentionally need an authorized browser automation or scraping framework. Use an isolated environment and throwaway accounts, and avoid personal or work browser sessions. Unless explicitly authorized, disable cookie persistence and export, public proxies, stealth, and CAPTCHA solving. Set strict domain allowlists and conservative worker limits.
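A domain allowlist and a worker cap can be enforced before any URL reaches the collector. The following is a minimal sketch under stated assumptions: `ALLOWED_DOMAINS`, `MAX_WORKERS`, `is_allowed`, and `clamp_workers` are hypothetical names for illustration and are not part of the toolkit.

```python
from urllib.parse import urlsplit

# Hypothetical settings; names and values are illustrative only.
ALLOWED_DOMAINS = {"example.com", "docs.example.com"}
MAX_WORKERS = 2


def is_allowed(url: str, allowed: set = ALLOWED_DOMAINS) -> bool:
    """Return True only for http(s) URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    # Exact match: subdomains must be listed explicitly.
    return host in allowed


def clamp_workers(requested: int, limit: int = MAX_WORKERS) -> int:
    """Cap the requested worker count at the limit, never going below 1."""
    return max(1, min(requested, limit))
```

Exact host matching is the conservative choice here: a look-alike host such as `example.com.evil.net` is rejected because it is not literally on the list.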

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (24)

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The plan explicitly introduces a Playwright stealth module for fingerprint hiding, webdriver concealment, and progressively more aggressive anti-detection behavior. In the context of a generic browser collection skill, these capabilities materially enable covert scraping and evasion of site defenses rather than normal automation, making misuse substantially easier.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The CAPTCHA and slider-solving modules are explicit bypass capabilities, including OCR solving and generation of human-like slider tracks. For an unknown-purpose collection skill, this is dangerous because it directly facilitates defeating access controls and anti-bot challenges that are intended to restrict automation.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The proxy pool is designed to fetch, verify, score, and rotate public proxies, which can conceal origin and evade rate limits or blocking. Combined with browser automation, this creates a clear evasion capability that increases anonymity and complicates attribution during abusive collection activities.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The implementation contradicts its documented safety property: URLs are grouped by domain, but tasks are still submitted independently to the thread pool with no per-domain lock or queue. This can cause concurrent access to the same site and shared session state, leading to race conditions, cookie corruption, inconsistent authentication state, and unintended request bursts against one target.
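The per-domain serialization that this finding reports as missing could look roughly like the following. This is a minimal sketch, not the toolkit's code: it assumes a caller-supplied `fetch(url)` callable, and `collect_serialized` and `_lock_for` are hypothetical names.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit

# One lock per domain; the registry lock guards creation of new entries.
_domain_locks = defaultdict(threading.Lock)
_registry_lock = threading.Lock()


def _lock_for(domain: str) -> threading.Lock:
    """Return the lock for a domain, creating it on first use."""
    with _registry_lock:
        return _domain_locks[domain]


def collect_serialized(urls, fetch, max_workers: int = 4):
    """Fetch URLs on a thread pool while allowing one in-flight request per domain."""

    def task(url):
        domain = urlsplit(url).hostname or ""
        # Tasks for the same domain wait here; other domains proceed in parallel.
        with _lock_for(domain):
            return fetch(url)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        return list(pool.map(task, urls))
```

A lock is the smallest change that restores the documented property, but workers blocked on a busy domain still occupy pool threads. A per-domain queue with a single consumer would avoid that at the cost of more bookkeeping.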

Intent-Code Divergence

High

Confidence: 98%
Finding: SessionManager advertises session reuse, but _collect_one closes the BrowserContext after each request while leaving it referenced in the session map. Subsequent reuse can operate on a closed context, breaking isolation assumptions and potentially causing crashes, failed collections, or corrupted cookie/session handling across tasks.
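One way to avoid handing out a closed context is to remove the entry from the session map at the same moment the context is closed. The sketch below illustrates only that bookkeeping. `SessionRegistry` and `context_factory` are hypothetical stand-ins and do not reproduce the toolkit's SessionManager; the only assumption about a context object is that it has a `close()` method.

```python
class SessionRegistry:
    """Hypothetical registry: a context is either open and mapped, or absent."""

    def __init__(self, context_factory):
        # context_factory is any zero-argument callable returning an object
        # with a close() method (for example, a wrapper around new_context).
        self._factory = context_factory
        self._sessions = {}

    def get(self, domain: str):
        """Return the open context for a domain, creating one if needed."""
        ctx = self._sessions.get(domain)
        if ctx is None:
            ctx = self._factory()
            self._sessions[domain] = ctx
        return ctx

    def close(self, domain: str) -> None:
        """Close the context and drop it from the map in a single step."""
        ctx = self._sessions.pop(domain, None)
        if ctx is not None:
            ctx.close()
```

With this shape, a task that wants to close after each request calls `registry.close(domain)` rather than closing the context directly, so the next `get` creates a fresh context instead of returning a dead one.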

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The collector is explicitly built to inject anti-detection browser evasion logic, masking automation indicators and altering browser fingerprints. In an unknown-purpose skill, stealth capability materially increases abuse potential for unauthorized scraping, evasion of site defenses, and policy circumvention.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The process-worker mode is designed for scale, isolation, and stronger anti-bot resistance, which substantially increases the operational capability of the scraper. In the absence of a clearly justified benign context, this makes the skill more dangerous by enabling persistent, resilient, higher-volume collection against external services.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The module explicitly combines anti-detection, CAPTCHA solving, proxy use, and cookie persistence, which materially increases its ability to bypass site anti-bot controls and sustain unauthorized scraping sessions. In a general-purpose collector, these features move beyond neutral automation into defense evasion, making abuse easier against third-party services.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The class-level API presents itself as a generic collector while embedding stealth behavior and persistent session reuse, which can conceal automation and preserve authenticated state across runs. This combination meaningfully lowers the barrier to large-scale scraping or access using retained sessions, especially when used against sites that deploy anti-automation protections.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: This code goes beyond passive OCR and actively automates slider-captcha interaction via Playwright, including replaying synthesized human-like movement tracks. In a general-purpose agent skill, that enables CAPTCHA bypass behavior against third-party services, which is commonly disallowed and can be abused for automated account creation, scraping, or anti-bot evasion.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The default encryption key is derived deterministically from the MAC-address-like machine identifier returned by uuid.getnode(), using truncated SHA-256 output, so it is not a random secret. Anyone with local system access or knowledge of the derivation method can reproduce the key and decrypt persisted cookie values, which is especially dangerous because this module stores authentication and session cookies on disk.
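The difference between a reproducible key and a random one can be shown in a few lines. This is an illustrative sketch only: the finding does not give the exact derivation, so the string encoding and the 16-byte truncation in `node_derived_key` are assumptions, and both function names are hypothetical.

```python
import hashlib
import secrets
import uuid


def node_derived_key() -> bytes:
    """Assumed reconstruction: hash the host's node ID and truncate.

    Any process on the same machine can recompute this value, so it
    offers no secrecy against a local attacker.
    """
    node = uuid.getnode()  # 48-bit hardware address as an integer
    return hashlib.sha256(str(node).encode("ascii")).digest()[:16]


def random_key(n_bytes: int = 32) -> bytes:
    """Generate key material from the OS CSPRNG; it must be stored securely."""
    return secrets.token_bytes(n_bytes)
```

A random key only helps if it is kept somewhere better protected than the cookie database itself, for example an OS keyring or a file readable only by the owning user.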

Intent-Code Divergence

Medium

Confidence: 97%
Finding: stealth_context silently changes the security posture by setting ignore_https_errors=True, which disables TLS certificate validation for all browsing done through that context. In a browser automation/stealth module this is especially risky: it can expose sessions, credentials, and scraped data to man-in-the-middle interception, and the docstring misleads users by describing the helper as equivalent to new_context plus apply_stealth.
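A safer default is to keep certificate validation on and require an explicit, visible opt-in to turn it off. The helper below is a hedged sketch: `context_options` and `allow_insecure_tls` are hypothetical names, and the only real API it relies on is Playwright's `ignore_https_errors` keyword for `new_context`.

```python
def context_options(allow_insecure_tls: bool = False, **extra) -> dict:
    """Build keyword arguments for a browser context with TLS checks on by default.

    The explicit flag always wins, so an ignore_https_errors value smuggled
    in through **extra cannot silently disable certificate validation.
    """
    opts = dict(extra)
    opts["ignore_https_errors"] = bool(allow_insecure_tls)
    return opts
```

Usage would be along the lines of `browser.new_context(**context_options(locale="en-US"))`, with `allow_insecure_tls=True` passed only in a test environment that uses self-signed certificates.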

Missing User Warnings

Medium

Confidence: 88%
Finding: The cookie database and export features handle persistent session material across domains without clear warnings or guardrails for sensitive authentication data. Storing and exporting cookies can expose active sessions, enable account takeover if mishandled, and expand the blast radius when combined with multi-domain reuse.

Missing User Warnings

Medium

Confidence: 90%
Finding: The documentation promotes fetching and validating proxies from untrusted public sources but does not warn that such proxies can intercept, log, or manipulate traffic. This omission increases the likelihood of unsafe deployment and can expose credentials, cookies, and collected data to third parties.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill explicitly advertises CAPTCHA solving capabilities but provides no warning about legal, ethical, or site-policy constraints. In a browser automation and scraping tool, this materially increases misuse risk because it facilitates bypassing access controls intended to distinguish humans from automated collection.

Missing User Warnings

Medium

Confidence: 94%
Finding: The documentation promotes anti-detection and proxy-pool features without any compliance, privacy, or abuse warning. In the context of a scraping framework, these features are directly associated with evading rate limits, masking origin, and reducing accountability, which makes unauthorized large-scale collection or policy evasion more dangerous.

Missing User Warnings

Medium

Confidence: 85%
Finding: The adapter exposes a POST-based `create_issue` operation that changes remote state without any built-in confirmation, dry-run safeguard, or clear warning at the call boundary. In an agent/skill context, this raises the risk of unintended external actions, spam, or unauthorized workflow modification if higher-level code invokes it with a valid token.
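A dry-run default at the call boundary is one way to add the safeguard this finding asks for. The sketch below is an assumption-laden illustration, not the adapter's actual signature: `IssueRequest`, the `post` callable, and the `confirm` flag are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IssueRequest:
    """Hypothetical request shape used only for this illustration."""
    title: str
    body: str


def create_issue(request: IssueRequest, post, *, confirm: bool = False) -> dict:
    """Return a preview unless the caller explicitly confirms the write.

    post is a caller-supplied callable that performs the real HTTP POST
    and returns the response as a dict.
    """
    payload = {"title": request.title, "body": request.body}
    if not confirm:
        # Nothing leaves the process; the caller can inspect the payload.
        return {"dry_run": True, "payload": payload}
    return post(payload)
```

Making `confirm` keyword-only means a state-changing call is always visible at the call site, which helps when higher-level agent code is reviewed.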

Natural-Language Policy Violations

Medium

Confidence: 78%
Finding: Hard-coding a Chinese-first language fingerprint changes the browser identity without user consent and can misrepresent the operator's regional/browser profile. While not as severe as code-execution flaws, it is a privacy, transparency, and impersonation concern that also supports stealthier automation behavior.

Natural-Language Policy Violations

Medium

Confidence: 81%
Finding: Forcing zh-CN locale in every browser context overrides user choice and creates a fixed regional fingerprint that may be misleading or unnecessary for the task. In combination with stealth features, this strengthens evasion/impersonation behavior rather than serving a clearly legitimate need.

Natural-Language Policy Violations

Medium

Confidence: 80%
Finding: The worker stealth script forces Chinese-preferred languages for all automated sessions, which manipulates browser fingerprinting without opt-in. This is primarily a privacy and transparency issue, but in this context it also contributes to anti-detection behavior that makes the skill more suspicious.

Missing User Warnings

Medium

Confidence: 88%
Finding: Cookies are persisted to disk automatically under a predictable user directory, which can retain authentication tokens and other sensitive session data without strong user awareness or protection. If the local system or application storage is exposed, those cookies could be reused to impersonate accounts or access protected content.

Missing User Warnings

Medium

Confidence: 92%
Finding: The export command prints decrypted cookies, including session-bearing values, directly to stdout. In real environments stdout is often captured by terminals, shell history, CI logs, process supervisors, or remote session tooling, so this can leak authentication material and enable session hijacking.
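An export path can mask cookie values by default so that nothing session-bearing reaches stdout or logs. This is a minimal sketch under stated assumptions: cookies are taken to be dicts with a `value` key, and `redact_cookies` is a hypothetical helper, not the toolkit's export command.

```python
def redact_cookies(cookies):
    """Return copies of the cookie dicts with each value replaced by a length marker."""
    redacted = []
    for cookie in cookies:
        value = str(cookie.get("value", ""))
        # Keep only the length so the output stays useful for debugging.
        masked = f"<redacted, {len(value)} chars>"
        redacted.append({**cookie, "value": masked})
    return redacted
```

Printing `redact_cookies(cookies)` keeps names, domains, and expiry visible for troubleshooting, while an export of real values would need a separate, explicit option that writes to a file with restricted permissions.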

Ssd 4

High

Confidence: 98%
Finding: The architecture intentionally combines CAPTCHA solving, stealth, proxy rotation, and persistent cookie reuse into a cumulative workflow for resilient automated collection. Even if each component could have isolated legitimate uses, their integration here significantly increases offensive capability by enabling sustained evasion of anti-abuse controls.

Ssd 4

High

Confidence: 95%
Finding: The stealth module normalizes escalation from basic to aggressive fingerprint concealment, framing covert behavior as a configurable feature. This increases danger because it encourages progressively deeper evasion rather than transparent, policy-compliant automation.

VirusTotal

2/66 vendors flagged this skill as malicious, and 64/66 flagged it as clean.
