Amazon Scraper

Security checks across malware telemetry and agentic risk

Overview

This skill appears to do what it advertises, but it needs Review because it combines broad arbitrary web scraping with stealth browser automation and built-in proxy credentials.

Install only if you intentionally need authorized scraping and are comfortable with stealth automation and third-party proxy use. Replace or remove the bundled proxies, avoid private/internal or sensitive URLs, keep outputs in a dedicated mounted directory, and review target-site terms and data-handling obligations before use.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (8)

Description-Behavior Mismatch

Medium
93% confidence
Finding
The handler explicitly supports a 'generic' mode for arbitrary URLs rather than staying constrained to Amazon-specific pages. In an agent skill, this expands the tool from a narrowly scoped Amazon scraper into a general-purpose web extraction capability, increasing the risk of policy bypass, unauthorized data collection, or misuse against unrelated sites.
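
One way to keep the tool scoped to Amazon pages is a strict host check before any fetch. The following is a minimal sketch, not the skill's actual code: the function name `is_amazon_url` and the `ALLOWED_SUFFIXES` list are hypothetical, and the set of Amazon domains the skill really supports is not stated in this report.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; adjust to the domains you are
# actually authorized to scrape.
ALLOWED_SUFFIXES = ("amazon.com", "amazon.co.uk", "amazon.de")


def is_amazon_url(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is an allowlisted Amazon domain."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Match the domain itself or a true subdomain; a look-alike such as
    # "amazon.com.evil.example" must not pass.
    return any(host == s or host.endswith("." + s) for s in ALLOWED_SUFFIXES)
```

A handler could call this check first and refuse to fall through to a generic mode when it returns False.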

Context-Inappropriate Capability

Medium
90% confidence
Finding
The script loads proxy endpoints and credentials from a local config file and environment variables, then uses them to rotate outbound identity. In context, this is more dangerous because the skill description explicitly advertises bypassing headless detection, so proxy support materially enables stealthier scraping and obscures traffic origin.

Context-Inappropriate Capability

Medium
87% confidence
Finding
The output path is taken from CLI input and may be absolute, allowing writes to arbitrary filesystem locations accessible to the container or runtime user. In an agent environment, that broader file-write primitive can overwrite unexpected files, clobber shared workspace artifacts, or facilitate data staging outside the intended extraction flow.
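
A common mitigation is to resolve the requested path against one dedicated output directory and reject anything that escapes it. The sketch below assumes Python 3.9+ and a hypothetical base directory of `/data`; the helper name `resolve_output_path` is illustrative and does not come from the skill.

```python
from pathlib import Path


def resolve_output_path(user_path: str, base_dir: str = "/data") -> Path:
    """Resolve user_path inside base_dir, rejecting absolute paths and '..' escapes."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the containment
    # check below also catches inputs such as "/etc/passwd".
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"output path escapes {base}: {user_path!r}")
    return candidate
```

With this in place, `out/result.json` resolves under the base directory, while `/etc/passwd` or `../secrets.txt` raises `ValueError` before anything is written.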

Vague Triggers

High
96% confidence
Finding
The trigger list is extremely broad and includes common terms like 'fetch data', '网页内容' ("web page content"), '网站数据' ("website data"), and generic market-research phrases, making accidental invocation likely. In context, this is more dangerous because the skill performs remote scraping with stealth/proxy support, so an unrelated user request could be silently routed into networked scraping behavior.

Vague Triggers

High
97% confidence
Finding
The generic-mode rule activates on any non-Amazon URL or any mention of scraping arbitrary web content, which is too permissive for a capability that fetches external pages via built-in proxies. This broad routing increases the chance of unintended access to sensitive, internal, or user-supplied URLs and expands the skill from a niche Amazon scraper into a general stealth web fetcher.
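
To illustrate the kind of guard that would reduce exposure to internal targets, here is a minimal sketch that rejects URLs whose host is plainly local or non-public. The function name `is_obviously_internal` and the suffix list are assumptions for illustration. The check deliberately does no DNS resolution, so a public-looking name that resolves to a private address would still need a resolution-time check.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical list of name suffixes treated as internal.
INTERNAL_SUFFIXES = (".localhost", ".local", ".internal")


def is_obviously_internal(url: str) -> bool:
    """Return True when the URL's host is missing, local, or a non-global IP literal."""
    host = (urlsplit(url).hostname or "").lower()
    if not host or host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; DNS names are not resolved in this sketch.
        return False
    # Private, loopback, link-local and similar ranges are not global.
    return not addr.is_global
```

This would flag inputs such as `http://169.254.169.254/` or `https://10.0.0.5/admin` while letting ordinary public hosts through.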

Missing User Warnings

Medium
90% confidence
Finding
The examples encourage writing scraped output to a host-mounted directory (`~/scrapes:/data`) without a clear warning that local files will be created and may contain sensitive data. This can surprise users, leave residual scraped content on disk, and increase exposure if the output includes personal, proprietary, or regulated information.

Missing User Warnings

Medium
95% confidence
Finding
The skill advertises built-in proxies and shows credential-bearing proxy environment variables, but it does not warn about trust, logging, jurisdiction, or secret-handling risks. This is dangerous because all scraped traffic may transit third-party infrastructure, and proxy credentials passed via env vars may be exposed through logs, shell history, or orchestration metadata.
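
One partial safeguard against the log-exposure risk is to strip credentials from a proxy URL before it is printed or logged. The helper below is a hypothetical sketch (`redact_proxy_url` is not part of the skill), and it does not address shell history or orchestration metadata, which need separate secret handling.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_proxy_url(proxy_url: str) -> str:
    """Return proxy_url with any username and password replaced by '***'."""
    parts = urlsplit(proxy_url)
    if parts.username is None and parts.password is None:
        return proxy_url  # nothing to hide
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore brackets around an IPv6 literal
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"***@{host}", parts.path, parts.query, parts.fragment))
```

Logging `redact_proxy_url(url)` instead of the raw value keeps the endpoint visible for debugging without echoing the secret.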

Missing User Warnings

Medium
95% confidence
Finding
The script accepts an arbitrary URL from the command line and performs a live browser fetch, then extracts and outputs page text without any built-in confirmation, allowlist, or user-facing disclosure. In an agent setting, this can cause unintended contact with third-party sites and collection of remote content, which is especially sensitive here because the skill is explicitly designed for scraping and anti-detection evasion.

VirusTotal

64/64 vendors flagged this skill as clean.
