Scraper Builder

Security checks across malware telemetry and agentic risk

Overview

This is an instruction-only skill for building scrapers with Bright Data. External API use is expected, but users should understand that it sends target URLs and fetched content to Bright Data.

Install this only if you intend to use Bright Data for scraping. Use scoped API credentials, review generated code before running it, keep validation runs small, set page and concurrency limits, and avoid private, authenticated, internal, or secret-bearing URLs unless you are comfortable sending them through Bright Data.
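The URL and run-size precautions above can be sketched as a small pre-flight check that runs before any request is handed to a scraping API. This is a minimal illustration, not part of the skill: the names `is_safe_to_send`, `select_validation_batch`, and `INTERNAL_SUFFIXES` are hypothetical, the suffix list is an assumption to adapt to your network, and hostnames are not resolved through DNS, so a public-looking name that points at an internal address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed suffixes that commonly mark internal-only hosts; adjust for your network.
INTERNAL_SUFFIXES = (".local", ".internal", ".lan", ".corp")


def is_safe_to_send(url: str) -> bool:
    """Return True only for public-looking http(s) URLs without embedded credentials."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
        has_credentials = bool(parts.username or parts.password)
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https"):
        return False
    if has_credentials or not host:
        return False
    host = host.lower()
    if host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a hostname, not an IP literal; DNS is not checked here
    # Rejects private, loopback, link-local, and other non-global ranges.
    return ip.is_global


def select_validation_batch(urls, max_pages=5):
    """Drop unsafe URLs, then keep at most max_pages of them for a small test run."""
    safe = [u for u in urls if is_safe_to_send(u)]
    return safe[:max_pages]
```

Running a candidate list through `select_validation_batch` before the first scrape keeps the validation run small and filters out the most obvious internal or credential-bearing targets. It does not detect secrets in paths or query strings.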

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (12)

Vague Triggers

Medium

Confidence: 89%
Finding: The skill description is extremely broad and is designed to trigger on many ordinary requests about obtaining website data. That increases the chance the skill activates in situations where the user did not explicitly request scraping, causing unintended collection or transmission of target URLs and scraped content to Bright Data services.

Missing User Warnings

Medium

Confidence: 90%
Finding: The guide instructs users to export a Bright Data API key and then send target URLs and retrieved content through a third-party scraping service, but it does not explicitly warn that both credentials and scraped traffic are being transmitted off-box. In a skill designed to help build scrapers, this omission is security-relevant because users may unknowingly expose sensitive targets, proprietary URLs, or regulated content to an external provider.

Missing User Warnings

Medium

Confidence: 93%
Finding: The guide instructs sending user-supplied target URLs and fetched site content to Bright Data's external API, but it does not disclose that site access details and page data are being transmitted to a third party. In an agent skill context, this is a real privacy and data-governance issue because users may assume analysis happens locally while the skill exfiltrates URLs, page contents, and possibly sensitive query parameters to an external service.
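One way to reduce the exposure of sensitive query parameters described in this finding is to redact them before a URL leaves the machine. The sketch below is illustrative only: `redact_query` and the `SENSITIVE_KEYS` set are hypothetical names, and the key list is an assumption that will not cover every naming scheme. Removing a parameter can also change what the target page returns, so the dropped names are reported for review.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names that often carry secrets; extend for your own targets.
SENSITIVE_KEYS = {
    "token", "access_token", "api_key", "apikey", "key",
    "password", "secret", "session", "sig", "signature",
}


def redact_query(url: str) -> tuple[str, list[str]]:
    """Return the URL without sensitive query parameters, plus the names removed."""
    parts = urlsplit(url)
    kept, dropped = [], []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name.lower() in SENSITIVE_KEYS:
            dropped.append(name)
        else:
            kept.append((name, value))
    cleaned = urlunsplit(parts._replace(query=urlencode(kept)))
    return cleaned, dropped


cleaned, dropped = redact_query(
    "https://example.com/search?q=shoes&api_key=abc123&page=2"
)
print(cleaned)  # https://example.com/search?q=shoes&page=2
print(dropped)  # ['api_key']
```

Logging the `dropped` list alongside each request gives the user a record of what was withheld from the third-party service.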

External Transmission

Medium

Category: Data Exfiltration
Content: ZONE = os.environ["BRIGHTDATA_UNLOCKER_ZONE"] # Fetch raw HTML response = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={"zone": ZONE, "url": TARGET_URL, "format": "raw"}
Confidence: 95%
Finding: requests.post( "https://

External Transmission

Medium

Category: Data Exfiltration
Content: html = response.text # Also fetch as markdown for a quick readable overview md_response = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={"zone": ZONE, "url": TARGET_URL, "format": "raw", "data_format": "markdown"}
Confidence: 95%
Finding: requests.post( "https://

External Transmission

Medium

Category: Data Exfiltration
Content: ```python # Example: hitting a discovered API endpoint api_data = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={
Confidence: 94%
Finding: requests.post( "https://

External Transmission

Medium

Category: Data Exfiltration
Content: ZONE = os.environ["BRIGHTDATA_UNLOCKER_ZONE"] # Fetch raw HTML response = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={"zone": ZONE, "url": TARGET_URL, "format": "raw"}
Confidence: 95%
Finding: requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json=

External Transmission

Medium

Category: Data Exfiltration
Content: html = response.text # Also fetch as markdown for a quick readable overview md_response = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={"zone": ZONE, "url": TARGET_URL, "format": "raw", "data_format": "markdown"}
Confidence: 95%
Finding: requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json=

External Transmission

Medium

Category: Data Exfiltration
Content: ```python # Example: hitting a discovered API endpoint api_data = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={
Confidence: 94%
Finding: requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json=

External Transmission

Medium

Category: Data Exfiltration
Content: # Fetch raw HTML response = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={"zone": ZONE, "url": TARGET_URL, "format": "raw"} )
Confidence: 90%
Finding: https://api.brightdata.com/

External Transmission

Medium

Category: Data Exfiltration
Content: # Also fetch as markdown for a quick readable overview md_response = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={"zone": ZONE, "url": TARGET_URL, "format": "raw", "data_format": "markdown"} )
Confidence: 90%
Finding: https://api.brightdata.com/

External Transmission

Medium

Category: Data Exfiltration
Content: ```python # Example: hitting a discovered API endpoint api_data = requests.post( "https://api.brightdata.com/request", headers={"Authorization": f"Bearer {API_KEY}"}, json={ "zone": ZONE,
Confidence: 91%
Finding: https://api.brightdata.com/

VirusTotal

66/66 vendors flagged this skill as clean.
