Web Scraper Skill

Security checks across malware telemetry and agentic risk

Overview

This is a coherent web-scraping helper, but it gives agents broad automatic scraping guidance and does not clearly warn users about third-party data transfer and authorization risks.

Install only if you are comfortable with an agent using Firecrawl or Apify for web scraping. Use it only on sites you are authorized to scrape, avoid private, internal, or authenticated pages unless approved, keep API tokens out of the example code (for instance in environment variables), and require explicit confirmation before any crawl, batch scrape, or social/e-commerce actor run.
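
One way to act on the token-handling and confirmation recommendations is sketched below in Python, the language of the skill's own examples. The environment variable name FIRECRAWL_API_KEY, the action labels, and both helper functions are hypothetical and are not part of the skill. Treat this as a minimal illustration under those assumptions, not as the skill's behavior.

```python
import os
from typing import Optional

# Hypothetical labels for the actions the report says need confirmation.
CONFIRM_REQUIRED = {"crawl", "batch_scrape", "actor_run"}


def load_token(env_var: str = "FIRECRAWL_API_KEY") -> str:
    """Read the API token from the environment instead of hard-coding it."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} in the environment; do not hard-code it.")
    return token


def is_approved(action: str, user_reply: Optional[str]) -> bool:
    """Allow low-impact actions; require an explicit 'yes' for the rest."""
    if action not in CONFIRM_REQUIRED:
        return True
    return (user_reply or "").strip().lower() in {"y", "yes"}
```

A caller would check is_approved("crawl", reply) before issuing the request and stop if it returns False, so a missing or ambiguous reply never starts a crawl.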

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (11)

Vague Triggers

High

Confidence: 96%
Finding: The skill’s trigger criteria are excessively broad and include mandates like 'always use this skill' for loosely defined web-data tasks. This can cause the agent to invoke third-party scraping workflows in situations where they are unnecessary or inappropriate, increasing the chance of collecting sensitive data, violating site policies, or taking external actions without adequate user confirmation.

Missing User Warnings

Medium

Confidence: 90%
Finding: The guidance provides detailed scraping and crawling instructions but omits clear user-facing warnings about privacy, terms-of-service, authorization, robots.txt expectations, and operational impact on target sites. In context, this makes unsafe use more likely because the skill encourages broad scraping behavior without meaningful safeguards or consent checkpoints.
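
As an illustration of the robots.txt expectation mentioned in this finding, the sketch below uses Python's standard urllib.robotparser to test a URL against robots.txt rules before scraping. The helper name allowed_by_robots is hypothetical, and the rules are passed in as text so the example runs without network access. In practice the caller would first fetch the target site's robots.txt.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: everything under /private/ is off limits to all agents.
EXAMPLE_RULES = "User-agent: *\nDisallow: /private/\n"
```

This covers only the robots.txt part of the finding. Authorization, terms of service, and privacy still need a human decision.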

Missing User Warnings

Medium

Confidence: 91%
Finding: The examples send URLs, queries, and authentication tokens to third-party services without any accompanying disclosure that user-supplied inputs and credentials will leave the local environment. In a skill explicitly designed for web scraping, this behavior is expected, but omitting a user-facing warning can still cause unintentional external data exposure, especially if users provide sensitive targets, queries, or internal URLs.
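
A pre-flight check along the following lines could catch obviously internal targets before they are sent to a third-party API. This is a sketch under stated assumptions: the function name looks_internal and the suffix list are hypothetical, and the check does not resolve DNS, so a public-looking hostname that points at a private address would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical suffixes commonly used for internal hosts; adjust per environment.
INTERNAL_SUFFIXES = (".local", ".internal", ".lan")


def looks_internal(url: str) -> bool:
    """Heuristically flag URLs that point at local or private-network hosts."""
    host = (urlsplit(url).hostname or "").lower()
    if not host or host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; treated as public because DNS is not checked here.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

An agent could refuse, or ask for explicit approval, whenever looks_internal(url) is True instead of forwarding the URL to the scraping service.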

Missing User Warnings

Low

Confidence: 84%
Finding: The file-save example persists scraped results to a local file without warning that collected data may contain sensitive or copyrighted material and will remain on disk after execution. This increases the risk of unintended retention, later disclosure, or mishandling of scraped data.
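
One possible mitigation for the retention risk is to keep scraped results in a temporary location that is removed once processing finishes. The sketch below is not taken from the skill. The context manager name scratch_results and the file name results.json are hypothetical.

```python
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_results(records):
    """Write records to an owner-only temp file that is deleted on exit."""
    with tempfile.TemporaryDirectory(prefix="scrape-") as tmp:
        path = os.path.join(tmp, "results.json")
        # 0o600 asks for owner-only read/write on POSIX systems.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(records, fh)
        yield path
    # The directory and the file inside it are gone at this point.
```

Inside the with block the path can be read or handed to another step. Anything that must outlive the run should be copied out deliberately, so that retention is a conscious choice.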

External Transmission

Medium

Category: Data Exfiltration
Content: API_KEY = "fc-YOUR_API_KEY" response = requests.post( "https://api.firecrawl.dev/v2/scrape", headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}, json={
Confidence: 78%
Finding: requests.post( "https://

External Transmission

Medium

Category: Data Exfiltration
Content: API_KEY = "fc-YOUR_API_KEY" response = requests.post( "https://api.firecrawl.dev/v2/scrape", headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}, json={
Confidence: 78%
Finding: requests.post( "https://api.firecrawl.dev/v2/scrape", headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}, json=

External Transmission

Medium

Category: Data Exfiltration
Content: HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"} # Start crawl resp = requests.post(f"{BASE}/crawl", headers=HEADERS, json={ "url": "https://docs.example.com", "limit": 50, "scrapeOptions": {"formats": ["markdown"]}
Confidence: 87%
Finding: requests.post(f"{BASE}/crawl", headers=HEADERS, json=

External Transmission

Medium

Category: Data Exfiltration
Content: ### Search + Scrape ```python resp = requests.post(f"{BASE}/search", headers=HEADERS, json={ "query": "best Python web scraping libraries 2025", "limit": 5, "scrapeOptions": {"formats": ["markdown"]}
Confidence: 86%
Finding: requests.post(f"{BASE}/search", headers=HEADERS, json=

External Transmission

Medium

Category: Data Exfiltration
Content: BASE = "https://api.apify.com/v2" # Start run resp = requests.post( f"{BASE}/acts/{ACTOR}/runs", params={"token": TOKEN}, json={"queries": "web scraping", "maxPagesPerQuery": 1}
Confidence: 89%
Finding: requests.post( f"{BASE}/acts/{ACTOR}/runs", params={"token": TOKEN}, json=

External Transmission

Medium

Category: Data Exfiltration
Content: ### Synchronous Run (short jobs <5 min) ```python resp = requests.post( f"{BASE}/acts/{ACTOR}/run-sync-get-dataset-items", params={"token": TOKEN}, json={"queries": "Jaipur restaurants", "maxPagesPerQuery": 1}
Confidence: 88%
Finding: requests.post( f"{BASE}/acts/{ACTOR}/run-sync-get-dataset-items", params={"token": TOKEN}, json=

External Transmission

Medium

Category: Data Exfiltration
Content: ```javascript const API_KEY = "fc-YOUR_API_KEY"; const BASE = "https://api.firecrawl.dev/v2"; // Scrape const res = await fetch(`${BASE}/scrape`, {
Confidence: 81%
Finding: https://api.firecrawl.dev/

VirusTotal

All 66 vendors rated this skill as clean.
