Web Scraper Skill

Security checks across malware telemetry and agentic risk

Overview

This is a coherent web-scraping helper, but it gives agents broad automatic scraping guidance and does not clearly warn users about third-party data transfer and authorization risks.

Install only if you are comfortable with an agent using Firecrawl or Apify for web scraping. Use it only on sites you are authorized to scrape, avoid private/internal/authenticated pages unless approved, store API tokens outside the examples, and require explicit confirmation before any crawl, batch scrape, or social/e-commerce actor run.
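
The two operational recommendations above (keep tokens out of the examples, and confirm before any crawl or actor run) could be sketched roughly as follows. This is a minimal illustration, not part of the skill: the environment variable name `FIRECRAWL_API_KEY` and the helpers `load_api_key` and `confirm_action` are hypothetical names chosen for this example.

```python
import os


def load_api_key(env_var="FIRECRAWL_API_KEY"):
    """Read the token from the environment instead of hardcoding it.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running any scrape.")
    return key


def confirm_action(action, target, ask=input):
    """Return True only if the user explicitly types 'yes'."""
    prompt = (
        f"About to run {action} on {target} via a third-party API. "
        "Type 'yes' to continue: "
    )
    return ask(prompt).strip().lower() == "yes"
```

An agent wrapper would call `confirm_action("crawl", url)` before issuing the request and stop if it returns `False`. The `ask` parameter exists only so the prompt can be replaced in non-interactive settings.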

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (11)

Vague Triggers

High
Confidence
96% confidence
Finding
The skill’s trigger criteria are excessively broad and include mandates like 'always use this skill' for loosely defined web-data tasks. This can cause the agent to invoke third-party scraping workflows in situations where they are unnecessary or inappropriate, increasing the chance of collecting sensitive data, violating site policies, or taking external actions without adequate user confirmation.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The guidance provides detailed scraping and crawling instructions but omits clear user-facing warnings about privacy, terms-of-service, authorization, robots.txt expectations, and operational impact on target sites. In context, this makes unsafe use more likely because the skill encourages broad scraping behavior without meaningful safeguards or consent checkpoints.
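
One of the missing safeguards named in this finding, a robots.txt check, can be approximated with the Python standard library. The sketch below assumes the robots.txt body has already been fetched as text; `allowed_by_robots` and `SAMPLE_ROBOTS` are illustrative names, and the sample rules are invented for the example.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Invented sample rules, used only to demonstrate the check.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""
```

A robots.txt check covers only one of the concerns listed here. It says nothing about terms of service, authorization, or request volume, which still need their own consent checkpoints.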

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The examples send URLs, queries, and authentication tokens to third-party services without any accompanying disclosure that user-supplied inputs and credentials will leave the local environment. In a skill explicitly designed for web scraping, this behavior is expected, but omitting a user-facing warning can still cause unintentional external data exposure, especially if users provide sensitive targets, queries, or internal URLs.
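
A pre-flight check along the following lines could catch the internal-URL case before anything leaves the local environment. It is a heuristic sketch only: the helper name `looks_internal` and the suffix list are assumptions, and hostnames are not resolved through DNS, so a public-looking name that maps to a private address would pass.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical suffixes treated as internal; adjust to the local environment.
INTERNAL_SUFFIXES = (".internal", ".local", ".corp")


def looks_internal(url):
    """Heuristic: True if the URL appears to target a private or local host."""
    host = (urlparse(url).hostname or "").lower()
    if not host or host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # an ordinary hostname; DNS is not resolved here
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

When the check returns `True`, the caller would refuse to forward the URL to the third-party service or ask the user for explicit approval first.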

Missing User Warnings

Low
Confidence
84% confidence
Finding
The file-save example persists scraped results to a local file without warning that collected data may contain sensitive or copyrighted material and will remain on disk after execution. This increases the risk of unintended retention, later disclosure, or mishandling of scraped data.
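
One way to limit retention is to keep scraped results in a temporary file that is removed once processing finishes. The sketch below is an illustration under that assumption, not the skill's own file-save code; `temporary_results_file` is a hypothetical helper.

```python
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_results_file(data):
    """Write scraped data to a temp file and delete it when the block exits."""
    # mkstemp creates the file readable and writable only by the owner on POSIX.
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)
```

Used as `with temporary_results_file(results) as path: ...`, the file exists only inside the block. Results that genuinely need to be kept should be saved deliberately, with the user told where they are stored.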

External Transmission

Medium
Category
Data Exfiltration
Content
API_KEY = "fc-YOUR_API_KEY"

response = requests.post(
    "https://api.firecrawl.dev/v2/scrape",
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json={
Confidence
78% confidence
Finding
requests.post( "https://

External Transmission

Medium
Category
Data Exfiltration
Content
API_KEY = "fc-YOUR_API_KEY"

response = requests.post(
    "https://api.firecrawl.dev/v2/scrape",
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json={
Confidence
78% confidence
Finding
requests.post( "https://api.firecrawl.dev/v2/scrape", headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}, json=

External Transmission

Medium
Category
Data Exfiltration
Content
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# Start crawl
resp = requests.post(f"{BASE}/crawl", headers=HEADERS, json={
    "url": "https://docs.example.com",
    "limit": 50,
    "scrapeOptions": {"formats": ["markdown"]}
Confidence
87% confidence
Finding
requests.post(f"{BASE}/crawl", headers=HEADERS, json=

External Transmission

Medium
Category
Data Exfiltration
Content
### Search + Scrape
```python
resp = requests.post(f"{BASE}/search", headers=HEADERS, json={
    "query": "best Python web scraping libraries 2025",
    "limit": 5,
    "scrapeOptions": {"formats": ["markdown"]}
Confidence
86% confidence
Finding
requests.post(f"{BASE}/search", headers=HEADERS, json=

External Transmission

Medium
Category
Data Exfiltration
Content
BASE = "https://api.apify.com/v2"

# Start run
resp = requests.post(
    f"{BASE}/acts/{ACTOR}/runs",
    params={"token": TOKEN},
    json={"queries": "web scraping", "maxPagesPerQuery": 1}
Confidence
89% confidence
Finding
requests.post( f"{BASE}/acts/{ACTOR}/runs", params={"token": TOKEN}, json=

External Transmission

Medium
Category
Data Exfiltration
Content
### Synchronous Run (short jobs <5 min)
```python
resp = requests.post(
    f"{BASE}/acts/{ACTOR}/run-sync-get-dataset-items",
    params={"token": TOKEN},
    json={"queries": "Jaipur restaurants", "maxPagesPerQuery": 1}
Confidence
88% confidence
Finding
requests.post( f"{BASE}/acts/{ACTOR}/run-sync-get-dataset-items", params={"token": TOKEN}, json=

External Transmission

Medium
Category
Data Exfiltration
Content
```javascript
const API_KEY = "fc-YOUR_API_KEY";
const BASE = "https://api.firecrawl.dev/v2";

// Scrape
const res = await fetch(`${BASE}/scrape`, {
Confidence
81% confidence
Finding
https://api.firecrawl.dev/

VirusTotal

66/66 vendors flagged this skill as clean.
