scrape

Security checks across malware telemetry and agentic risk

Overview

This skill is a web-scraping guide that discloses its use of a third-party API (SkillBoss). That API use is expected, but users should pay attention to compliance and to what content they send to SkillBoss.

Install only if you are comfortable using SkillBoss for managed scraping. Keep SKILLBOSS_API_KEY in a local secret store or environment variable, scrape only public pages you are authorized to access, verify robots.txt and site terms yourself, and do not send private pages, login-protected content, personal data, or sensitive business data into the managed scraping or LLM examples.
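The advice above about keeping SKILLBOSS_API_KEY out of source files can be sketched as follows. This is a minimal illustration, not code from the skill: the helper name load_api_key and the error message are hypothetical, and it assumes the key is exposed to the process as an environment variable.

```python
import os
from typing import Mapping

# Name of the variable as given in the report; everything else here is illustrative.
API_KEY_ENV_VAR = "SKILLBOSS_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or fail loudly if it is absent.

    Hypothetical helper: reading from the environment keeps the key out of
    source files and out of version control.
    """
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{API_KEY_ENV_VAR} is not set; export it or load it from a secret store"
        )
    return key
```

Passing the environment as a parameter makes the helper easy to exercise without touching the real process environment.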

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (3)

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The managed API example extends the skill from scraping into downstream LLM analysis, which is not clearly scoped by the manifest. That increases data-flow risk because scraped content may be transmitted to a general chat endpoint, potentially exposing third-party data and enabling prompt-injection or unintended secondary use.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The code includes a general-purpose chat capability that is broader than the stated scraping purpose, creating a capability mismatch. This can be abused to process arbitrary content via a remote LLM service and widens the attack surface beyond simple page retrieval.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The documentation claims robots.txt compliance, but the implementation returns allowed when robots.txt cannot be fetched or parsed. That fail-open behavior can cause scraping of sites that intended to restrict bots, undermining the stated legal/compliance guarantees.
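For contrast with the fail-open behavior described in this finding, here is a minimal sketch of a fail-closed robots.txt check using Python's standard urllib.robotparser. It is not the skill's implementation: the function names, the "example-bot" user agent, and the injectable fetch parameter are assumptions made for illustration. The sketch treats every fetch failure, including a 404, as a denial, which is deliberately stricter than the common convention of treating a missing robots.txt as "no restrictions".

```python
import urllib.request
from typing import Callable
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def fetch_robots_txt(robots_url: str, timeout: float = 10.0) -> str:
    """Download robots.txt; raises on network errors and HTTP error statuses."""
    with urllib.request.urlopen(robots_url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def is_allowed(
    url: str,
    user_agent: str = "example-bot",  # hypothetical user agent
    fetch: Callable[[str], str] = fetch_robots_txt,
) -> bool:
    """Return True only if robots.txt was fetched and permits this URL."""
    parts = urlsplit(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    try:
        body = fetch(robots_url)
    except Exception:
        # Fail closed: if the rules cannot be retrieved, do not scrape.
        return False
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    return parser.can_fetch(user_agent, url)
```

The key design choice is the except branch: it returns False rather than True, so an unreachable or broken robots.txt blocks the request instead of silently permitting it.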

VirusTotal

All 67 vendors (67/67) rated this skill as clean.
