Lightpanda Scraper

Security checks across malware telemetry and agentic risk

Overview

This is a Lightpanda scraping and browser-automation skill whose behavior is disclosed; it includes powerful opt-in modes that users should handle carefully.

Install only if you trust the Lightpanda project and are comfortable running an unpinned native binary from GitHub. Review commands before running them, especially --js, --serve, --mcp, --proxy, and --output; stop any server mode when finished and avoid scraping or storing data you are not authorized to access.
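One way to reduce the risk of running an unpinned binary is to compare the download against a SHA-256 digest recorded out of band. The following is a minimal sketch, not part of the skill: the helper names sha256_of and matches_pin are hypothetical, and the expected digest would have to come from a source you trust.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path: Path, expected_hex: str) -> bool:
    """Compare the file digest against a pinned value recorded out of band."""
    # Hypothetical helper: the pinned digest is an assumption, not something
    # the report or the skill provides.
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A caller would refuse to execute the binary whenever matches_pin returns False.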

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (8)

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content:
    args = parser.parse_args()
    if args.mcp:
        os.execv(LIGHTPANDA, [LIGHTPANDA, "mcp"])
    if args.serve:
        os.execv(LIGHTPANDA, [LIGHTPANDA, "serve", "--host", "127.0.0.1", "--port", str(args.port)])
Confidence: 89%
Finding: os.execv(LIGHTPANDA, [LIGHTPANDA, "mcp"])

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content:
        os.execv(LIGHTPANDA, [LIGHTPANDA, "mcp"])
    if args.serve:
        os.execv(LIGHTPANDA, [LIGHTPANDA, "serve", "--host", "127.0.0.1", "--port", str(args.port)])
    if not args.url:
        parser.error("URL required (unless --serve or --mcp)")
Confidence: 87%
Finding: os.execv(LIGHTPANDA, [LIGHTPANDA, "serve", "--host", "127.0.0.1", "--port", str(args.port)])
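The flagged call passes a fixed argument list to os.execv, so no shell is involved; the only user-controlled value is the port. A minimal sketch of validating that value before the process is replaced might look like the following. The function build_serve_argv, the binary path, and the chosen port range are assumptions for illustration; only the "serve", "--host", and "--port" arguments come from the finding above.

```python
# Hypothetical install path; the report does not show where LIGHTPANDA points.
LIGHTPANDA = "/usr/local/bin/lightpanda"


def build_serve_argv(port: int, binary: str = LIGHTPANDA) -> list[str]:
    """Return the argv for serve mode, rejecting ports outside 1024..65535."""
    if not 1024 <= port <= 65535:
        raise ValueError(f"port must be in 1024..65535, got {port}")
    # Bind to loopback only, matching the host used in the flagged call.
    return [binary, "serve", "--host", "127.0.0.1", "--port", str(port)]
```

Building the list separately from the os.execv call keeps the validation testable without replacing the current process; the caller would then run os.execv(argv[0], argv).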

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill clearly performs network access, executes shell commands during installation and usage examples, and writes files, yet it declares no permissions or equivalent trust boundaries. This is dangerous because users and orchestration systems may invoke the skill without realizing it can download binaries, contact arbitrary URLs, and persist data locally.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The stated purpose is scraping, but the documented behavior also includes arbitrary JavaScript evaluation, starting a CDP server, and running as an MCP server, which substantially expands the attack surface beyond passive content extraction. These features can enable active interaction, remote control, or unintended exposure of browser/session context if invoked in unsafe environments.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The script exposes server modes (CDP and MCP) that go beyond the advertised scraping/link-extraction role. In a skill ecosystem, hidden or underdocumented control-plane features are dangerous because they broaden the operational scope from passive fetching to active browser-service exposure, increasing the chance of misuse or lateral capability expansion.

Context-Inappropriate Capability

High

Confidence: 95%
Finding: Allowing arbitrary JavaScript evaluation on fetched pages is significantly broader than simple scraping and can be abused to perform stateful actions in the page context, interact with authenticated sessions, or extract data beyond a normal content dump. In an agent setting this is especially risky because it effectively grants a general browser automation primitive under the guise of a scraper.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The code can launch CDP and MCP servers, enabling persistent browser-control workflows unrelated to a narrow scraper. This expands the trust boundary and can let other local components or users drive the browser in ways the skill description does not disclose, which is an unnecessary increase in attack surface.

Missing User Warnings

Medium

Confidence: 83%
Finding: The documentation encourages scraping, proxying, JavaScript execution, and file output without warning about authorization, privacy, robots/terms constraints, sensitive data capture, or safe handling of scraped content. In an OSINT/recon context, omission of these safeguards increases the chance of misuse, overcollection, or accidental processing of protected information.
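One of the safeguards named above, robots constraints, can be checked before a fetch with Python's standard library. This is a minimal sketch and not part of the skill: the helper is_allowed and the user-agent string are hypothetical, and a robots.txt check does not replace authorization or terms-of-service review.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse already-downloaded robots.txt text, so no network access happens here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules; the bot name and URLs are placeholders.
rules = "User-agent: *\nDisallow: /private/\n"
print(is_allowed(rules, "example-bot", "https://example.com/private/data"))
```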

VirusTotal

66/66 vendors reported this skill as clean.
