Lobster Crawler Skill

Security checks across malware telemetry and agentic risk

Overview

This crawler matches its stated purpose, but it also includes anti-bot bypass behavior, ignores robots.txt, and ships extra agent automation that users should review before installing.

Install only if you are comfortable running a crawler that may bypass site anti-bot controls and ignore robots.txt for some targets. Review the target sites' terms, keep scheduled crawling off unless needed, set DingTalk credentials only in a controlled environment, and avoid running the included Claude development loop unless you intentionally want repo-modifying automation.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Findings (4)

Lp3

Medium
Category: MCP Least Privilege
Confidence: 90%
Finding
The skill exposes significant capabilities—shell execution, network access, file read/write, and environment variable use—without declaring permissions or clearly constraining how those powers are used. In an agent setting, this weakens transparency and policy enforcement, making it easier for the skill to access secrets like DINGTALK_WEBHOOK, modify local files, and run external commands in ways a caller may not expect.
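
One way to limit the environment-variable exposure described in this finding is to start the crawler as a child process with an explicit allowlist of variables. The following is a minimal Python sketch under that assumption; `ALLOWED_ENV` and `restricted_env` are hypothetical names used for illustration and are not part of the skill.

```python
import os

# Hypothetical allowlist: only these variables are passed to the child process.
ALLOWED_ENV = frozenset({"PATH", "HOME", "LANG"})


def restricted_env(source, allowed=ALLOWED_ENV):
    """Return a copy of `source` containing only allowlisted variable names."""
    return {name: value for name, value in source.items() if name in allowed}


# Build the reduced environment from the current process environment.
child_env = restricted_env(os.environ)
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run`, so a secret such as DINGTALK_WEBHOOK reaches the process only when it is deliberately added to the allowlist.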

Tp4

High
Category: MCP Tool Poisoning
Confidence: 87%
Finding
The documented purpose is targeted crawling and DingTalk reporting, but the observed behavior includes additional persistence and automation features such as RSS generation, sitemap management, scheduled task loading, proxy/TLS fingerprint evasion, and an unrelated claude CLI automation loop. This mismatch is dangerous because hidden or under-disclosed functionality can be abused for long-running automated collection, stealthier network activity, and actions outside the user’s informed expectations.

Description-Behavior Mismatch

Medium
Confidence: 95%
Finding
The memory file explicitly records using a headless browser to bypass Cloudflare/anti-bot protections and states that robots.txt is globally ignored. Even if framed as operational troubleshooting, this materially expands the skill’s effective behavior beyond ordinary targeted crawling and enables access patterns that site operators are actively trying to block.
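
For contrast with the globally ignored robots.txt noted in this finding, a crawler that honors the file checks each URL against it before fetching. The sketch below uses Python's standard `urllib.robotparser`; the `is_allowed` helper, the `example-crawler` user agent, and the sample rules are illustrative assumptions, not taken from the skill.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

blocked = is_allowed(EXAMPLE_ROBOTS, "example-crawler", "https://example.com/private/page")
permitted = is_allowed(EXAMPLE_ROBOTS, "example-crawler", "https://example.com/public/page")
```

A crawler that skips this check for every target, as the memory file describes, fetches paths like `/private/` regardless of the site operator's stated rules.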

Context-Inappropriate Capability

Medium
Confidence: 97%
Finding
The documentation describes deliberate anti-crawling bypass using Crawl4ai, headless browser rendering, and Cloudflare evasion. In a scraping-focused skill, this is especially concerning because it lowers barriers to unauthorized bulk collection and can be repurposed against protected websites, increasing legal, operational, and abuse risk.

VirusTotal

66/66 vendors flagged this skill as clean.
