Deep Web Fetcher

PassAudited by ClawScan on May 1, 2026.

Overview

The skill matches its stated purpose as a local Playwright-based web content extractor, with no evident hidden exfiltration or destructive behavior, but fetched pages and manual dependency installs should be treated as untrusted.

This skill appears safe for its stated scraping and extraction purpose. Before installing, use a virtual environment, review or pin the Python dependencies, remember that target websites will receive browser requests from your machine, and treat all fetched webpage text or HTML as untrusted content rather than agent instructions.

Findings (3)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

A malicious or compromised webpage could include text that tries to influence the agent if the agent treats fetched content as instructions instead of data.

Why it was flagged

The tool loads caller-supplied webpages and returns page-derived HTML/text to the agent. That content is untrusted and may contain prompt-like instructions.

Skill content
page.goto(url, wait_until="networkidle", timeout=timeout*1000) ... result["content_html"] = doc.summary()
Recommendation

Fetch only intended URLs, treat all returned webpage content as quoted untrusted data, and avoid following instructions embedded in fetched pages.

What this means

Installing dependencies from external package sources can introduce risk if the environment or package source is compromised.

Why it was flagged

The setup instructions rely on manually installing unpinned packages and downloading a browser runtime. This is expected for a Playwright scraper, but it is still external supply-chain exposure.

Skill content
pip install playwright readability-lxml lxml beautifulsoup4

# 安装浏览器驱动(首次运行需下载~100MB)
playwright install chromium
Recommendation

Install in a virtual environment, use trusted package indexes, consider pinning versions, and review dependency provenance before use.

What this means

Users may overread the privacy claim and forget that the target site can see the request, including the requested URL and network metadata.

Why it was flagged

The skill clearly visits target URLs, while the closing privacy wording says data does not leave the machine. In context this appears to mean no paid extraction API is used, but users should understand that target websites still receive requests.

Skill content
2. 访问目标URL,等待JS渲染完成 ... *完全免费,本地运行,数据不出机器*
Recommendation

Do not assume the skill is offline or anonymous; only fetch pages you are comfortable contacting from your environment.