Content Fetch Skill

Reviewed by ClawScan on May 14, 2026.

Overview

This appears to be a genuine webpage scraping tool, but it warrants review because it operates logged-in account sessions and uses anti-bot/Cloudflare-evasion browser automation.

Install only if you intentionally want a Playwright-based scraper that may use proxies and logged-in site sessions. Use it only for content you are allowed to collect, avoid command-line passwords, provide only the minimum cookies needed, and delete cookie/session/output files after use.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Using the skill may cause the agent to operate a browser and proxies in ways that target sites treat as bot evasion, which can put accounts, IP addresses, or site access at risk.

Why it was flagged

The skill documents proxy-based workarounds specifically for getting past Cloudflare blocking, indicating that bot-protection evasion is part of the workflow rather than only normal content fetching.

Skill content
Cloudflare blocking... Solution: - Use residential IP proxies ... try switching between different nodes
Recommendation

Use only on content you are allowed to archive, require explicit approval for target URL/proxy/cookie use, and consider removing or gating anti-evasion behavior.
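One way to act on the gating recommendation is to require an explicit opt-in before any proxy is configured; a minimal sketch in Python (the `allow_proxy`/`proxy_url` config keys and the `resolve_proxy` helper are hypothetical, not part of the skill):

```python
# Hypothetical gate: refuse to configure a proxy unless the user has
# explicitly opted in. Keys and the function name are illustrative only.
def resolve_proxy(config: dict) -> "str | None":
    """Return a proxy URL only when the config explicitly approves one."""
    if not config.get("allow_proxy", False):
        return None  # default: no proxy, no evasion behavior
    proxy = config.get("proxy_url")
    if not proxy:
        raise ValueError("allow_proxy is set but no proxy_url was given")
    return proxy

# Without the opt-in flag, any configured proxy is silently dropped:
print(resolve_proxy({"proxy_url": "http://127.0.0.1:8080"}))  # → None
# With the opt-in flag, the approved proxy is returned:
print(resolve_proxy({"allow_proxy": True, "proxy_url": "http://127.0.0.1:8080"}))
```

A gate like this keeps the default behavior conservative and makes the evasion path an auditable, deliberate choice rather than an automatic fallback.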

What this means

A user could expose account credentials while trying to use the skill, especially if they copy the documented password-based command.

Why it was flagged

The troubleshooting guide includes a command-line account password/email flow; command-line secrets can be exposed through shell history or process listings, and the registry metadata declares no primary credential.

Skill content
--username timy530 \
  --password "your_password" \
  --email "your_email@example.com"
Recommendation

Avoid passing real passwords on the command line; prefer narrowly scoped exported cookies or a dedicated account, and update metadata/docs to clearly declare credential handling.
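The documented flow can be adapted so the secret never appears on the command line; a minimal sketch, assuming environment variables named `SCRAPER_PASSWORD` and `SCRAPER_USERNAME` (these names are illustrative, not defined by the skill):

```python
import os

# Hypothetical replacement for the documented --password flag: read the
# secret from the environment so it never shows up in shell history or
# in `ps` process listings.
def load_credentials() -> dict:
    password = os.environ.get("SCRAPER_PASSWORD")
    if password is None:
        raise RuntimeError("set SCRAPER_PASSWORD instead of passing --password")
    return {
        "username": os.environ.get("SCRAPER_USERNAME", ""),
        "password": password,
    }
```

Supply the variables for a single invocation (for example, from a secrets manager or an interactive prompt) rather than exporting them in a shell startup file, where they would persist.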

What this means

If you provide cookies, the browser automation may act as your logged-in session for those sites and may capture private or personalized page content in local outputs.

Why it was flagged

The skill discloses that it uses user-provided session cookies for logged-in Twitter/X and Zhihu scraping; this is purpose-aligned but grants account-level session access to the automated browser.

Skill content
Twitter and Zhihu require a logged-in session... Twitter cookie → `x_cookie.json` ... Zhihu cookie → `zhihu_cookie.json`
Recommendation

Use only cookies you intend to provide, preferably from a dedicated account, and delete cookie/session files and scrape outputs when no longer needed.
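The cookie-minimization advice can be sketched as a small helper that writes only the cookies a site actually needs and removes the file afterwards (the cookie names, file layout, and helper functions here are illustrative assumptions, not the skill's API):

```python
import json
from pathlib import Path

# Example allow-list: keep only the cookies required for a logged-in
# session; everything else (tracking, analytics) is dropped.
NEEDED = {"auth_token", "ct0"}  # illustrative names only

def write_minimal_cookies(all_cookies: list, path: Path) -> None:
    """Persist only the allow-listed cookies from a full browser export."""
    minimal = [c for c in all_cookies if c.get("name") in NEEDED]
    path.write_text(json.dumps(minimal))

def cleanup(path: Path) -> None:
    """Delete the cookie file once the scrape is finished."""
    path.unlink(missing_ok=True)
```

Pairing `write_minimal_cookies` before the run with `cleanup` after it limits both what the automated browser can do with the session and how long the session material lingers on disk.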

What this means

Login/session state may remain on disk after a scrape and be reused by later runs.

Why it was flagged

The Twitter scraper supports reusable local session state. This is disclosed and not shown as a background process, but it can keep account session material available across runs.

Skill content
session_file: str = './x_session.json' ... session file path, used to save and load login state
Recommendation

Store session files in a controlled location, delete them after use if not needed, and do not share the skill directory with others.
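A controlled storage location can be approximated by creating the session file with owner-only permissions and deleting it after the run; a hedged sketch (the `write_session` helper is hypothetical, and the `0o600` mode applies on POSIX systems):

```python
import os
import stat
from pathlib import Path

# Hypothetical hardening for a disclosed session file such as
# x_session.json: create it owner-read/write only (0o600) so other
# local users cannot read the saved login state.
def write_session(path: str, data: str) -> None:
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(data)

write_session("./x_session.json", "{}")
mode = stat.S_IMODE(os.stat("./x_session.json").st_mode)
print(oct(mode))  # on POSIX systems this prints 0o600
Path("./x_session.json").unlink()  # delete the session when no longer needed
```

Creating the file with a restrictive mode up front avoids the window where a default-permission file briefly exposes the session before a later `chmod`.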

What this means

Installing the skill changes the local Python/browser environment and relies on external package sources.

Why it was flagged

The install path uses external packages and downloads a browser runtime. This is expected for the stated Playwright scraping purpose, but users should still be aware of the local dependency footprint.

Skill content
pip install playwright pyyaml
playwright install chromium
Recommendation

Install in an isolated environment and pin/review package versions if reproducibility matters.