Content Fetch Skill
Review
Audited by ClawScan on May 14, 2026.
Overview
This appears to be a genuine webpage-scraping tool, but it warrants review because it uses logged-in account sessions and anti-bot/Cloudflare-evasion browser automation.
Install only if you intentionally want a Playwright-based scraper that may use proxies and logged-in site sessions. Use it only for content you are allowed to collect, avoid command-line passwords, provide only the minimum cookies needed, and delete cookie/session/output files after use.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Using the skill may cause the agent to operate a browser and proxies in ways that target sites treat as bot evasion, which can put accounts, IP addresses, or site access at risk.
The skill's documentation describes proxy-based handling specifically to get around Cloudflare blocking, indicating that bot-protection evasion is part of the workflow rather than only normal content fetching.
Cloudflare blocking... Solution: - use residential IP proxies ... try switching to a different node
Use only on content you are allowed to archive, require explicit approval for target URL/proxy/cookie use, and consider removing or gating anti-evasion behavior.
A user could expose account credentials while trying to use the skill, especially if they copy the documented password-based command.
The troubleshooting guide includes a command-line account password/email flow; command-line secrets can be exposed through shell history or process listings, and the registry metadata declares no primary credential.
--username timy530 \
--password "your_password" \
--email "your_email@example.com"
Avoid passing real passwords on the command line; prefer narrowly scoped exported cookies or a dedicated account, and update metadata/docs to clearly declare credential handling.
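One way to keep the password out of shell history and process listings is to read it from the environment instead of argv. This is a minimal sketch of that pattern; the variable names (`SCRAPER_USERNAME`, `SCRAPER_PASSWORD`) are illustrative and not part of the skill's documented interface.

```python
import os


def load_credentials():
    """Read account credentials from the environment instead of argv.

    Command-line arguments are visible in shell history and in `ps`
    output; environment variables (or an exported cookie file) avoid
    that exposure. Variable names here are assumptions, not the skill's.
    """
    username = os.environ.get("SCRAPER_USERNAME")
    password = os.environ.get("SCRAPER_PASSWORD")
    if not password:
        raise RuntimeError(
            "Set SCRAPER_PASSWORD in the environment, "
            "not on the command line"
        )
    return username, password
```

The same idea applies to the email flag; for stricter setups, a secrets manager or an interactive `getpass.getpass()` prompt avoids even the environment.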
If you provide cookies, the browser automation may act as your logged-in session for those sites and may capture private or personalized page content in local outputs.
The skill discloses that it uses user-provided session cookies for logged-in Twitter/X and Zhihu scraping; this is purpose-aligned but grants account-level session access to the automated browser.
Twitter and Zhihu require a logged-in state... Twitter cookie → `x_cookie.json` ... Zhihu cookie → `zhihu_cookie.json`
Use only cookies you intend to provide, preferably from a dedicated account, and delete cookie/session files and scrape outputs when no longer needed.
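"Minimum cookies needed" can be enforced mechanically: filter an exported cookie file down to an allow-list before handing it to the skill. This sketch assumes the common export format of a JSON list of `{"name": ..., "value": ...}` objects (which matches files like `x_cookie.json`); verify the actual layout before relying on it.

```python
import json


def minimize_cookies(path, allowed_names):
    """Keep only the session cookies the scraper actually needs.

    Browser exports typically include every cookie for a domain
    (analytics, ads, preferences); passing the full set grants more
    session access than necessary. The file layout assumed here is a
    JSON list of {"name", "value", ...} dicts.
    """
    with open(path) as f:
        cookies = json.load(f)
    return [c for c in cookies if c.get("name") in allowed_names]
```

For Twitter/X, for example, the login session usually lives in one or two named cookies; everything else can be dropped from the file before the scraper sees it.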
Login/session state may remain on disk after a scrape and be reused by later runs.
The Twitter scraper supports reusable local session state. This is disclosed and not shown as a background process, but it can keep account session material available across runs.
session_file: str = './x_session.json' ... session file path, used to save and load login state
Store session files in a controlled location, delete them after use if not needed, and do not share the skill directory with others.
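The cleanup step can be scripted so session material does not linger between runs. This sketch uses the file names the skill documents (`x_session.json`, `x_cookie.json`, `zhihu_cookie.json`); any other output paths your runs produce would need to be added.

```python
from pathlib import Path


def cleanup_session_artifacts(skill_dir):
    """Delete cookie/session files left behind by a scrape run.

    The file names follow the ones disclosed in the skill's docs;
    extend the tuple for scrape outputs specific to your setup.
    Returns the list of files actually removed.
    """
    removed = []
    for name in ("x_session.json", "x_cookie.json", "zhihu_cookie.json"):
        p = Path(skill_dir) / name
        if p.exists():
            p.unlink()
            removed.append(name)
    return removed
```

Running this at the end of each session (or from a shell trap/exit hook) keeps the reusable login state from silently surviving into later runs.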
Installing the skill changes the local Python/browser environment and relies on external package sources.
The install path uses external packages and downloads a browser runtime. This is expected for the stated Playwright scraping purpose, but users should still be aware of the local dependency footprint.
pip install playwright pyyaml
playwright install chromium
Install in an isolated environment and pin/review package versions if reproducibility matters.
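An install script can guard against polluting the system interpreter before running `pip install`. This is a small, generic check (not part of the skill): in a venv, `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv/virtualenv.

    Useful as a pre-install guard: refuse to `pip install` the skill's
    dependencies into the system interpreter. Inside a venv,
    sys.prefix points at the environment while sys.base_prefix still
    points at the base installation.
    """
    return sys.prefix != sys.base_prefix
```

A setup script could call this and abort with a message like "create a venv first (`python -m venv .venv`)" when it returns False.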
