Web Content Fetcher (WeChat images fix)
v0.0.1
Extract article content from any URL as clean Markdown. Uses Scrapling script as primary method (with auto fast→stealth fallback), Jina Reader as alternative...
MIT-0
License
MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description (extract article content to Markdown) align with the provided files and runtime actions. The skill ships a Python script that fetches HTML, normalizes lazy images, and converts to Markdown using html2text — all expected for a webpage extractor.
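To illustrate what "normalizes lazy images" can mean in practice, here is a minimal sketch, not the skill's actual code. It assumes the page stores the real image URL in a double-quoted `data-src` attribute (a common lazy-loading pattern) and copies it into `src`. The function name `normalize_lazy_images` is hypothetical.

```python
import re

# Assumption: lazy images keep the real URL in a double-quoted data-src attribute.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
DATA_SRC = re.compile(r'\bdata-src\s*=\s*"([^"]*)"', re.IGNORECASE)
# The lookbehind keeps this pattern from matching the "src" inside "data-src".
SRC = re.compile(r'(?<![-\w])src\s*=\s*"[^"]*"', re.IGNORECASE)


def normalize_lazy_images(html: str) -> str:
    """Copy data-src into src on every <img> tag that has a data-src."""

    def fix(match: re.Match) -> str:
        tag = match.group(0)
        lazy = DATA_SRC.search(tag)
        if lazy is None:
            return tag  # not a lazy image, leave untouched
        real_url = lazy.group(1)
        if SRC.search(tag):
            # Overwrite the placeholder src with the real URL.
            return SRC.sub(lambda _: f'src="{real_url}"', tag, count=1)
        # No src attribute at all: add one right after "<img".
        return tag[:4] + f' src="{real_url}"' + tag[4:]

    return IMG_TAG.sub(fix, html)
```

The normalized HTML could then be passed to `html2text.html2text(...)` for the Markdown conversion. A regex is used here only to keep the sketch dependency-free; a real HTML parser is more robust against unusual quoting.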
Instruction Scope
SKILL.md instructs the agent to run the included scripts/fetch.py and to prefer Scrapling with an optional Jina Reader fallback; these instructions stay within the stated purpose. Note: the script will perform arbitrary HTTP(S) requests (including headless browser fetches) to whatever URL the user or agent supplies. This is expected, but it means the script can reach internal endpoints (e.g., cloud metadata or intranet services) if given such URLs.
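One way to reduce the internal-endpoint risk is to vet URLs before handing them to the fetcher. The sketch below is an assumption about how a caller might do this and is not part of the skill. The helper name `is_public_url` is hypothetical, and the check uses only the standard library.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host resolves to global addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, None)
    except (socket.gaierror, UnicodeError):
        return False  # unresolvable host: refuse rather than guess
    for info in infos:
        try:
            ip = ipaddress.ip_address(info[4][0])
        except ValueError:
            return False  # address form we cannot classify
        # is_global is False for loopback, private, and link-local ranges,
        # which covers 127.0.0.1, 10.0.0.0/8, and 169.254.169.254.
        if not ip.is_global:
            return False
    return bool(infos)
```

This check runs at resolution time only. It does not follow redirects or protect against a hostname that resolves differently on the actual request, so it complements network-level sandboxing and does not replace it.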
Install Mechanism
No install spec is declared (instruction-only), and dependencies are standard Python packages listed in requirements.txt. Nothing is downloaded from an unknown URL or written to disk during an automated install step. The skill does require pip installation of scrapling and html2text as documented.
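Because there is no automated install step, a caller may want to confirm the two documented dependencies are importable before invoking the script. A minimal sketch, assuming the import names match the package names `scrapling` and `html2text` (the helper `missing_packages` is hypothetical):

```python
import importlib.util

# Assumption: import names are the same as the pip package names.
REQUIRED = ("scrapling", "html2text")


def missing_packages(names=REQUIRED):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from `missing_packages()` suggests the environment is ready. It does not verify that a headless browser runtime is available.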
Credentials
The skill requires no environment variables, secrets, or config paths. The code uses only local imports and network access to the target URLs — this is proportional to its purpose.
Persistence & Privilege
The skill is not always-enabled and does not request elevated or persistent platform privileges. It does not modify other skills or system-wide settings.
Assessment
This skill appears to be what it claims: a local Python-based webpage-to-Markdown extractor. Before installing or enabling it, consider:
(1) Run pip installs inside a virtualenv or other isolated environment. The script requires scrapling and html2text and may need a headless browser runtime such as Chromium, depending on your Scrapling setup.
(2) Audit and sandbox its use if you plan to let the agent call arbitrary URLs. The script will fetch any URL it is given, so avoid passing it internal or cloud-metadata URLs or other sensitive endpoints.
(3) Verify the upstream 'scrapling' package and make sure you trust it for headless browsing.
(4) If you need offline copies with embedded images, implement the post-processing step yourself, since image downloading is not included.
Overall the skill is coherent and proportionate, but deploy it with the usual caution for code that performs network fetching and headless browser actions.
Like a lobster shell, security has layers: review code before you run it.
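The image post-processing step that the assessment leaves to the user could look roughly like the sketch below. It is an illustration under stated assumptions, not part of the skill. It interprets "offline copies" as saving each remote image next to the Markdown and rewriting the link. The names `localize_images` and `http_fetch` are hypothetical, and file extensions are omitted for simplicity.

```python
import hashlib
import re
from pathlib import Path
from typing import Callable
from urllib.request import urlopen

# Matches Markdown images of the form ![alt](http://... or https://...).
IMAGE = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def http_fetch(url: str) -> bytes:
    """Download one URL with the standard library."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()


def localize_images(
    markdown: str,
    out_dir: Path,
    fetch: Callable[[str], bytes] = http_fetch,
) -> str:
    """Save each remote image into out_dir and point the Markdown at the copy."""
    out_dir.mkdir(parents=True, exist_ok=True)

    def repl(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        # A hash of the URL gives a stable, filesystem-safe file name.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
        path = out_dir / name
        if not path.exists():
            path.write_bytes(fetch(url))
        return f"![{alt}]({path.as_posix()})"

    return IMAGE.sub(repl, markdown)
```

Passing `fetch` as a parameter makes it possible to route downloads through the same URL vetting or sandbox used for the page itself, since image URLs come from untrusted page content.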
