Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Scrapling Web Fetch

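Uses Scrapling + html2text to fetch the main body content of modern web pages. Supports scraping WeChat Official Account articles and cleaning trailing noise, which reduces useless information and token consumption. Well suited to blogs, news, announcements, and many pages where a plain fetch is unreliable or where anti-scraping measures or dynamic rendering interfere. Supports WeChat article cleanup, markdown...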

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 325 · 3 current installs · 3 all-time installs
by 晨冬 @jllyzzd2023
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
Name/description match the code and runtime instructions: the script fetches pages, selects likely article containers, converts to Markdown, cleans WeChat noise, supports batch mode and site overrides. No unrelated credentials, binaries, or paths are required.
Instruction Scope
SKILL.md instructs running the included Python script and describes inputs/outputs. The script only reads files explicitly passed by the user (--batch, --selectors) and fetches the provided URLs. It does not attempt to read arbitrary system files or environment variables.
Install Mechanism
There is no install spec (instruction-only), which is low-risk. However the skill recommends installing two PyPI packages (scrapling, html2text) via pip. Installing unknown third-party packages can execute arbitrary code at install/run time; the package 'scrapling' is not further documented here, so evaluate that package before installing.
Credentials
The skill requests no environment variables, no credentials, and no config paths. The script only reads user-supplied batch or selectors files and writes output to stdout/stderr as expected.
Persistence & Privilege
The skill does not request persistent or privileged presence (always:false). It does not modify other skills or system-wide configuration.
What to consider before installing
This skill appears to do what it says: fetch pages, extract body text, convert to Markdown, and clean WeChat noise. Before installing or running it, review the PyPI package 'scrapling' (and its homepage/source) because the script depends on it; a malicious or overly-permissive package could perform network calls or execute code. Run pip installs in a virtualenv or sandbox, inspect installed package source, and avoid passing sensitive local file paths to --batch or --selectors (the script will read those files). If you need stronger assurance, ask for the 'scrapling' package source or use an alternative extractor implemented with well-known libraries (requests + readability / newspaper / browser automation) whose behavior you can audit.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.2
latest vk97fpqsvdh7e316r43j40d103s82h8ke


SKILL.md

Scrapling Web Fetch

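Prefer this skill when the user wants to fetch web page content, extract the main text, convert a page to markdown/text, or grab the body of an article.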

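Default flow

  1. Run python3 scripts/scrapling_fetch.py <url> <max_chars>
  2. Default body selector priority:
    • article
    • main
    • .post-content
    • [class*="body"]
  3. Once a body container matches, convert it to Markdown with html2text
  4. If no selector matches, fall back to body
  5. Finally, truncate the output to max_chars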
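
The selector priority, fallback, and truncation steps above can be sketched as follows. This is a minimal illustration and not the script's actual code: `pick_container`, `truncate`, and the `select_first` callable are hypothetical names, and `select_first` stands in for whatever CSS lookup the real parser provides (it is assumed to return the matched HTML or None). The html2text conversion step is omitted to keep the sketch standard-library only.

```python
from typing import Callable, Optional, Sequence, Tuple

# Priority order copied from the default flow; "body" is the fallback.
DEFAULT_SELECTORS = ("article", "main", ".post-content", '[class*="body"]')


def pick_container(
    select_first: Callable[[str], Optional[str]],
    selectors: Sequence[str] = DEFAULT_SELECTORS,
) -> Tuple[str, str]:
    """Return (selector, html) for the first selector that matches.

    `select_first` is a hypothetical stand-in for the parser's CSS lookup:
    it takes a selector and returns the matched HTML, or None on a miss.
    """
    for css in selectors:
        html = select_first(css)
        if html:
            return css, html
    # Nothing matched: fall back to the whole body.
    return "body", select_first("body") or ""


def truncate(text: str, max_chars: int) -> str:
    """Cut the final output down to at most max_chars characters."""
    return text[:max_chars]


if __name__ == "__main__":
    # A dict plays the role of a parsed page for demonstration purposes.
    fake_page = {"main": "<main>Hello</main>", "body": "<body>Everything</body>"}
    selector, html = pick_container(fake_page.get)
    print(selector, truncate(html, 12))
```

Passing the lookup in as a callable keeps the priority logic independent of any particular parsing library, so it can be exercised without network access.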

Usage

python3 /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/scripts/scrapling_fetch.py <url> 30000

Dependencies

Check for these first:

  • scrapling
  • html2text

If either is missing, install with:

python3 -m pip install scrapling html2text
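
The dependency check could also be done programmatically before running the script. The sketch below is one possible approach using only the standard library; the helper name `missing_modules` is hypothetical, and it assumes the import names match the package names listed above.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["scrapling", "html2text"])
    if missing:
        # Mirror the install command from the section above.
        print("Missing:", ", ".join(missing))
        print("Install with: python3 -m pip install " + " ".join(missing))
    else:
        print("All dependencies are available.")
```

Because `find_spec` only locates a module without importing it, this check does not execute any third-party package code.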

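Output conventions

By default the script prints the Markdown body content. For structured output, append --json. To debug which selector the extraction matched, check the stderr output.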
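
A caller might wrap the script as shown below to keep stdout (content) and stderr (selector debug output) separate. This is a sketch under stated assumptions: `build_command` and `fetch` are hypothetical helpers, the relative script path is taken from the default flow, and the shape of the --json payload is not documented here, so it is simply parsed and returned as-is.

```python
import json
import subprocess
import sys
from typing import List

# Relative path as written in the default flow; adjust to the install location.
SCRIPT = "scripts/scrapling_fetch.py"


def build_command(url: str, max_chars: int = 30000, as_json: bool = False) -> List[str]:
    """Assemble the argument list: <url> <max_chars>, with --json appended on request."""
    cmd = [sys.executable, SCRIPT, url, str(max_chars)]
    if as_json:
        cmd.append("--json")
    return cmd


def fetch(url: str, max_chars: int = 30000, as_json: bool = False):
    """Run the script; return Markdown text, or the parsed JSON when as_json is set."""
    proc = subprocess.run(
        build_command(url, max_chars, as_json),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # stderr carries debug information such as which selector matched.
    sys.stderr.write(proc.stderr)
    return json.loads(proc.stdout) if as_json else proc.stdout
```

Using `sys.executable` instead of a bare `python3` runs the script with the same interpreter (and virtualenv) as the caller, which is a design choice of this sketch rather than something the skill prescribes.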

Additional resources

  • Usage reference: /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/references/usage.md
  • Selector strategy: /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/references/selectors.md
  • Unified entry point: /Users/zzd/.openclaw/workspace/skills/scrapling-web-fetch/scripts/fetch-web-content

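When to use this skill

  • Fetching the body of an article
  • Scraping the body of blog posts, news, or announcements
  • Converting a web page to Markdown for later summarization
  • A regular fetch gives poor results and you want more reliable scraping of modern web pages

When not to use it

  • When full browser interaction, clicking, login, or pagination is needed: use browser automation instead
  • When you only need JSON from an API: requesting the API directly is a better fit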

Files

6 total