Web Content Extraction Assistant (网页内容提取小助手)

v1.0.3

Extracts the title, body text, image links, and other content from a web page URL.

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for shuishouxinboda/jiayinclaw-12345.

Install the skill "网页内容提取小助手" (shuishouxinboda/jiayinclaw-12345) from ClawHub.
Skill page: https://clawhub.ai/shuishouxinboda/jiayinclaw-12345
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install jiayinclaw-12345

ClawHub CLI

npx clawhub@latest install jiayinclaw-12345

Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (extract titles, content, images, links) match the included script and SKILL.md. Required libraries (requests, BeautifulSoup) are appropriate for the stated purpose and no unrelated binaries or credentials are requested.
Instruction Scope
SKILL.md and the script instruct only to fetch the target URL and parse its HTML. The runtime behavior is limited to requesting the provided URL, parsing content, and returning structured data; it does not read local files, access environment variables, or POST data to external endpoints other than the target site.
Install Mechanism
There is no automated install spec (no downloads or installers), which lowers risk. The package includes a Python script and requirements.txt that expect dependencies to be installed via pip; users should ensure dependencies are installed in a controlled environment (virtualenv) before running.
Credentials
The skill requires no environment variables, credentials, or config paths. The permissions indicated (network) are proportional and necessary for fetching webpages.
Persistence & Privilege
The skill does not request always:true, does not modify other skills or system-wide settings, and does not store credentials. Autonomous invocation is allowed by default, but that raises no additional concern here.
Assessment
This appears to be a straightforward web scraper. Before installing: (1) run it in a sandbox or virtualenv and review the scripts (the code is short and readable); (2) only pass URLs you trust, and do not use it on internal dashboards or pages containing secrets; (3) respect robots.txt and site terms; (4) install dependencies via pip in an isolated environment; (5) if you need stronger guarantees, run it with network egress controls so it can only reach target sites.
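
Point (3) above can be checked programmatically before a page is fetched. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_allowed` and the sample rules are hypothetical, and nothing here is taken from the skill's own script. It takes the robots.txt text as a string, so how that text is downloaded is left open.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper for illustration; not part of the skill.
    """
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access happens here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules (made up for this example).
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(is_allowed(SAMPLE_ROBOTS, "https://example.com/public/page"))
    print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/page"))
```

A caller would skip any URL for which `is_allowed` returns False. Site terms of service still need a manual read, since robots.txt does not cover them.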

Like a lobster shell, security has layers — review code before you run it.

latest vk970rkzbxsgnhk61pny42a7ss584jnza
126 downloads
0 stars
4 versions
Updated 2w ago
v1.0.3
MIT-0

Web Content Extractor

A practical web content extraction skill that pulls structured information out of any web page.

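Features

  • Automatically extracts the page title and metadata
  • Extracts the body text and strips HTML tags
  • Extracts all image links
  • Extracts all external links
  • Supports specifying which elements to extract
  • Outputs the result as formatted JSON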
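
The features listed above could be implemented roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual script: the function name `extract_page`, the `description` key, and the rule that a link counts as external when its host differs from the page's host are all assumptions. The function works on an HTML string, which would normally come from something like `requests.get(url, timeout=10).text`.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def extract_page(html: str, base_url: str) -> dict:
    """Return title, description, visible text, image URLs, and external links.

    Hypothetical helper for illustration; the skill's real code may differ.
    """
    soup = BeautifulSoup(html, "html.parser")

    # Remove elements whose text is never visible body content.
    for tag in soup.find_all(["script", "style", "noscript"]):
        tag.decompose()

    title = soup.title.get_text(strip=True) if soup.title else ""
    meta = soup.find("meta", attrs={"name": "description"})
    description = meta.get("content", "") if meta else ""

    # get_text() drops the markup; split/join collapses runs of whitespace.
    body = soup.body or soup
    content = " ".join(body.get_text(separator=" ").split())

    # Resolve relative image paths against the page URL.
    images = [urljoin(base_url, img["src"]) for img in soup.find_all("img", src=True)]

    # Assumption: "external" means the link's host differs from the page's host.
    base_host = urlparse(base_url).netloc
    links = []
    for anchor in soup.find_all("a", href=True):
        absolute = urljoin(base_url, anchor["href"])
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc != base_host:
            links.append(absolute)

    return {
        "title": title,
        "description": description,
        "content": content,
        "images": images,
        "links": links,
    }
```

Keeping parsing separate from fetching means the function can be exercised on saved HTML without any network access.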

Usage

Basic usage

Skill input: https://example.com
Skill output: {"title": "...", "content": "...", "images": [...], "links": [...]}
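
A caller that receives this output as a JSON string may want to confirm the four documented keys are present before using it. The sketch below assumes only the shape shown above; the helper name `parse_skill_output` and the sample values are hypothetical.

```python
import json

# The four keys shown in the basic usage example.
EXPECTED_KEYS = ("title", "content", "images", "links")


def parse_skill_output(raw: str) -> dict:
    """Decode the skill's JSON output and check that the documented keys exist.

    Hypothetical helper for illustration; not part of the skill.
    """
    data = json.loads(raw)
    missing = [key for key in EXPECTED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys in skill output: {missing}")
    return data


if __name__ == "__main__":
    # Made-up sample output in the documented shape.
    sample = '{"title": "Example Domain", "content": "Sample text", "images": [], "links": []}'
    page = parse_skill_output(sample)
    print(page["title"], len(page["links"]))
```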

Advanced usage

  • Specify particular elements to extract
  • Set a content length limit
  • Customize the output format
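
The page does not say how these options are passed to the skill, so the following is only a sketch of what they might amount to when applied to a result dictionary. The function `shape_result` and its parameters `fields`, `max_content_length`, and `indent` are hypothetical names, and "elements" is read here as output fields; it could equally mean HTML elements.

```python
import json
from typing import Optional, Sequence


def shape_result(
    result: dict,
    fields: Optional[Sequence[str]] = None,
    max_content_length: Optional[int] = None,
    indent: Optional[int] = 2,
) -> str:
    """Select fields, truncate the content, and serialize to JSON.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    # Keep every field by default, otherwise only the requested ones.
    if fields is None:
        shaped = dict(result)
    else:
        shaped = {key: result[key] for key in fields if key in result}

    # Truncate long body text and mark the cut with an ellipsis.
    content = shaped.get("content")
    if (
        max_content_length is not None
        and isinstance(content, str)
        and len(content) > max_content_length
    ):
        shaped["content"] = content[:max_content_length].rstrip() + "..."

    # ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable in the output.
    return json.dumps(shaped, ensure_ascii=False, indent=indent)
```

For example, `shape_result(page, fields=["title", "content"], max_content_length=200, indent=None)` would return a single-line JSON string with only the title and the first 200 characters of the body text.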

Technical specifications

  • Programming language: Python 3
  • Dependencies: requests, beautifulsoup4
  • Network: requires an internet connection
