Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Defuddle

v1.0.0

Extract the main content of a webpage with the Defuddle library and convert it to Markdown. Supports CLI and Node.js use for web scraping and text processing tasks.

0 stars · 228 downloads · 1 current · 1 all-time
by Honcy Ye (@yeholdon)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for yeholdon/defuddle-extractor.

Prompt Preview: Install & Setup
Install the skill "Defuddle" (yeholdon/defuddle-extractor) from ClawHub.
Skill page: https://clawhub.ai/yeholdon/defuddle-extractor
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install defuddle-extractor

ClawHub CLI

npx clawhub@latest install defuddle-extractor

Security Scan

VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)

Purpose & Capability (flagged)
The SKILL.md and scripts clearly require Node.js/npm and the 'defuddle' npm package (and use npx). However the registry metadata lists no required binaries or environment variables — a mismatch. One bundled script calls a hardcoded path (/Users/honcy/.openclaw/skills/WeChat-Send/scripts/wechat_send.sh) which targets another skill/user-specific location that is not declared and is unlikely to exist for other users.
Instruction Scope (flagged)
Instructions and scripts operate on arbitrary URLs and then transmit extracted content to external messaging endpoints (WeChat via a local script path and Telegram via openclaw message send). Transmitting arbitrary scraped content is consistent with the advertised 'send' scripts, but it is an exfiltration vector and the WeChat script reference expands scope to other local skill files. The SKILL.md and scripts do not declare or limit what user files or environment will be read beyond fetching URLs, but they do rely on the openclaw CLI and npx behavior.
Install Mechanism
There is no formal install spec (instruction-only), which limits on-disk installation risk by this bundle. However SKILL.md and scripts rely on 'npx defuddle' and suggest 'npm install -g defuddle' — which means runtime will fetch/execute code from the npm registry (npx executes remote packages), a supply-chain/execution vector to be aware of.
Credentials
The registry lists no required environment variables or credentials, which is consistent with the included files. But the scripts assume the availability of other platform credentials/agents: openclaw message send (Telegram) and a local WeChat helper script (which likely depends on credentials/config stored elsewhere). Those credentials are not declared by the skill and may be used implicitly when the scripts run.
Persistence & Privilege
always is false and there is no install-time persistence requested. The skill can be invoked by the agent (normal), and its scripts can send messages autonomously if run, so users should be aware of the ability to transmit extracted content but there is no elevated 'always' privilege or hidden persistence in the bundle.
What to consider before installing
This skill's core feature (extracting webpage content and converting it to Markdown) is coherent, but review it and proceed cautiously:
1) Inspect or remove the send scripts before use. They transmit scraped content to WeChat/Telegram.
2) The WeChat helper is a hardcoded path to /Users/honcy/... which will fail on your system and could reference another skill you don't control. Do not run it without validating that script.
3) npx will fetch and execute the 'defuddle' package from npm at runtime. Verify the npm package's source and reputation before running.
4) If you only want extraction, run the extraction commands in a controlled environment and avoid executing the send scripts until you confirm destinations and credentials.
5) If you need this skill, consider forking and cleaning the scripts to remove hardcoded paths and to require explicit consent and targets before sending data.
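
As a rough illustration of the review step for hardcoded paths, the sketch below scans the text of a script for absolute paths that point into a specific user's home directory. The function name `findHardcodedUserPaths` and the two patterns are assumptions made for this example; they are not part of the skill and are not an exhaustive check.

```javascript
// Sketch: flag absolute paths that point into one user's home directory.
// The patterns are illustrative assumptions, not a complete audit.
const USER_PATH_PATTERNS = [
  /\/Users\/[^\/\s"']+\/[^\s"']*/g, // macOS home directories
  /\/home\/[^\/\s"']+\/[^\s"']*/g,  // Linux home directories
];

// Returns one entry per match, with the 1-based line number and the matched path.
function findHardcodedUserPaths(scriptText) {
  const hits = [];
  scriptText.split('\n').forEach((line, index) => {
    for (const pattern of USER_PATH_PATTERNS) {
      for (const match of line.matchAll(pattern)) {
        hits.push({ line: index + 1, path: match[0] });
      }
    }
  });
  return hits;
}
```

Reading each file under scripts/ and passing its contents to this function would list the lines worth inspecting by hand before anything is executed.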

Like a lobster shell, security has layers — review code before you run it.

latest: vk976s9yk6qssm3ddca0969hqxd83324d
228 downloads
0 stars
1 version
Updated 4h ago
v1.0.0
MIT-0

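name: defuddle
description: Extract the main content from any webpage using the Defuddle library and convert it to Markdown. Supports the CLI and Node.js integration for content crawling, text processing, and automation tasks.
metadata: {"openclaw": {"os": ["darwin", "linux", "win32"], "author": "Honcy Ye", "email": "honcy.ye@gmail.com"}}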

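Defuddle Web Content Extraction Skill

Extract the main content from any webpage using the Defuddle library and convert it to Markdown.

Features

  • Content extraction: automatically detects and extracts the main content of a webpage
  • Markdown conversion: converts HTML content to Markdown
  • Clutter removal: strips ads, sidebars, comments, and other page clutter
  • CLI support: provides a command-line interface for quick use
  • Node.js integration: can be used from a Node.js environment
  • Custom configuration: supports custom content selectors and options

Implementation

  • Uses the Defuddle library for webpage content extraction
  • Supports multiple configuration options
  • Provides a simple, easy-to-use API

Usage

1. Command-Line Usage

# Parse a URL and output Markdown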
npx defuddle parse https://example.com/article --markdown

# Parse a local HTML file
npx defuddle parse page.html --markdown

# Output JSON (includes metadata)
npx defuddle parse page.html --json
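
Because npx downloads and runs the package at call time, one option is to drive the commands above from Node.js with an explicit, reviewed version. The following is a minimal sketch under stated assumptions: `buildDefuddleArgs` and `runDefuddle` are hypothetical helper names, `version` is a placeholder you supply yourself, and only the `parse`, `--markdown`, and `--json` arguments shown above are used.

```javascript
const { execFileSync } = require('node:child_process');

// Build the argument list for `npx defuddle parse`.
// `version` is a hypothetical parameter: pass the release you have reviewed.
function buildDefuddleArgs(source, { version, format = 'markdown' } = {}) {
  if (format !== 'markdown' && format !== 'json') {
    throw new Error(`Unsupported format: ${format}`);
  }
  const pkg = version ? `defuddle@${version}` : 'defuddle';
  return [pkg, 'parse', source, `--${format}`];
}

// Run the CLI without a shell, so the source string is never interpreted
// as shell syntax. Assumes macOS or Linux; on Windows npx is a .cmd shim
// and would need different handling.
function runDefuddle(source, options) {
  return execFileSync('npx', buildDefuddleArgs(source, options), {
    encoding: 'utf8',
  });
}
```

Calling `runDefuddle('page.html', { format: 'json' })` would return the CLI's standard output as a string. Keeping argument construction separate from execution makes the command easy to log and inspect before it runs.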

2. Script Usage

# Extract content from a URL and send it to the WeChat File Transfer Assistant
bash scripts/extract_and_send.sh "https://example.com/article" "文件传输助手"

# Extract content from a URL and send it to Telegram
bash scripts/extract_and_send_telegram.sh "https://example.com/article" <chat_id>

3. Node.js API

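import { JSDOM } from 'jsdom';
import { Defuddle } from 'defuddle/node';

// Fetch a page and extract its main content with Defuddle.
async function extractContent(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}: ${url}`);
  }
  const html = await response.text();

  // Passing the URL lets JSDOM resolve relative links against the page address.
  const dom = new JSDOM(html, { url });
  const result = await Defuddle(dom.window.document);

  return {
    title: result.title,
    content: result.content,
    // Assumption: contentMarkdown is only populated when a Markdown option
    // is enabled; verify against the installed Defuddle version.
    markdown: result.contentMarkdown
  };
}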
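
Since `extractContent` fetches whatever URL it receives, a small guard in front of it can reject anything that is not a plain web address. This is only a sketch: `assertHttpUrl` is a hypothetical helper name, and it relies solely on the standard `URL` class that ships with Node.js.

```javascript
// Sketch: accept only http(s) URLs before handing them to an extractor.
// `assertHttpUrl` is a hypothetical helper, not part of Defuddle.
function assertHttpUrl(input) {
  const url = new URL(input); // throws a TypeError on malformed input
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Refusing non-HTTP URL: ${url.protocol}`);
  }
  return url.href; // normalized form of the address
}
```

A caller could then write `extractContent(assertHttpUrl(userInput))`, so that values such as `file:` URLs fail before any request is made.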

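Configuration Options

  • markdown: Convert output to Markdown
  • debug: Enable debug mode
  • contentSelector: Custom content selector
  • removeImages: Remove images
  • removeHiddenElements: Remove hidden elements

Scripts

  • scripts/extract_content.sh: Extracts content from a URL and prints it to the console
  • scripts/extract_and_send.sh: Extracts content and sends it to WeChat
  • scripts/extract_and_send_telegram.sh: Extracts content and sends it to Telegram

Dependencies

  • Node.js and npm (for the CLI)
  • The defuddle library (installed via npm)

Installation

npm install -g defuddle

Notes

  • Defuddle requires a Node.js environment (Node.js 18 or later is recommended)
  • Some websites use anti-scraping measures, which may cause extraction to fail
  • Extracting content from large pages may take longer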
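
The last two notes suggest bounding how long a fetch may run and surfacing blocked requests clearly. The sketch below shows one way to do that with the standard `AbortController`. The name `fetchHtmlWithTimeout`, the 15-second default, and the injectable `fetchImpl` parameter are assumptions made for illustration.

```javascript
// Sketch: fetch a page's HTML with a time limit and an explicit status check.
// `fetchImpl` defaults to the global fetch (Node.js 18+) and can be replaced
// for testing; the 15 s default is an arbitrary choice.
async function fetchHtmlWithTimeout(url, timeoutMs = 15000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    if (!response.ok) {
      // Sites with anti-scraping measures often answer 403 or 429.
      throw new Error(`Request failed with HTTP ${response.status}`);
    }
    return await response.text();
  } finally {
    clearTimeout(timer); // always release the timer, on success or failure
  }
}
```

The returned HTML string can then be passed to whichever extraction step you use. A timeout on the fetch does not limit the time spent parsing a very large page, which would need a separate safeguard.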
