content-extraction

OpenClaw-native executable content extraction skill for WeChat, Feishu, YouTube, and generic web URLs.

Install

openclaw skills install content-extraction

Content Extraction — Executable Skill

This is the local, executable version of the skill. It keeps the source-aware routing design of the abstracted ClawHub package and restores a concrete extraction workflow.

What it does

Detects the input source
Selects the best extraction channel
Produces clean Markdown
Saves long content locally when needed
Explains fallback failures instead of hiding them

Main entrypoints

scripts/extract_router.py — classify input and build a route plan
scripts/extract.py — generate an executable extraction spec

Route priorities

WeChat → browser chain
Feishu doc/wiki → Feishu tools
YouTube → transcript chain
Generic URL → r.jina.ai → defuddle.md → web_fetch → browser fallback
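
The priorities above can be read as a host-based lookup with an ordered fallback chain for everything else. The following is a minimal sketch of that idea, not the code in scripts/extract_router.py. The function names, the host patterns, and the channel labels (browser, feishu_tools, transcript) are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Ordered fallback chain for generic URLs, taken from the route priorities above.
GENERIC_CHAIN = ["r.jina.ai", "defuddle.md", "web_fetch", "browser"]

# Assumed host patterns; the real router may match sources differently.
SOURCE_HOSTS = {
    "wechat": ("mp.weixin.qq.com",),
    "feishu": ("feishu.cn", "larksuite.com"),
    "youtube": ("youtube.com", "youtu.be"),
}

# Hypothetical channel labels standing in for the chains listed above.
SOURCE_CHANNELS = {
    "wechat": ["browser"],
    "feishu": ["feishu_tools"],
    "youtube": ["transcript"],
}


def host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def plan_route(url: str) -> dict:
    """Classify a URL by host and return the ordered channels to try."""
    host = (urlparse(url).hostname or "").lower()
    for source, domains in SOURCE_HOSTS.items():
        if any(host_matches(host, d) for d in domains):
            return {"source": source, "url": url,
                    "channels": list(SOURCE_CHANNELS[source])}
    # Anything unrecognized falls through to the generic chain.
    return {"source": "generic", "url": url, "channels": list(GENERIC_CHAIN)}
```

Under these assumptions, plan_route("https://example.com/post") yields the four-step generic chain in order, while a youtu.be link yields only the transcript channel.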

Output contract

Always return:

title
author when available
source
url
summary
Markdown body
save path when content is long
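
One way to make this contract checkable is a small record type plus a validator. This is a sketch under stated assumptions: the class name, the field names, and the 20,000-character cutoff for "long" content are hypothetical, since the source does not specify them.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed cutoff for "long" content; the source does not state a number.
LONG_CONTENT_CHARS = 20_000


@dataclass
class ExtractionResult:
    title: str
    source: str
    url: str
    summary: str
    markdown: str
    author: Optional[str] = None      # set only when available
    save_path: Optional[str] = None   # set only when content is long


def contract_problems(result: ExtractionResult) -> List[str]:
    """Return contract violations; an empty list means the result is complete."""
    problems = []
    for name in ("title", "source", "url", "summary", "markdown"):
        if not getattr(result, name).strip():
            problems.append(f"missing {name}")
    if len(result.markdown) > LONG_CONTENT_CHARS and not result.save_path:
        problems.append("long content without save_path")
    return problems
```

A caller could refuse to report success whenever contract_problems returns a non-empty list.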

Fallback rule

Never claim success when extraction is partial. If a layer fails, report:

where it failed
why it failed
what fallback was tried next
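
The rule above can be sketched as a loop that tries each layer in order and records every failure instead of swallowing it. The function name and the report keys (where, why, next) are assumptions chosen to mirror the three items above. Detecting partial content inside a layer that returns normally is left out of this sketch.

```python
from typing import Callable, List, Tuple


def run_with_fallback(layers: List[Tuple[str, Callable[[], str]]]) -> dict:
    """Try each (name, fetch) layer in order; keep a record of every failure."""
    failures = []
    for index, (name, fetch) in enumerate(layers):
        try:
            content = fetch()
        except Exception as exc:
            # Name the next layer so the report says what was tried after this one.
            has_next = index + 1 < len(layers)
            failures.append({
                "where": name,
                "why": str(exc),
                "next": layers[index + 1][0] if has_next else None,
            })
            continue
        # Earlier failures are returned alongside the successful content.
        return {"ok": True, "layer": name, "content": content, "failures": failures}
    # Every layer failed: report that plainly rather than claiming success.
    return {"ok": False, "layer": None, "content": None, "failures": failures}
```

Returning the failure list even on success keeps the fallback path visible to whoever reads the result.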

Notes

The abstracted package on ClawHub remains abstract.
This local version restores the executable workflow for use in OpenClaw and for publishing to ClawDex.