Install
openclaw skills install clip-to-your-vaultUniversal web clipper for Obsidian Vault. Saves content from X/Twitter, WeChat, Douyin, Xiaohongshu, GitHub, and generic web pages. Triggers when user sends a URL and asks to save/clip/收藏, or says '存下来'.
openclaw skills install clip-to-your-vaultUniversal clipper — saves URLs from any platform to your Obsidian Vault with local media, tags, and wikilinks.
On first run, read config.yml from the same directory as this SKILL.md file. If missing, tell the user to copy config.yml.example to config.yml and fill in vault.base_path.
Key paths derived from config:
BASE = config.vault.base_path
ATTACHMENTS = BASE / config.vault.attachments_dir
X_DIR = BASE / config.vault.dirs.x
WECHAT_DIR = BASE / config.vault.dirs.wechat
XHS_DIR = BASE / config.vault.dirs.xiaohongshu
DOUYIN_DIR = BASE / config.vault.dirs.douyin
GITHUB_DIR = BASE / config.vault.dirs.github
WEB_DIR = BASE / config.vault.dirs.web
Match the URL and dispatch to the correct handler:
| URL pattern | Handler |
|---|---|
x.com/* or twitter.com/* | X Handler |
mp.weixin.qq.com/* | WeChat Handler |
xhslink.com/* or xiaohongshu.com/* | Xiaohongshu Handler |
v.douyin.com/* or douyin.com/video/* | Douyin Handler |
github.com/{owner}/{repo} (no deeper path) | GitHub Handler |
| Everything else | Web Handler |
These apply to ALL handlers:
/\:*?"<>| and all emoji / special Unicodeclipping tag. or spaces in tags (llms.txt → llms-txt, Claude Code → Claude-Code)ATTACHMENTS/{title-slug}-{n}.{ext} (slug ≤20 chars, no emoji){title-slug}.mp4 or {title-slug}-video-{n}.mp4![[filename.ext]]github.com/{owner}/{repo} link, auto-trigger the GitHub Handler to create a GitHub note, then add bidirectional wikilinksSaves X (Twitter) posts and articles.
curl -s "https://api.fxtwitter.com/{handle}/status/{tweet_id}"
Extract from URL: x.com/{handle}/status/{id} → handle, tweet_id.
Fields:
tweet.author.name → author nametweet.author.screen_name → @handletweet.text → post body (short posts)tweet.article → long-form article (if present)tweet.article.title → article titletweet.article.content.blocks → structured content (Draft.js format)tweet.created_at → publish datetweet.likes / tweet.retweets / tweet.views → engagementtweet.media → short post imagestweet.article.media_entities → article inline imagesShort post (tweet.article is null): use tweet.text, images from tweet.media.photos.
Article (tweet.article exists): parse tweet.article.content.blocks (Draft.js):
| block type | Markdown |
|---|---|
unstyled | paragraph |
header-one | # heading |
header-two | ## heading |
header-three | ### heading |
unordered-list-item | - item |
ordered-list-item | 1. item |
atomic | insert image ![[filename]] |
blockquote | > quote |
Inline styles: Bold → **text**, Italic → *text*.
For atomic blocks: entityMap → value.data.mediaItems[].mediaId → match media_entities → media_info.original_img_url.
Download all images to ATTACHMENTS/.
Write to X_DIR/{title}.md:
Short post:
---
title: "{author name}的推文 - {first 30 chars}"
author: "{author name}"
handle: "@{screen_name}"
source: {original URL}
date: {publish date YYYY-MM-DD}
tags:
- clipping
- {auto tags}
---
# {author name}的推文
> X: @{handle} ({name}) | {date} | {likes} likes · {retweets} retweets · {views} views
{body text}
{images ![[filename]]}
Article:
---
title: "{article.title}"
author: "{author name}"
handle: "@{screen_name}"
source: {original URL}
date: {publish date YYYY-MM-DD}
tags:
- clipping
- {auto tags}
---
# {article.title}
> X: @{handle} ({name}) | {date} | {likes} likes · {retweets} retweets · {views} views
{parsed Markdown body with ![[local images]]}
vxtwitter.com as fallbackarticle.title; short posts use {name}-{first 15 chars of text}Saves WeChat Official Account (公众号) articles.
curl -s "https://down.mptext.top/api/public/v1/download?url={URL-encoded-link}&format=json"
Extract: title, nick_name, create_time, content_noencode, link.
If API returns 204 or fails, fall back to defuddle or WebFetch.
If content has images or complex formatting:
curl -s "https://down.mptext.top/api/public/v1/download?url={URL-encoded-link}&format=html"
Extract image URLs and formatted content from HTML.
ATTACHMENTS/mmbiz.qpic.cn/mmbiz_png/... → .png, mmbiz_jpg/... → .jpg![[filename.jpg]]Write to WECHAT_DIR/{title}.md:
---
title: "{article title}"
author: "{公众号 name}"
source: {original link}
date: {publish date YYYY-MM-DD}
tags:
- clipping
- {auto tags}
---
# {article title}
> 公众号:{name} | {publish date}
{body with ![[local images]]}
Saves 小红书 notes (images + video).
If URL is xhslink.com, follow redirects to get the real URL with full query parameters (especially xsec_token):
curl -sL "<short-link>" -o /dev/null -w "%{url_effective}"
Important: The full query string is required. Bare URLs return empty
noteDetailMap.
Xiaohongshu is an SPA, but SSR embeds data in window.__INITIAL_STATE__:
curl -sL "<full-URL-with-params>" \
-H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36" \
-H "Accept: text/html" | python3 -c "
import sys, re, json
html = sys.stdin.read()
m = re.search(r'window\.__INITIAL_STATE__\s*=\s*(\{.*?\})\s*</script>', html, re.DOTALL)
if not m:
print('NO_DATA')
sys.exit(1)
raw = m.group(1).replace(':undefined', ':null')
d = json.loads(raw)
note_map = d.get('note', {}).get('noteDetailMap', {})
for nid, detail in note_map.items():
n = detail.get('note', {})
result = {
'title': n.get('title', ''),
'desc': n.get('desc', ''),
'type': n.get('type', ''),
'author': n.get('user', {}).get('nickname', ''),
'time': n.get('time', ''),
'images': [],
'video_url': None,
'tags': [t.get('name', '') for t in n.get('tagList', [])]
}
for img in n.get('imageList', []):
url = img.get('urlDefault', '') or img.get('urlPre', '') or img.get('url', '')
if url: result['images'].append(url)
video = n.get('video', {})
if video:
media = video.get('media', {})
stream = media.get('stream', {})
for codec in ['h264', 'h265', 'h266', 'av1']:
streams = stream.get(codec, [])
if streams:
result['video_url'] = streams[0].get('masterUrl', '')
break
print(json.dumps(result, ensure_ascii=False))
"
curl -L -o each image to ATTACHMENTS/curl -L -o to ATTACHMENTS/{title-slug}-video.mp4Write to XHS_DIR/{title}.md:
---
title: "{title}"
author: "{author}"
source: {real URL}
domain: xiaohongshu.com
date: {today YYYY-MM-DD}
tags:
- clipping
- {tags from tagList, 2-4}
---
# {title}
> 来源:小红书 | {author} | {publish date}
![[cover-image.jpg]]
![[video.mp4]] (if video exists)
{desc body, strip [话题]# markers}
## 标签
{tags joined with /}
--proxy {config.xiaohongshu.proxy}desc often contains [话题]#tag[话题]# markers — strip them for clean textSaves 抖音 videos. Requires douyin-downloader tool — check config.douyin.enabled first. If disabled, tell the user to install douyin-downloader and enable it in config.
From share text like 7.94 复制打开抖音...https://v.douyin.com/xxxxx/ DUL:/..., extract the v.douyin.com short link.
Resolve to full video ID:
curl -sI -L --max-redirs 5 "https://v.douyin.com/xxxxx/" 2>&1 | grep -i location | grep '/video/' | sed 's|.*/video/\([0-9]*\).*|\1|' | head -1
Construct: https://www.douyin.com/video/{aweme_id}
{config.douyin.tool_path}/config.yml: set link to the full URLcd {config.douyin.tool_path} && {config.douyin.python} run.py
Downloaded/ — .mp4, _cover.jpg, _data.jsonFrom _data.json:
desc → description (title + hashtags)author.nickname → authorstatistics.digg_count → likesstatistics.comment_count → commentsstatistics.share_count → sharescreate_time → Unix timestamp → convert to YYYY-MM-DDCopy video and cover to ATTACHMENTS/.
Write to DOUYIN_DIR/{title}.md:
---
title: "{video title}"
source: https://www.douyin.com/video/{aweme_id}
author: "{author}"
platform: 抖音
digg: {likes}
comment: {comments}
share: {shares}
created: {publish date YYYY-MM-DD}
date: {today YYYY-MM-DD}
tags:
- clipping
- 抖音
- {auto tags}
---
# {video title}
> {original desc with hashtags}
## 为什么收藏
{inferred or asked}
## 视频
![[{video-file}.mp4]]
## 封面
![[{cover-file}_cover.jpg]]
cd {config.douyin.tool_path} && {config.douyin.python} -m tools.cookie_fetcher --config config.yml
create_time is Unix timestamp: python3 -c "import datetime; print(datetime.datetime.fromtimestamp({ts}).strftime('%Y-%m-%d'))"Saves GitHub repositories.
curl -s "https://api.github.com/repos/{owner}/{repo}"
Extract: name, full_name, description, stargazers_count, language, license.spdx_id, topics, created_at, html_url, homepage.
curl -s "https://api.github.com/repos/{owner}/{repo}/readme"
Decode base64 content to get README markdown.
Do NOT copy the full README. Extract:
Write to GITHUB_DIR/{repo-name}.md:
---
title: "{repo name}"
source: {repo URL}
author: "{owner}"
stars: {star count}
language: "{primary language}"
license: "{license}"
created: {repo created date YYYY-MM-DD}
date: {today YYYY-MM-DD}
tags:
- clipping
- github
- {auto tags from topics}
---
# {repo name}
> {one-line description}
## 为什么收藏
{inferred or asked}
## 核心功能
- {point 1}
- {point 2}
- ...
## 快速开始
{minimal code block}
## 备注
{limitations, dependencies, notes}
1.2k formatting — for Dataview sorting)name not full_name for filenameSaves generic web pages. Fallback for URLs that don't match any platform handler.
defuddle parse <url> --json
Extract: title, author, content (HTML), description, published, domain.
If defuddle is not installed, run npm install -g defuddle-cli first.
If content is empty or just <body></body> (client-rendered SPA), try fallback paths in order:
Path A: CDP browser (if config.web.cdp_enabled):
curl -s "{config.web.cdp_url}/new?url=<URL>" → get targetIdcurl -s "{cdp_url}/scroll?target=<ID>&direction=bottom"curl -s -X POST "{cdp_url}/eval?target=<ID>" -d 'document.title'curl -s -X POST "{cdp_url}/eval?target=<ID>" -d "document.querySelector('article')?.innerHTML || document.querySelector('main')?.innerHTML || document.body.innerHTML"curl -s -X POST "{cdp_url}/eval?target=<ID>" -d "[...document.querySelectorAll('img')].map(i=>i.src).filter(s=>s.startsWith('http'))"curl -s -X POST "{cdp_url}/eval?target=<ID>" -d "[...document.querySelectorAll('video source, video')].map(v=>v.src||v.querySelector('source')?.src).filter(Boolean)"curl -s "{cdp_url}/close?target=<ID>"Path B: WebFetch — use the WebFetch tool as fallback.
Path C: Bookmark mode — if URL is a web app/tool/dashboard or all paths fail, create a bookmark note from meta info only.
Download images and videos to ATTACHMENTS/, replace with ![[filename]].
Write to WEB_DIR/{title}.md:
Content page:
---
title: "{page title}"
author: "{author}"
source: {URL}
domain: "{domain}"
date: {today YYYY-MM-DD}
tags:
- clipping
- {auto tags}
---
# {page title}
> 来源:{domain} | {author} | {publish date}
{Markdown body with ![[local media]]}
Bookmark page (tool/app/SPA):
---
title: "{page title}"
author: "{author}"
source: {URL}
domain: "{domain}"
date: {today YYYY-MM-DD}
tags:
- clipping
- {auto tags}
---
# {page title}
> 来源:{domain} | {URL}
{description}
## 为什么收藏
{inferred or asked}
## 核心功能
- {points from meta info}