Install
openclaw skills install wechat-article-extractor
Extract the full text and figures from a WeChat public account (微信公众号) article URL and save them as a clean Markdown file. The skill works around WeChat's bot detection by finding mirror sites automatically. Use it when the user shares an mp.weixin.qq.com URL and asks to save, archive, extract, or read the article.
WeChat blocks headless browsers with a 环境异常 ("abnormal environment") CAPTCHA, and web_fetch often gets back an empty JS-rendered page, so the reliable approach is to find a mirror on an aggregator site and extract the content from there.
This skill handles:
.md files
This skill does NOT handle:
| Input | Required | Description |
|---|---|---|
| WeChat URL | Yes | An mp.weixin.qq.com link |
| Output filename | No | Defaults to kebab-case of article title |
| Save location | No | Defaults to /tmp/ |
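The two defaults in the table can be sketched as a small helper. This is only an illustration: the function name `default_output_path`, the Unicode-aware slug rule, and the `wechat-article` fallback for titles with no usable characters are assumptions, not part of the skill.

```python
import re
import unicodedata
from pathlib import Path


def default_output_path(title: str, save_dir: str = "/tmp/") -> Path:
    """Hypothetical helper: kebab-case the title and place it in save_dir."""
    # Fold full-width letters and digits to their ASCII forms, then lowercase.
    normalized = unicodedata.normalize("NFKC", title).lower()
    # Replace every run of punctuation, whitespace, or underscores with one hyphen.
    # \W is Unicode-aware in Python 3, so Chinese characters are kept.
    slug = re.sub(r"[\W_]+", "-", normalized).strip("-")
    # Fallback name is an assumption for titles that slug to nothing.
    return Path(save_dir) / f"{slug or 'wechat-article'}.md"
```

For example, `default_output_path("Hello, World: AI Agents").name` gives `hello-world-ai-agents.md`.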
web_fetch(url, extractMode="markdown", maxChars=50000)
Success check: If result rawLength > 500 AND content has real paragraphs (not just nav/footer text) → skip to Step 4 Option B.
Failure indicators: rawLength < 500, content is navigation/boilerplate only, or contains "环境异常" → go to Step 2.
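The success and failure checks above could be expressed roughly as follows. The `rawLength` threshold and the 环境异常 marker come from the text; the definition of a "real paragraph" (at least two blocks of 40 or more characters) is an assumed heuristic, and a length of exactly 500 is treated as a failure here.

```python
import re

BLOCK_MARKER = "环境异常"  # text shown on WeChat's bot-detection page
MIN_RAW_LENGTH = 500


def direct_fetch_succeeded(raw_length: int, content: str,
                           min_paragraphs: int = 2,
                           min_paragraph_chars: int = 40) -> bool:
    """Return True if a direct fetch looks like real article content."""
    # Too short, or the CAPTCHA page was served: go search for a mirror.
    if raw_length <= MIN_RAW_LENGTH or BLOCK_MARKER in content:
        return False
    # Split on blank lines and keep only blocks long enough to be prose
    # (assumed heuristic; nav and footer text tends to be short fragments).
    blocks = [b.strip() for b in re.split(r"\n\s*\n", content)]
    real = [b for b in blocks if len(b) >= min_paragraph_chars]
    return len(real) >= min_paragraphs
```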
From the URL or any partial content, identify:
- Article title (from <title> or og:title)
If metadata is unavailable from the URL, ask the user for the article title.
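When some HTML is available, the title lookup might look like the sketch below, using only the standard library. Preferring `og:title` over `<title>` is an assumption, as are the names `_TitleParser` and `find_title`; a `None` result corresponds to the case where the user has to be asked.

```python
from html.parser import HTMLParser
from typing import Optional


class _TitleParser(HTMLParser):
    """Collects the <title> text and the og:title meta tag, if present."""

    def __init__(self) -> None:
        super().__init__()
        self.og_title = None
        self.title_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if attr.get("property") == "og:title" and attr.get("content"):
                self.og_title = attr["content"].strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def find_title(html: str) -> Optional[str]:
    """Return og:title if set, else the <title> text, else None."""
    parser = _TitleParser()
    parser.feed(html)
    title = "".join(parser.title_parts).strip()
    return parser.og_title or title or None
```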
web_search("<article title> <author/account name>")
Mirror site priority (ranked by content quality and reliability):
If title is unknown, try: web_search("site:53ai.com <keywords from URL path>")
If no mirrors are found, try the Chrome Extension Relay fallback (see Fallback section).
Option A — Mirror found:
curl -s -L "<mirror_url>" -o /tmp/wechat-article.html
Verify file size > 10KB (smaller usually means redirect/error page).
Run the extraction script:
python3 <skill_dir>/scripts/extract_wechat.py /tmp/wechat-article.html /tmp/<output-filename>.md
Replace <skill_dir> with the directory containing this SKILL.md.
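The file-size check in Option A can be done before running the script. A minimal sketch, assuming "10KB" means 10 × 1024 bytes and using a hypothetical helper name:

```python
import os

MIN_MIRROR_BYTES = 10 * 1024  # "10KB" read as 10 KiB; an assumption


def mirror_download_looks_valid(path: str, min_bytes: int = MIN_MIRROR_BYTES) -> bool:
    """Return True if the downloaded mirror page is larger than min_bytes."""
    try:
        return os.path.getsize(path) > min_bytes
    except OSError:
        # A missing or unreadable file counts as a failed download.
        return False
```

A `False` result suggests a redirect or error page, so the next mirror is worth trying before running the extraction script.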
Option B — Direct fetch succeeded (Step 1): Format the fetched markdown with the header template below.
Check the output file:
If output looks truncated or garbled, try a different mirror site (return to Step 3).
Report:
- <path>
- <title>
- <char count> characters
- <N> images
If the user wants it saved to a specific location (e.g., Obsidian), follow their instructions for the final copy.
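The character and image counts for the report can be derived from the extracted Markdown itself. The sketch below assumes images appear in standard `![alt](url)` syntax and uses a hypothetical `build_report` helper; the one-item-per-line layout is also an assumption.

```python
import re

# Matches Markdown image syntax: ![alt text](url)
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def build_report(path: str, title: str, markdown_text: str) -> str:
    """Return a four-line report: path, title, character count, image count."""
    image_count = len(IMAGE_PATTERN.findall(markdown_text))
    return "\n".join([
        path,
        title,
        f"{len(markdown_text)} characters",
        f"{image_count} images",
    ])
```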
Every extracted article must include this header:
# <title>
**作者:** <author>
**来源:** 微信公众号「<account_name>」
**日期:** <date>
**原文:** <original_wechat_url>
---
> **摘要:** <1-2 sentence summary generated from content>
---
Fields that cannot be determined should be omitted (don't write "Unknown").
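One way to apply the omit-unknown-fields rule is to build the header line by line. This is a sketch, not the skill's own code: `render_header` is a hypothetical name, the title and original URL are assumed to be always known, and header lines are separated by blank lines (an assumption about the intended layout) so that `---` renders as a rule instead of turning the preceding line into a heading.

```python
from typing import Optional


def render_header(title: str, original_url: str,
                  author: Optional[str] = None,
                  account_name: Optional[str] = None,
                  date: Optional[str] = None,
                  summary: Optional[str] = None) -> str:
    """Render the article header, skipping any field that is unknown."""
    lines = [f"# {title}"]
    if author:
        lines.append(f"**作者:** {author}")
    if account_name:
        lines.append(f"**来源:** 微信公众号「{account_name}」")
    if date:
        lines.append(f"**日期:** {date}")
    lines.append(f"**原文:** {original_url}")
    lines.append("---")
    if summary:
        lines.append(f"> **摘要:** {summary}")
        lines.append("---")
    # Blank lines between entries are an assumed layout choice.
    return "\n\n".join(lines)
```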
If no mirror exists (very new or niche article):
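Tell the user (in Chinese if they wrote in Chinese):
"没有找到镜像。请在 Chrome 中打开这篇文章,然后点击 OpenClaw Browser Relay 扩展图标(badge 亮起),我就能直接读取内容。"
In English: "No mirror was found. Please open the article in Chrome and click the OpenClaw Browser Relay extension icon (the badge lights up) so that I can read the content directly."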
Then use:
browser(action="snapshot", profile="chrome")
Extract content from the snapshot and format with the header template.
| Problem | Detection | Action |
|---|---|---|
| WeChat blocks access | rawLength < 500 or "环境异常" | Search for mirrors (Step 3) |
| No mirrors found | Search returns 0 relevant results | Try Chrome Relay fallback |
| Mirror content truncated | Output < 1000 chars when original is long | Try next mirror site |
| Script extraction fails | Python error or empty output | Fall back to web_fetch on mirror URL |
| Images broken | Image URLs return 404 | Note in output; images may expire |
· · · section dividers are WeChat style; preserve them
web_fetch may truncate
No persistent configuration required. The skill uses standard OpenClaw tools (web_fetch, web_search, exec) and optionally browser for the Chrome Relay fallback.
Required tools:
| Tool | Purpose |
|---|---|
| web_fetch | Direct article fetch attempt |
| web_search | Mirror site discovery |
| exec | Run curl and Python extraction script |
Optional tools:
| Tool | Purpose |
|---|---|
| browser | Chrome Extension Relay fallback |
System dependencies:
| Dependency | Purpose |
|---|---|
| Python 3.8+ | Extraction script |
| curl | Mirror page download |