Install
openclaw skills install @lutongsuo/web-to-wechat
Scrape any web page, let the AI tidy up its formatting, automatically generate a cover image and illustrations, and publish the result to the WeChat Official Account draft box. Supports WeChat Official Account articles, news sites, technical blogs, Zhihu, CSDN, and more. Use when the user wants to scrape a web page and publish it to WeChat.
Accept requests like:
Return a published draft in the WeChat draft box, not a proposal.
python -m pip install requests beautifulsoup4 html2text markdown Pillow
This skill depends on two companion skills from ClawHub:
clawhub install anything-to-wechat
clawhub install file-to-wechat
- anything-to-wechat: publish_to_wechat.py (WeChat API publishing)
- file-to-wechat: md_to_wechat_html.py (Markdown → WeChat HTML)

You need a WeChat Official Account (服务号, a Service Account, or 订阅号, a Subscription Account, with API access).
Get your credentials:
Set environment variables (recommended):
On macOS / Linux:
export WECHAT_APP_ID="your_appid_here"
export WECHAT_APP_SECRET="your_appsecret_here"
On Windows (PowerShell):
[Environment]::SetEnvironmentVariable("WECHAT_APP_ID", "your_appid_here", "User")
[Environment]::SetEnvironmentVariable("WECHAT_APP_SECRET", "your_appsecret_here", "User")
No environment variables? The publish script will prompt you interactively on first run.
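The lookup order described above (environment variables first, an interactive prompt as the fallback) could be sketched roughly as follows. The helper name `load_credentials` and its `env` and `ask` parameters are hypothetical; how `publish_to_wechat.py` actually resolves credentials is not shown in this document.

```python
import os
from getpass import getpass


def load_credentials(env=None, ask=getpass):
    """Return (app_id, app_secret), prompting only for values that are missing.

    Hypothetical helper: `env` and `ask` are injectable so the lookup can be
    exercised without touching the real environment or a terminal.
    """
    env = os.environ if env is None else env
    values = []
    for name in ("WECHAT_APP_ID", "WECHAT_APP_SECRET"):
        value = (env.get(name) or "").strip()
        if not value:
            # Fall back to an interactive prompt when the variable is unset or blank.
            value = ask(f"{name}: ").strip()
        if not value:
            raise ValueError(f"{name} is required")
        values.append(value)
    return tuple(values)
```

Calling `load_credentials()` with no arguments reads `os.environ` and prompts without echoing the typed value.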
| Dependency | Required | Purpose |
|---|---|---|
| requests (pip) | Yes | HTTP fetching |
| beautifulsoup4 (pip) | Yes | HTML parsing |
| html2text (pip) | Yes | HTML → Markdown conversion |
| markdown (pip) | Yes | Markdown → HTML (via md_to_wechat_html.py) |
| Pillow (pip) | Yes | Image compression |
| anything-to-wechat skill | Yes | WeChat API publishing |
| file-to-wechat skill | Yes | Markdown → WeChat HTML conversion |
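Before running the workflow, it may help to confirm that the Python packages in the table are importable. The sketch below is one way to do that with the standard library; `missing_packages` is a hypothetical helper and is not part of the skill. Note that `beautifulsoup4` and `Pillow` are imported as `bs4` and `PIL`.

```python
import importlib.util

# Import names differ from the pip package names for two of the dependencies.
PIP_TO_MODULE = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "html2text": "html2text",
    "markdown": "markdown",
    "Pillow": "PIL",
}


def missing_packages(pip_to_module=PIP_TO_MODULE):
    """Return the pip package names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in pip_to_module.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
        print("Install with: python -m pip install " + " ".join(missing))
    else:
        print("All Python dependencies are importable.")
```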
If the user has NOT provided a URL, ask using AskUserQuestion:
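Question: "请提供你想抓取并发布到微信公众号的网页链接" ("Please provide the link to the web page you want to scrape and publish to your WeChat Official Account")
Options:
- "粘贴 URL" ("Paste a URL")
- "搜索关键词后选择文章" ("Search by keyword, then choose an article")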
If the user already provided a URL, skip and proceed.
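Deciding whether the user "already provided a URL" can be approximated with a small check like the one below. This is only a sketch: `looks_like_url` is a hypothetical helper, and accepting only absolute `http` and `https` links is an assumption rather than a documented rule of the skill.

```python
from urllib.parse import urlparse


def looks_like_url(text):
    """Return True when `text` is an absolute http(s) URL with a host."""
    parts = urlparse(text.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Input that fails this check (for example a bare keyword) would lead to the question above instead of straight to scraping.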
IMPORTANT: Always use UTF-8 encoding on Windows.
Primary: Use scrape_web.py script (structured extraction)
python "<skill_dir>/scripts/scrape_web.py" \
--url "<url>" \
--output "<workspace>/raw_article.md" \
--json
The --json flag outputs structured data including title, author, date, cover URL, and Markdown content.
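To illustrate how the structured output might be consumed, here is a minimal parsing sketch. The key names (`title`, `author`, `date`, `cover_url`, `content`) are assumptions derived from the description above, not a confirmed schema; check the real output of `scrape_web.py` before relying on them.

```python
import json
from dataclasses import dataclass


@dataclass
class ScrapedArticle:
    title: str
    author: str
    date: str
    cover_url: str
    content: str  # Markdown body


def parse_scrape_output(json_text):
    """Parse JSON text into a ScrapedArticle, using assumed key names."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    def field(name):
        # Missing or null fields become empty strings instead of raising.
        value = data.get(name)
        return "" if value is None else str(value).strip()

    return ScrapedArticle(
        title=field("title"),
        author=field("author"),
        date=field("date"),
        cover_url=field("cover_url"),
        content=field("content"),
    )
```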
Fallback: Use WebFetch tool (for JS-rendered pages)
If scrape_web.py fails (e.g., the page requires JavaScript rendering), use the built-in WebFetch tool:
WebFetch(url="<url>", prompt="Extract the full article content including: title, author, publish date, and the complete article text. Return in a structured format.")
Then manually compose the Markdown from the WebFetch output.
After scraping: Read the output. Inspect the content structure and quality. Proceed to Phase 3.
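One quality signal can be checked mechanically: the error-handling table in this document treats content shorter than 200 characters as a hint that the page is JS-rendered. The sketch below applies that threshold; `needs_fallback` is a hypothetical helper, and ignoring image references and whitespace when counting is an assumption.

```python
import re


def needs_fallback(markdown_text, min_chars=200):
    """Return True when the scraped Markdown looks too short to be a real article."""
    # Drop image references and all whitespace so layout noise is not counted as content.
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", "", markdown_text)
    text = re.sub(r"\s+", "", text)
    return len(text) < min_chars
```

When this returns True, retrying with the WebFetch fallback is the natural next step.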
This is the key quality step. The agent should:
Style options (ask user or auto-detect):
| Style | When to use |
|---|---|
| 忠实转载 (faithful reprint) | User wants exact copy, just clean formatting |
| 精华摘要 (key highlights) | Long article → condensed version with key points |
| 深度改写 (deep rewrite) | Rewrite in user's own voice/style |
Save the reformatted Markdown:
Write the reformatted content to <workspace>/article.md using the Write tool.
Use the ImageGen tool with a prompt derived from the article's topic and content.
- Save as wechat_cover.png in the workspace.
- Size: 1024x768 (WeChat cover ratio 4:3).

WeChat requires cover images (thumb_media_id) to be under 64KB.
python "<skill_dir>/scripts/compress_image.py" \
--input "<workspace>/wechat_cover.png" \
--output "<workspace>/wechat_cover_compressed.jpg" \
--max-size 64
If the original cover is already under 64KB (rare for PNG), compression is not strictly needed, but run the step anyway: it will not enlarge files.
Fallback: If compress_image.py fails to reach 64KB, use ImageGen to regenerate a simpler cover image (fewer details, simpler composition) and try again.
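The internals of `compress_image.py` are not shown here, but the general idea of fitting an image under a size limit can be sketched as a search over JPEG quality levels. Everything below is illustrative: `compress_to_limit` and `jpeg_encoder` are hypothetical names, the quality steps are arbitrary, and treating 64KB as 64 × 1024 bytes is an assumption.

```python
import io


def compress_to_limit(encode, max_bytes=64 * 1024, qualities=range(90, 9, -10)):
    """Return (quality, data) for the highest quality whose output fits, else None.

    `encode(quality)` must return the encoded image as bytes.
    """
    for quality in qualities:
        data = encode(quality)
        if len(data) < max_bytes:
            return quality, data
    return None


def jpeg_encoder(path):
    """Build an `encode(quality)` callable for the image at `path` using Pillow."""
    from PIL import Image  # imported lazily so the search logic runs without Pillow

    image = Image.open(path).convert("RGB")  # JPEG has no alpha channel

    def encode(quality):
        buffer = io.BytesIO()
        image.save(buffer, format="JPEG", quality=quality, optimize=True)
        return buffer.getvalue()

    return encode
```

With Pillow installed, `compress_to_limit(jpeg_encoder("wechat_cover.png"))` returns `None` when no quality level fits, which corresponds to the case where regenerating a simpler cover is the remaining option.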
Use md_to_wechat_html.py from the file-to-wechat skill:
python "<file-to-wechat_skill_dir>/scripts/md_to_wechat_html.py" \
--input "<workspace>/article.md" \
--output "<workspace>/wechat_article.html" \
--title "<article_title>"
This generates WeChat-compatible inline-style HTML with Clockless design tokens.
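"Inline-style HTML" means each element carries its own `style` attribute rather than depending on a stylesheet. The toy sketch below shows the idea with the standard library only. It is not how `md_to_wechat_html.py` works, and the style values are placeholders rather than the real design tokens.

```python
import re

# Placeholder styles for illustration only; these are not the real design tokens.
EXAMPLE_STYLES = {
    "h2": "font-size:20px;font-weight:bold;margin:24px 0 12px;",
    "p": "font-size:16px;line-height:1.8;margin:0 0 16px;",
}


def inline_styles(html, styles=EXAMPLE_STYLES):
    """Add a style attribute to bare opening tags such as <p> or <h2>."""

    def add_style(match):
        tag = match.group(1)
        style = styles.get(tag.lower())
        if style is None:
            return match.group(0)
        return f'<{tag} style="{style}">'

    # Only matches opening tags with no attributes, so existing attributes are left untouched.
    return re.sub(r"<([a-zA-Z][a-zA-Z0-9]*)>", add_style, html)
```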
On Windows, use Python subprocess to pass environment variables:
python -c "
import os, subprocess, sys
# Make the credentials visible to the child process.
os.environ['WECHAT_APP_ID'] = '<app_id>'
os.environ['WECHAT_APP_SECRET'] = '<app_secret>'
# Run the publish script with the same interpreter and decode its output as UTF-8.
result = subprocess.run([
    sys.executable,
    r'<anything-to-wechat_skill_dir>/scripts/publish_to_wechat.py',
    '--file', r'<workspace>/wechat_article.html',
    '--title', '<article_title>',
    '--cover', r'<workspace>/wechat_cover_compressed.jpg',
    '--digest', '<article_summary_under_120_chars>',
    '--source-url', '<original_url>'
], capture_output=True, text=True, encoding='utf-8')
print(result.stdout)
print(result.stderr)
"
Or with environment variables already set:
python "<anything-to-wechat_skill_dir>/scripts/publish_to_wechat.py" \
--file "<workspace>/wechat_article.html" \
--title "<article_title>" \
--cover "<workspace>/wechat_cover_compressed.jpg" \
--digest "<article_summary_under_120_chars>" \
--source-url "<original_url>"
Credentials: The script reads from WECHAT_APP_ID / WECHAT_APP_SECRET env vars. If not set, it prompts interactively.
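The `--digest` placeholder in the commands above calls for a summary under 120 characters. A minimal way to enforce that is sketched below. `make_digest` is a hypothetical helper, and it counts Python characters; whether WeChat measures the limit in characters or bytes is not stated in this document.

```python
import re


def make_digest(text, limit=120):
    """Collapse whitespace and trim `text` to fewer than `limit` characters."""
    summary = re.sub(r"\s+", " ", text).strip()
    if len(summary) < limit:
        return summary
    # Cut two short of the limit so that adding the ellipsis keeps the result under it.
    return summary[: limit - 2].rstrip() + "…"
```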
Report success with Media ID and link to https://mp.weixin.qq.com/.
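Tell the user: "文章已发送到你的微信公众号草稿箱,请登录微信公众平台审核后一键发布。" ("The article has been sent to your WeChat Official Account draft box. Log in to the WeChat Official Accounts Platform to review it, then publish it with one click.")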
Include:
| Site | Scraping Method | Notes |
|---|---|---|
| WeChat articles (mp.weixin.qq.com) | scrape_web.py | Anti-scraping: may need WebFetch fallback |
| Toutiao / 今日头条 | scrape_web.py | JS-heavy, may need WebFetch |
| Zhihu / 知乎 | scrape_web.py | Login wall for some content |
| CSDN | scrape_web.py | Works well |
| Juejin / 掘金 | scrape_web.py | Works well |
| Medium | scrape_web.py | Works well |
| News sites (generic) | scrape_web.py | Auto-detects article content |
| JS-rendered SPAs | WebFetch | Use browser rendering fallback |
| Error | Action |
|---|---|
| Page returns 403/404 | Try WebFetch; if blocked, inform user |
| Content too short (<200 chars) | Page may be JS-rendered, try WebFetch |
| Chinese characters garbled | scrape_web.py auto-detects encoding |
| Cover image > 64KB | Run compress_image.py; regenerate if needed |
| Images not loading in WeChat | publish_to_wechat.py auto-uploads to WeChat CDN |
| WeChat credentials missing | Script prompts interactively |
| IP not in whitelist | Show IP from error, guide user to mp.weixin.qq.com |
| WebFetch returns empty | Page has strong anti-scraping, inform user |
| Content copyrighted | Add disclaimer, keep source attribution |
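For the "IP not in whitelist" row, the action is to show the user the IP from the error. A small sketch of pulling an IPv4 address out of an error message follows. `extract_ip` is a hypothetical helper, and the exact wording of the WeChat API error is an assumption, so the pattern simply looks for the first IPv4-shaped token.

```python
import re

IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_ip(error_message):
    """Return the first IPv4-looking token in `error_message`, or None."""
    match = IPV4_PATTERN.search(error_message)
    return match.group(0) if match else None
```

If no address is found, showing the raw error message is a reasonable fallback.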
IMPORTANT: When republishing web content:
- Use --source-url when publishing to add a "Read More" link

| Script | Purpose |
|---|---|
| scripts/scrape_web.py | Web scraping → clean Markdown (supports 10+ site types) |
| scripts/compress_image.py | Image compression (target 64KB for WeChat cover) |
| file-to-wechat/scripts/md_to_wechat_html.py | Markdown → WeChat inline HTML |
| anything-to-wechat/scripts/publish_to_wechat.py | WeChat draft box publishing |
| Variable | Required | Description |
|---|---|---|
| WECHAT_APP_ID | Yes | WeChat Official Account AppID (or prompted interactively) |
| WECHAT_APP_SECRET | Yes | WeChat Official Account AppSecret (or prompted interactively) |