Install
openclaw skills install liuzln-openclaw-skills-wechat-article-fetcher
Fetch and save WeChat Official Account articles with full content and images. This Skill supports any WeChat Official Account article URL, downloads images automatically, exports JSON, and saves a full-page screenshot.
# Fetch a single article
python3 skills/wechat-article-fetcher/scripts/fetch.py \
https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw
# Specify the output directory
python3 skills/wechat-article-fetcher/scripts/fetch.py \
https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw \
--output ./my_articles
# Do not save images
python3 skills/wechat-article-fetcher/scripts/fetch.py \
https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw \
--no-images
# Option 1: activate the virtual environment, then run
source playwright-env/bin/activate
python3 skills/wechat-article-fetcher/scripts/fetch.py <url>
# Option 2: use the provided wrapper script
python3 skills/wechat-article-fetcher/scripts/run_in_venv.py \
<url> --venv /path/to/playwright-env
from wechat_article_fetcher import fetch_article
# Fetch the article
result = fetch_article("https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw")
print(f"Title: {result['title']}")
print(f"Author: {result['author']}")
print(f"Content length: {result['length']}")
print(f"Image count: {result['images_count']}")
python3 skills/wechat-article-fetcher/scripts/fetch.py [OPTIONS] URL
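Arguments:
  URL                   WeChat Official Account article URL (required)
Options:
  -o, --output PATH     Output directory (default: ./wechat_articles)
  --no-images           Do not save images
  --no-screenshot       Do not save the screenshot
  --headless BOOLEAN    Headless mode (default: true)
  --timeout INTEGER     Timeout in milliseconds (default: 60000)
  -h, --help            Show help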
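When calling fetch.py from another Python program, it can help to assemble the argument list in one place. The sketch below is a hypothetical helper, not part of the skill: `build_fetch_command` and its parameter names are assumptions, and it only uses flags listed in the options above. It leaves out `--headless` because the exact value syntax that option accepts is not shown here.

```python
from typing import List, Optional

# Script path as used in the examples in this document.
FETCH_SCRIPT = "skills/wechat-article-fetcher/scripts/fetch.py"


def build_fetch_command(
    url: str,
    output: Optional[str] = None,
    images: bool = True,
    screenshot: bool = True,
    timeout_ms: Optional[int] = None,
) -> List[str]:
    """Build an argv list for fetch.py (hypothetical helper)."""
    cmd = ["python3", FETCH_SCRIPT, url]
    if output is not None:
        cmd += ["--output", output]
    if not images:
        cmd.append("--no-images")
    if not screenshot:
        cmd.append("--no-screenshot")
    if timeout_ms is not None:
        cmd += ["--timeout", str(timeout_ms)]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; using a list rather than a shell string avoids quoting problems with URLs that contain `&` or `?`.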
# Read the URL list from a file
python3 skills/wechat-article-fetcher/scripts/batch_fetch.py \
--urls-file urls.txt
# Or pass multiple URLs directly on the command line
python3 skills/wechat-article-fetcher/scripts/batch_fetch.py \
--urls "url1" "url2" "url3"
python3 skills/wechat-article-fetcher/scripts/run_in_venv.py \
<url> --venv /path/to/playwright-env
Each fetch creates a directory named after the current timestamp:
wechat_articles/
└── 20260302_125500/
    ├── article.json      # Result in JSON format
    ├── article.png       # Full-page screenshot
    └── images/           # Article images directory
        ├── 001_image1.jpg
        ├── 002_image2.png
        └── ...
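Because every run lands in its own timestamped directory, a script that post-processes results usually needs to find the most recent one. The following is a minimal sketch of that lookup. The helper name `latest_run_dir` is hypothetical, and the `%Y%m%d_%H%M%S` format is inferred from the example name `20260302_125500` rather than confirmed by the skill's source.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

# Assumed from the example directory name 20260302_125500.
TIMESTAMP_FORMAT = "%Y%m%d_%H%M%S"


def latest_run_dir(output_root: Path) -> Optional[Path]:
    """Return the newest timestamp-named subdirectory, or None if there is none."""
    if not output_root.is_dir():
        return None
    runs = []
    for child in output_root.iterdir():
        if not child.is_dir():
            continue
        try:
            stamp = datetime.strptime(child.name, TIMESTAMP_FORMAT)
        except ValueError:
            continue  # not a fetch-run directory, skip it
        runs.append((stamp, child))
    if not runs:
        return None
    # Compare on the parsed timestamp only.
    return max(runs, key=lambda item: item[0])[1]
```

For example, `latest_run_dir(Path("wechat_articles"))` would return the path of the latest run, from which `article.json` can then be read.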
{
  "title": "Article title",
  "author": "Author name",
  "publish_date": "2026-03-02",
  "url": "https://mp.weixin.qq.com/s/...",
  "content": "Full article content...",
  "images": [
    {
      "index": 1,
      "url": "https://mmbiz.qpic.cn/...",
      "alt": "Image description",
      "filename": "001_image.jpg",
      "success": true
    }
  ],
  "images_count": 5,
  "images_dir": "wechat_articles/20260302_125500/images",
  "fetch_time": "2026-03-02T12:55:00.000000",
  "length": 15000
}
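The per-image `success` flag makes it easy to check which downloads failed after a run. Below is a small sketch that reads an article.json and reports the failures. `summarize_article` is a hypothetical helper, and it assumes only the field names shown in the sample above.

```python
import json
from pathlib import Path


def summarize_article(json_path: Path) -> dict:
    """Summarize one article.json: title, length, and failed image downloads."""
    data = json.loads(json_path.read_text(encoding="utf-8"))
    images = data.get("images", [])
    # Collect the filenames of images whose download did not succeed.
    failed = [img.get("filename") for img in images if not img.get("success")]
    return {
        "title": data.get("title"),
        "length": data.get("length"),
        "images_listed": len(images),
        "images_failed": failed,
    }
```

A non-empty `images_failed` list suggests re-running the fetch for that URL once the network problem is resolved.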
You can create a config.json file to customize the default settings:
{
  "headless": true,
  "timeout": 60000,
  "output_dir": "./wechat_articles",
  "save_images": true,
  "save_screenshot": true,
  "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36..."
}
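How the skill itself reads config.json is not shown here, so the following is only a sketch of one reasonable way to layer a user file over defaults. `load_config` and `DEFAULTS` are hypothetical names. The default values mirror the documented CLI defaults, and the key names are the ones in the sample above. Rejecting unknown keys is a design choice of this sketch: it catches typos such as `timeuot` early.

```python
import json
from pathlib import Path

# Defaults taken from the documented CLI defaults; an assumption for this sketch.
DEFAULTS = {
    "headless": True,
    "timeout": 60000,
    "output_dir": "./wechat_articles",
    "save_images": True,
    "save_screenshot": True,
}

# user_agent has no default here; the browser's own value would apply.
KNOWN_KEYS = set(DEFAULTS) | {"user_agent"}


def load_config(path: Path) -> dict:
    """Return DEFAULTS overridden by the keys found in a config.json, if present."""
    config = dict(DEFAULTS)
    if path.is_file():
        user = json.loads(path.read_text(encoding="utf-8"))
        unknown = set(user) - KNOWN_KEYS
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        config.update(user)
    return config
```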
python3 skills/wechat-article-fetcher/scripts/fetch.py \
https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw
Create urls.txt:
https://mp.weixin.qq.com/s/xxx
https://mp.weixin.qq.com/s/yyy
https://mp.weixin.qq.com/s/zzz
Then run:
python3 skills/wechat-article-fetcher/scripts/batch_fetch.py \
--urls-file urls.txt
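Before handing a long urls.txt to the batch script, it can be worth checking the file for blank lines, duplicates, and links that are not WeChat articles. The sketch below is a hypothetical pre-check, not part of batch_fetch.py. It assumes one URL per line and that article links use the `mp.weixin.qq.com` host, as in the examples in this document.

```python
from pathlib import Path
from typing import List
from urllib.parse import urlparse

# Host used by every article URL shown in this document.
WECHAT_HOST = "mp.weixin.qq.com"


def read_urls(path: Path) -> List[str]:
    """Read one URL per line, skipping blanks and duplicates, preserving order."""
    seen = set()
    urls = []
    for line in path.read_text(encoding="utf-8").splitlines():
        url = line.strip()
        if not url:
            continue  # ignore blank lines
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or parsed.hostname != WECHAT_HOST:
            raise ValueError(f"not a WeChat article URL: {url}")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

The cleaned list can be written back to a file for `--urls-file`, or passed as separate arguments after `--urls`.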
python3 skills/wechat-article-fetcher/scripts/run_in_venv.py \
https://mp.weixin.qq.com/s/HTGvy5C6SYyr5XQhTfTfzw \
--venv /home/user/playwright-env
Solution: make sure you are running in the correct virtual environment, or install the dependencies:
pip install playwright
playwright install chromium
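To tell whether the active interpreter can see Playwright at all, a quick import check is often enough. This is a minimal sketch using the standard library; `module_available` and `check_environment` are hypothetical helper names. It only verifies that the Python package is importable, not that the Chromium browser has been downloaded.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(name) is not None


def check_environment() -> str:
    """Describe whether playwright is importable from this interpreter."""
    if module_available("playwright"):
        return f"playwright found for {sys.executable}"
    return f"playwright NOT found for {sys.executable}; activate the venv or run pip install playwright"
```

Printing `check_environment()` also shows which interpreter is in use, which makes it clearer when the wrong virtual environment is active.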
Solution: install the system dependencies:
sudo apt install -y libnss3 libnspr4 libatk1.0-0 libatk-bridge2.0-0 \
libcups2 libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 \
libxfixes3 libxrandr2 libgbm1 libasound2
Solution: check your network connection, or use --no-images to skip image downloads.
Solution: increase the timeout:
python3 skills/wechat-article-fetcher/scripts/fetch.py \
<url> --timeout 90000