WeChat Article Summarize

Read one or more WeChat public account article links from mp.weixin.qq.com, extract cleaned full text and optional image links, summarize each article in Chi...

MIT-0 · Free to use, modify, and redistribute. No attribution required.

by JuneLiu (@JuneLiu1999)

Security Scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

✓

Purpose & Capability

The name/description match the included scripts: reading mp.weixin.qq.com pages, extracting body/title/images, cleaning text, invoking a 'summarize' CLI to produce Chinese summaries, and writing markdown. Required capabilities (HTTP fetch, file I/O, calling an external summarizer) are consistent with the stated purpose.

✓

Instruction Scope

SKILL.md instructions and the orchestrator script (run_wechat_mindmap_workflow.py) limit actions to fetching specified WeChat URLs, repairing HTML, invoking the summarizer, normalizing text, and writing files to a user-chosen directory. The code does not read unrelated system paths or attempt to exfiltrate data to hidden endpoints; it only extracts image URLs (but does not download them).

✓

Install Mechanism

There is no install spec; the skill is instruction+script based and relies on local Python and an external 'summarize' CLI. No network download/install of arbitrary code is embedded in the skill files.

ℹ

Credentials

The skill declares no required env vars. It supports passing an --env-file to load environment variables into the process (summarize_cn.py implements load_env_file). This is reasonable to supply an API key for the external summarizer, but users should avoid passing env files containing unrelated/privileged secrets because the workflow will import those values into child processes.
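
One way to follow the advice above is to filter an env file down to the keys the summarizer actually needs before anything reaches a child process. The sketch below is illustrative only: it is not the skill's load_env_file, and the function name filter_env_text and the key SUMMARIZE_API_KEY are hypothetical.

```python
def filter_env_text(text: str, allowed: set[str]) -> dict[str, str]:
    """Parse KEY=VALUE lines and keep only explicitly allowed keys.

    Hypothetical helper; the real load_env_file in summarize_cn.py may
    parse env files differently.
    """
    loaded: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in allowed:
            # Drop surrounding whitespace and simple quoting.
            loaded[key] = value.strip().strip("'\"")
    return loaded


sample = "# demo\nSUMMARIZE_API_KEY='abc'\nAWS_SECRET_ACCESS_KEY=zzz\n"
print(filter_env_text(sample, {"SUMMARIZE_API_KEY"}))
```

The filtered dictionary can then be merged into the environment passed to a subprocess, so unrelated secrets in the same file never leave the parent process.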

✓

Persistence & Privilege

always is false and the skill does not attempt to modify other skills or system-wide agent settings. It writes output files only to user-specified or local working directories.

Assessment

This skill appears coherent and does what it says: it fetches pages from mp.weixin.qq.com, cleans text, calls a local/external 'summarize' CLI to produce Chinese summaries, and writes markdown to a directory you choose. Before installing or running:

1) Confirm you trust the 'summarize' CLI referenced by the scripts. Inspect that binary or package, since it will receive your article text and any env vars you provide.
2) Do not pass an env-file containing unrelated secrets (AWS keys, tokens). The skill will load any variables in the provided env-file into the subprocess environment.
3) Expect the skill to make outbound HTTP requests to the WeChat article URLs (it sets a browser-like User-Agent). It extracts image URLs but does not download image files by default.
4) Run the workflow in a directory you control and review the generated files before sharing.

If you want extra assurance, run the scripts locally on a test article and inspect the behavior of the called subprocess (summarize) and the environment variables it receives.

Like a lobster shell, security has layers — review code before you run it.

Current version: v0.1.1

latestvk97anqrg10fxs87baybc74j9xh8301jw

License

MIT-0

Terms: https://spdx.org/licenses/MIT-0.html

SKILL.md

WeChat Article Summarize

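Turn one or more WeChat public account article links into structured markdown. Both single-article write-ups and multi-article daily reports are supported.

Features

Read one or more mp.weixin.qq.com article links
Extract the article body, title, publish time, and optionally the image links
Automatically repair common mojibake in WeChat article bodies
Call summarize to summarize the full text in Chinese
Generate structured markdown files
- Single-article write-up
- Multi-article roundup / daily report
Name files by date + title, or by date + article count + summary label
Save files to a user-specified directory

Confirm before use

Before actually fetching any article, confirm:

summarize has an API key configured and works correctly
Whether image links should be kept in the final markdown
Which directory the final file should be saved to

Use cases

Summarize a single WeChat article
Combine multiple WeChat articles into one daily report
Produce markdown files suited to further reading, archiving, or reworking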

Workflow

Step 0: Confirm prerequisites before fetching anything

Do not fetch article content until all three items are clear:

summarize is ready
- Ask the user to configure summarize API access first if needed.
- Verify summarize by running a tiny Chinese test.
- Proceed only if summarize returns a usable summary.
Image preference
- Ask whether the final markdown should include image links.
- Map user intent to include_images=true|false.
Output directory
- Ask where to save the final markdown file.
- If the user says “下载文件夹” (“the Downloads folder”), use ~/Downloads.
- Create the target directory if it does not exist.

If any of the three items is missing, stop and ask before continuing.
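
The output-directory rules in Step 0 can be sketched as a small helper. This is a minimal illustration, not code from the skill: resolve_output_dir and ensure_dir are hypothetical names, and only the “下载文件夹” phrase is taken from the text above.

```python
from pathlib import Path


def resolve_output_dir(user_answer: str) -> Path:
    """Map the user's answer to a concrete directory path."""
    answer = user_answer.strip()
    # "下载文件夹" means "the Downloads folder".
    if answer == "下载文件夹":
        return Path.home() / "Downloads"
    # Otherwise treat the answer as a path and expand a leading "~".
    return Path(answer).expanduser()


def ensure_dir(path: Path) -> Path:
    """Create the target directory (and parents) if it does not exist."""
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling ensure_dir(resolve_output_dir(answer)) only after the user has answered keeps the "stop and ask" rule intact: nothing is created until the directory is known.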

Step 1: Extract each WeChat article

For each mp.weixin.qq.com URL, run:

python3 scripts/read_wechat_article.py '<wechat_url>' --out '<temp_dir>'

This produces structured metadata, raw HTML, and a first-pass markdown export.
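
Since the extractor is meant only for mp.weixin.qq.com links, a caller may want to reject other URLs before invoking it. The check below is a sketch under that assumption; is_wechat_article_url is a hypothetical helper and is not claimed to exist in read_wechat_article.py.

```python
from urllib.parse import urlparse


def is_wechat_article_url(url: str) -> bool:
    """Return True only for http(s) URLs hosted on mp.weixin.qq.com."""
    parsed = urlparse(url.strip())
    # Compare the exact hostname so look-alike hosts and paths are rejected.
    return parsed.scheme in ("http", "https") and parsed.hostname == "mp.weixin.qq.com"


print(is_wechat_article_url("https://mp.weixin.qq.com/s/example"))
```

Matching on the parsed hostname rather than a substring avoids accepting URLs that merely contain the domain in their path or as a prefix of another host.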

Step 2: Clean the body text

Do not trust the first-pass article markdown blindly.

If the body contains mojibake or obvious encoding corruption, repair it from raw.html by running:

python3 scripts/fix_wechat_body.py '<raw.html>' --out '<body-fixed.txt>'

Use the cleaned body text as the canonical input for summarization.
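
For readers unfamiliar with mojibake, the most common case is UTF-8 bytes that were decoded as Latin-1. The sketch below reverses that single case; it is an assumption about the kind of corruption involved, not a description of what fix_wechat_body.py actually does, and repair_mojibake is a hypothetical name.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 text that was wrongly decoded as Latin-1.

    If the round trip fails, the text was not corrupted in this
    particular way, so it is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


original = "微信文章"
garbled = original.encode("utf-8").decode("latin-1")
print(repair_mojibake(garbled) == original)
```

Text that is already correct Chinese cannot be encoded as Latin-1, so the helper leaves it alone instead of damaging it further.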

Step 3: Summarize in Chinese

Always summarize the cleaned local text, not the original WeChat URL.

Run:

python3 scripts/summarize_cn.py '<body-fixed.txt>' --out '<summary.json>' --length short

or for a combined report:

python3 scripts/summarize_cn.py '<combined-input.md>' --out '<summary.json>' --length medium

The script enforces Chinese output and fails if the returned summary is not sufficiently Chinese.
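
A language check of this kind can be approximated by measuring the share of CJK characters among all letters. The following is a minimal sketch; the 0.5 threshold and both function names are assumptions, and the actual criterion used by summarize_cn.py is not documented here.

```python
def chinese_ratio(text: str) -> float:
    """Fraction of alphabetic characters in the CJK Unified Ideographs block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters)


def is_mainly_chinese(text: str, threshold: float = 0.5) -> bool:
    """Hypothetical gate: accept a summary when enough of its letters are CJK."""
    return chinese_ratio(text) >= threshold
```

Counting only letters means digits, punctuation, and whitespace do not dilute the ratio, so a Chinese summary that quotes a few English terms still passes.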

Step 4: Normalize summary text before writing markdown

Never write summarize output directly into the final file.

Normalize paragraph breaks and spacing with:

python3 scripts/normalize_markdown_text.py '<input.txt>' --out '<normalized.txt>'

Use this for:

each single-article summary
the combined daily-report overview

This prevents ugly line wrapping and mixed-language formatting artifacts.
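
The normalization described in this step can be illustrated as follows: wrapped prose lines are joined into one paragraph, while headings, bullets, and blank lines are kept. This is a simplified sketch, not the logic of normalize_markdown_text.py; the function names and the rule of inserting a space only between ASCII neighbours are assumptions.

```python
import re

# Lines that start a markdown heading, bullet, numbered item, or quote.
_STRUCTURAL = re.compile(r"^\s*(#{1,6}\s|[-*+]\s|\d+\.\s|>)")


def _join_prose(parts: list[str]) -> str:
    joined = parts[0]
    for part in parts[1:]:
        # Insert a space only between ASCII neighbours; CJK text needs none.
        sep = " " if joined[-1].isascii() and part[0].isascii() else ""
        joined += sep + part
    return joined


def normalize_paragraphs(text: str) -> str:
    out: list[str] = []
    buffer: list[str] = []

    def flush() -> None:
        if buffer:
            out.append(_join_prose(buffer))
            buffer.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            flush()
            # Collapse runs of blank lines into a single paragraph break.
            if out and out[-1] != "":
                out.append("")
        elif _STRUCTURAL.match(line):
            flush()
            out.append(line.rstrip())
        else:
            buffer.append(stripped)
    flush()
    return "\n".join(out).strip() + "\n"
```

The sketch does not special-case fenced code blocks, so it is only suitable for summary prose of the kind this step targets.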

Step 5: Build the final markdown

Single article

Run:

python3 scripts/build_mindmap_markdown.py \
  --result '<result.json>' \
  --body '<body-fixed.txt>' \
  --summary '<summary.json>' \
  --output-dir '<chosen-dir>' \
  --include-images true

Multiple articles / daily report

Run:

python3 scripts/build_batch_report.py \
  --inputs '<dir1>' '<dir2>' '<dir3>' \
  --output-dir '<chosen-dir>' \
  --include-images true \
  --report-label '微信文章日报'

The batch report must:

summarize all articles individually
summarize the full set as one combined overview
place the combined overview first
then append each single article section
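
The required ordering (combined overview first, then one section per article) can be sketched like this. The heading layout, the dictionary keys title, url, and summary, and the function name are illustrative assumptions; build_batch_report.py may structure its output differently.

```python
def assemble_batch_report(label: str, overview: str, articles: list[dict[str, str]]) -> str:
    """Place the combined overview first, then append each article section."""
    lines = [f"# {label}", "", "## Overview", "", overview.strip(), ""]
    for index, article in enumerate(articles, start=1):
        lines += [
            f"## {index}. {article['title']}",
            "",
            f"Source: {article['url']}",
            "",
            article["summary"].strip(),
            "",
        ]
    return "\n".join(lines)
```

Building the overview section before the loop makes the ordering rule structural rather than something that depends on how the inputs happen to be sorted.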

Output rules

Naming

Single article

YYYYMMDD-<article title>.md

Multiple articles

YYYYMMDD-<total article count>篇-<summary label>.md

(篇 is the Chinese counter word for articles and stays in the file name.)
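
The two naming patterns can be expressed as small helpers. This is a sketch only: the function names are hypothetical, and replacing characters that are unsafe in file names with underscores is an added assumption that the source does not specify.

```python
import re
from datetime import date


def safe_part(text: str) -> str:
    """Replace characters that are invalid in common file systems (assumed rule)."""
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", text).strip("_")
    return cleaned or "untitled"


def single_article_name(day: date, title: str) -> str:
    return f"{day:%Y%m%d}-{safe_part(title)}.md"


def batch_report_name(day: date, count: int, label: str) -> str:
    # 篇 is the counter word for articles and is part of the pattern.
    return f"{day:%Y%m%d}-{count}篇-{safe_part(label)}.md"


print(batch_report_name(date(2024, 3, 5), 3, "微信文章日报"))
```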

Content rules

Single article output should contain

title
source URL
publish time
summarize-generated Chinese summary
mindmap-style structure
optional image section

Batch report output should contain

combined daily overview at the top
combined mindmap
per-article title, URL, date, summary, and mindmap
optional image overview

Non-negotiable quality gates

Before writing the final markdown:

Summary language check
- If the summary is not mainly Chinese, retry or fail.
Paragraph normalization
- Collapse unnatural line breaks inside prose.
- Keep markdown headings and bullet lists intact.
Clean body source
- Prefer repaired text from raw.html when the extracted body is corrupted.

Bundled scripts

scripts/read_wechat_article.py — fetch WeChat article metadata, body, raw HTML, and image links
scripts/fix_wechat_body.py — repair mojibake and extract clean text from raw HTML
scripts/summarize_cn.py — run summarize in Chinese and enforce a language check
scripts/normalize_markdown_text.py — normalize prose paragraphs and line breaks
scripts/build_mindmap_markdown.py — generate single-article markdown files
scripts/build_batch_report.py — generate multi-article combined reports
scripts/run_wechat_mindmap_workflow.py — orchestrate the full workflow end to end after the required user confirmations

Files

9 total
