Wechat Look

v1.0.1

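A dedicated tool for reading WeChat Official Account articles, with OCR support. It automatically normalizes URLs, extracts article content, and recognizes Chinese and English text in images.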

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for 2720480371/wechat-look.

Install the skill "Wechat Look" (2720480371/wechat-look) from ClawHub.
Skill page: https://clawhub.ai/2720480371/wechat-look
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install wechat-look

ClawHub CLI

npx clawhub@latest install wechat-look
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (WeChat article extraction + OCR) match the provided Python and Node.js code: URL normalization, HTML extraction, downloading images, and invoking local Tesseract.js-based OCR. No unrelated credentials, binaries, or config paths are requested.
Instruction Scope
Runtime instructions and SKILL.md are focused on fetching WeChat article HTML, extracting image URLs, and running local Node.js OCR subprocesses — all within the stated purpose. Note: the runtime will perform network requests (fetch article HTML and download any images referenced in the page), and will spawn Node processes. Those are expected for this skill but are relevant privacy/security considerations (see guidance).
Install Mechanism
There is no platform install spec, but SKILL.md directs the user to run 'cd ocr_node && npm install'. The package.json/package-lock pull tesseract.js and node-fetch from the npm registry. This is a standard approach (moderate risk): npm packages can run install scripts and bring transitive dependencies, so verify packages before installing. No arbitrary URL downloads or unknown hosts are used in install steps.
Credentials
The skill requests no environment variables or credentials and the code does not read any secrets. The network access it needs (HTTP requests to WeChat pages and to image URLs) is proportional to OCR/extraction functionality.
Persistence & Privilege
Skill flags are default (not always:true). It does not request persistent system-level privileges or attempt to change other skills' configs. It spawns subprocesses and installs node modules in its own directory (normal for this architecture).
Assessment
This skill appears coherent for reading WeChat articles and performing OCR, but before installing consider:

  1. It will fetch the article HTML and download every image referenced in the page. Those image URLs are external and may reveal that you requested that page (privacy/fingerprinting).
  2. The Node layer requires running 'npm install', which will install tesseract.js and other npm packages. npm packages can execute install scripts, so inspect package.json/package-lock and prefer installing in an isolated environment (container or sandbox) if you are cautious.
  3. tesseract.js may attempt to fetch language model / WASM assets at runtime depending on configuration. This implies additional network access beyond image downloads.
  4. The skill spawns local Node processes (subprocess) to perform OCR. Confirm your environment policy allows executing the bundled scripts.

If these behaviors are acceptable, the skill's footprint is proportionate to its purpose. If you need to limit exposure, run it in an isolated runtime, review the node modules, or modify the OCR scripts to use pre-downloaded language models.

License: MIT-0

WeChat Look OCR - WeChat Article Reader (with OCR)

📋 Installation Requirements

System Requirements

  • Python: 3.7 or later
  • Node.js: 18 or later
  • npm: the Node.js package manager

Installation Steps

  1. Install system dependencies

    # Ubuntu/Debian
    sudo apt install python3 nodejs npm
    
    # macOS
    brew install python node
    
    # Windows
    # Download Node.js from https://nodejs.org/
    # Download Python from https://python.org/
    
  2. Install the skill

    openclaw skill install wechat-look-ocr
    
  3. Install the Node.js dependencies

    cd ~/.openclaw/skills/wechat-look-ocr/ocr_node
    npm install
    

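Features

  • Automatic URL normalization: adds the ?scene=1 parameter automatically to bypass the CAPTCHA check
  • Content extraction: extracts plain text from the article HTML
  • OCR text recognition: automatically recognizes Chinese and English text in images
  • Chinese and English support: OCR covers Simplified Chinese and English
  • Smart fallback: falls back to English-only recognition automatically when combined Chinese and English recognition fails
  • Error handling: friendly error messages and a retry mechanism
  • Security compliance: follows the OpenClaw security guidelines and marks external content as an untrusted source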
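The fallback behavior in the feature list can be sketched as follows. This is only an illustration of the control flow: in the skill itself OCR runs in the Node.js layer, the `recognize` callable here is a hypothetical stand-in for that call, and the language codes `chi_sim+eng` and `eng` are Tesseract's usual names for these languages rather than values confirmed from the skill's source. Treating a raised exception as "recognition failed" is also an assumption.

```python
from typing import Callable, Optional


def recognize_with_fallback(
    image: bytes,
    recognize: Callable[[bytes, str], str],  # hypothetical OCR callable: (image, lang) -> text
    primary_lang: str = "chi_sim+eng",       # assumed language code for Chinese + English
    fallback_lang: str = "eng",              # assumed language code for English only
) -> Optional[str]:
    """Try the primary language set first, then fall back to English only."""
    for lang in (primary_lang, fallback_lang):
        try:
            return recognize(image, lang)
        except Exception:
            # Recognition failed for this language set; try the next one.
            continue
    # Both attempts failed; the caller decides how to report this.
    return None
```

Passing the recognizer in as a parameter keeps the fallback logic testable without a real OCR engine.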

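Usage

Use it directly in OpenClaw:

Read the WeChat article https://mp.weixin.qq.com/s/xxx

URL Normalization Rules

  • No query parameters → append ?scene=1
  • Existing query parameters → ensure scene=1 is present (overriding any duplicate scene parameter)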
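The two rules above can be sketched with the standard library alone. The function name `normalize_wechat_url` is hypothetical and this is not the skill's actual code; it only shows one way to guarantee exactly one `scene=1` parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_wechat_url(url: str) -> str:
    """Return the URL with exactly one scene=1 query parameter."""
    parts = urlsplit(url)
    # Keep every existing parameter except scene, preserving order.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "scene"
    ]
    # Append the single scene=1 parameter required by the rules.
    params.append(("scene", "1"))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Note that `urlencode` percent-encodes reserved characters in parameter values, so the output may differ textually from the input while remaining an equivalent URL.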

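🔧 Technical Implementation

System Architecture

The skill uses a hybrid Python + Node.js architecture:

  • Python layer: handles URL normalization, HTML content extraction, and image URL extraction
  • Node.js layer: runs OCR recognition (using Tesseract.js)
  • Communication: Python launches the Node.js script via subprocess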
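A minimal sketch of the Python-to-Node.js handoff, assuming the two layers exchange JSON over stdin and stdout. That protocol, the function name `run_ocr_subprocess`, and the `{"images": [...]}` payload shape are all assumptions for illustration; the skill's real script name and message format are not documented here.

```python
import json
import subprocess
from typing import Any, Dict, List


def run_ocr_subprocess(
    command: List[str],      # e.g. ["node", "<path to OCR script>"]; script path is an assumption
    image_urls: List[str],
    timeout: float = 120.0,
) -> Dict[str, Any]:
    """Send image URLs to a child process as JSON and parse its JSON reply."""
    payload = json.dumps({"images": image_urls})
    completed = subprocess.run(
        command,
        input=payload,
        capture_output=True,   # requires Python 3.7+
        encoding="utf-8",      # OCR output may contain Chinese text
        timeout=timeout,
        check=False,
    )
    if completed.returncode != 0:
        raise RuntimeError(f"OCR process failed: {completed.stderr.strip()}")
    return json.loads(completed.stdout)
```

Passing the command as a list avoids shell interpretation of the arguments, and the timeout keeps a stuck OCR process from blocking the caller indefinitely.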

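How It Works

  1. URL detection: check whether the input is a WeChat article URL
  2. Parameter normalization: add or update the scene=1 parameter
  3. Content fetching: fetch the page content with the requests library
  4. Text extraction: extract plain text from the HTML
  5. Image processing: extract all image URLs
  6. OCR recognition: launch a Node.js subprocess to recognize text
  7. Result merging: combine the text content with the OCR results
  8. Result return: provide a structured response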
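Steps 4 and 5 can be sketched with the standard-library HTML parser. The class and function names are hypothetical, and preferring a `data-src` attribute over `src` is an assumption about how the article pages lazy-load images, not something stated in this document.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class ArticleParser(HTMLParser):
    """Collect visible text and image URLs from an HTML document."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.text_parts: List[str] = []
        self.image_urls: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "img":
            attr_map = dict(attrs)
            # Assumption: lazy-loaded images carry the real URL in data-src.
            src = attr_map.get("data-src") or attr_map.get("src")
            if src:
                self.image_urls.append(src)

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script/style bodies and whitespace-only fragments.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_article(html: str) -> Tuple[str, List[str]]:
    """Return (plain text, image URLs) for the given HTML."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.text_parts), parser.image_urls
```

A parser-based approach like this tends to be more robust than regular expressions when attributes appear in varying order or quoting styles.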

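📦 Dependency Details

Python Dependencies

  • requests - HTTP request library
  • subprocess (standard library) - launches the Node.js process
  • json (standard library) - JSON data handling
  • pathlib (standard library) - path operations
  • re (standard library) - regular expressions

Node.js Dependencies

  • tesseract.js - OCR engine
  • node-fetch - HTTP request library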

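Runtime Behavior

At runtime the skill will:

  1. Send HTTP requests to the WeChat servers to fetch the article page
  2. Download the images in the article for OCR recognition
  3. Launch a local Node.js process to recognize text
  4. Return the combined text and OCR results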

Example Output

{
  "title": "Article title",
  "author": "Author name",
  "text_content": "Extracted body text",
  "image_count": 5,
  "ocr_text": "[Image 1] Recognized text...",
  "url": "Normalized URL",
  "status": "success"
}
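A response with these fields could be assembled roughly as below. The function name `build_response` and its parameters are hypothetical, and the choices to give each image a numbered label and to skip images with no recognized text are assumptions, not documented behavior.

```python
from typing import Dict, List, Union


def build_response(
    title: str,
    author: str,
    text_content: str,
    image_urls: List[str],
    ocr_results: List[str],   # one entry per image, "" when nothing was recognized
    url: str,
) -> Dict[str, Union[str, int]]:
    """Merge extracted text and per-image OCR results into one structure."""
    ocr_lines = [
        f"[Image {index}] {text}"
        for index, text in enumerate(ocr_results, start=1)
        if text.strip()  # assumption: omit images that produced no text
    ]
    return {
        "title": title,
        "author": author,
        "text_content": text_content,
        "image_count": len(image_urls),
        "ocr_text": "\n".join(ocr_lines),
        "url": url,
        "status": "success",
    }
```

Numbering by image position, rather than by the count of non-empty results, keeps each label tied to the image it came from.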

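Notes

  • Only WeChat Official Account article links are supported
  • Respect WeChat's access rate limits
  • External content is marked as an untrusted source
  • If you run into CAPTCHA problems, make sure the URL correctly includes scene=1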
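The first note implies a check on the input link before any request is made. A minimal sketch, assuming article links live on the `mp.weixin.qq.com` host under the `/s` path as in the usage example; the function name `is_wechat_article_url` is hypothetical and the skill's real check may accept other forms.

```python
from urllib.parse import urlsplit


def is_wechat_article_url(url: str) -> bool:
    """Return True if the URL looks like a WeChat Official Account article link."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # Assumption: articles are served from this host only.
    if parts.hostname != "mp.weixin.qq.com":
        return False
    # Accept both /s/<id> and /s?<query> forms, but not e.g. /search.
    return parts.path == "/s" or parts.path.startswith("/s/")
```

Comparing the parsed hostname exactly, instead of searching the raw string, rejects lookalike links such as `https://example.com/?u=mp.weixin.qq.com/s/xxx`.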

Comments

Loading comments...