Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

微信公众号内容提取工具 (WeChat Official Account Content Extraction Tool)

v1.0.0

Extract metadata and content from WeChat Official Account articles. Use when user needs to parse WeChat article URLs (mp.weixin.qq.com), extract article info...

by 雨飞 (@xls1994)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for xls1994/wechatarticle-extractor.

Install the skill "微信公众号内容提取工具" (xls1994/wechatarticle-extractor) from ClawHub.
Skill page: https://clawhub.ai/xls1994/wechatarticle-extractor
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install wechatarticle-extractor

ClawHub CLI


npx clawhub@latest install wechatarticle-extractor

Security Scan

VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)

Purpose & Capability
The name/description and the main scripts (scripts/extract.js, bin/wechat-extract.js) match the stated purpose (fetch and parse mp.weixin.qq.com pages). However there are additional files (convert.js, run-extract.js) that read/write absolute user-specific filesystem paths and embed a concrete example URL; those files are not necessary for the core scraping capability and are unexpected.
Instruction Scope (flagged)
SKILL.md and the CLI instruct only network fetch + parsing. But repository files reference local filesystem paths (e.g. convert.js reads /Users/canghe/.../tool-results/b97eb13.txt and writes to /Users/canghe/Downloads/..., run-extract.js writes to C:/Users/xsl/...), which are outside the stated scope and would access user data if executed. Also scripts/extract.js uses new Function to execute code snippets extracted from page <script> tags — this executes untrusted JS scraped from webpages.
Install Mechanism
No install spec is provided (instruction-only from platform perspective). Dependencies are standard npm libs declared in package.json (cheerio, request-promise, etc.). There is no external download or obscure installer.
Credentials (flagged)
The skill does not request environment variables or credentials, which is appropriate. But the presence of hardcoded absolute paths and sample-run files that access user home directories is disproportionate to a simple extractor and could read or overwrite local files if those helper scripts are run.
Persistence & Privilege
The skill is not always-enabled and doesn't request special platform privileges. It writes output files when used (expected for a CLI tool). The concern is file writes to unexpected, hardcoded locations in some scripts rather than a generic current-directory output.
Scan Findings in Context
[use_of_new_Function_dynamic_code_execution] unexpected: scripts/extract.js constructs and runs new Function(...) on JavaScript extracted from page <script> tags to recover variables. This is sometimes used to parse embedded script data but effectively executes untrusted code from remote pages and is high-risk unless carefully sandboxed or strictly validated.
[hardcoded_user_paths] unexpected: convert.js and run-extract.js contain absolute paths pointing to specific user home directories (/Users/canghe/... and C:/Users/xsl/...). These are not required by the documented CLI usage and could read or overwrite user files if executed.
[http_request_to_target_hosts] expected: scripts/extract.js performs HTTP GET requests to mp.weixin.qq.com and weixin.sogou.com using request-promise — this is expected for a web scraper.
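To make the new Function finding concrete, here is a minimal sketch of how a variable could be recovered from scraped script text without executing it. It is an illustration only, not the skill's code: `readLiteralAssignment` is a hypothetical helper, and it assumes the wanted values appear as plain `var name = <literal>;` assignments with a double-quoted string or a number on the right-hand side.

```javascript
// Hypothetical helper, not part of the skill: reads `var name = <literal>;`
// from script text without evaluating the script.
function readLiteralAssignment(scriptText, varName) {
  // Escape regex metacharacters in the variable name.
  const safeName = varName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Accept only a double-quoted string or a plain number as the value.
  const pattern = new RegExp(
    '\\bvar\\s+' + safeName + '\\s*=\\s*("(?:[^"\\\\]|\\\\.)*"|-?\\d+(?:\\.\\d+)?)\\s*;'
  );
  const match = pattern.exec(scriptText);
  if (!match) return null;
  try {
    // JSON.parse only decodes data; it never runs code.
    return JSON.parse(match[1]);
  } catch (err) {
    // The literal uses JS-only escapes (for example \x26) that JSON rejects.
    return null;
  }
}

// Made-up sample input; the trailing alert(1) is never executed.
const sample = 'var msg_title = "Hello \\"WeChat\\""; var ct = 1705285800; alert(1);';
console.log(readLiteralAssignment(sample, 'msg_title')); // prints: Hello "WeChat"
console.log(readLiteralAssignment(sample, 'ct'));        // prints: 1705285800
```

The trade-off is coverage: anything that is not a simple literal returns null instead of being evaluated, so a parser like this recovers fewer fields than new Function would but cannot run page-supplied code.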
What to consider before installing
This skill mostly matches its description (it fetches and parses WeChat article pages), but take these precautions before installing or running it:

  • Inspect or remove helper scripts: do not run convert.js or run-extract.js unmodified. They contain hardcoded absolute paths that will read/write files in specific user home directories. Those files look like developer convenience scripts and are not needed for normal CLI usage.
  • Beware dynamic execution: scripts/extract.js uses new Function(...) to evaluate JavaScript taken from page <script> blocks. That can execute arbitrary code from the scraped page. Only run this tool on unprivileged hosts or inside a sandbox/container, and avoid feeding it URLs from untrusted sources.
  • Run in an isolated environment (VM, container) and avoid running as an administrator/root user. Review which files will be written and consider changing output paths to a safe directory.
  • If you need higher assurance, ask the maintainer whether the new Function usage is strictly limited to parsing static assignment expressions (and for a code comment or test showing sanitization), and request removal or disabling of developer scripts with hardcoded paths.

Given these issues, treat the skill as suspicious rather than outright malicious; it may be benign developer leftovers, but it includes risky behaviors you should address before use.
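The advice about changing output paths to a safe directory could be enforced with a small guard like the one below. This is a sketch under assumptions: `resolveInside` is a hypothetical helper that does not exist in the skill, and the base directory is whatever location you decide is safe.

```javascript
const path = require('node:path');

// Hypothetical guard: resolve `requested` against `baseDir` and refuse
// any result that falls outside `baseDir`.
function resolveInside(baseDir, requested) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, requested);
  const rel = path.relative(base, target);
  const escapes =
    rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error('Refusing to write outside ' + base + ': ' + requested);
  }
  return target;
}

// Relative paths stay under the base directory.
console.log(resolveInside('./articles', 'post.md'));
```

A caller would pass the returned path to its file-writing code, so that inputs such as `../secrets.txt` or an absolute path under another user's home directory fail before anything is written.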
Patterns worth reviewing
These patterns may indicate risky behavior. Check the VirusTotal and OpenClaw results above for context-aware analysis before installing.
scripts/extract.js:206: Dynamic code execution detected.


latestvk977pb4cyz3w9h0m9t0vc8wvm9843txj
92 downloads
0 stars
1 version
Updated 3w ago
v1.0.0
MIT-0

WeChat Article Extractor

Extract metadata and content from WeChat Official Account (微信公众号) articles.

Quick Start 🚀

Command Line Usage (Recommended)

The easiest way to use this skill is via the CLI command:

# Basic usage - extracts and saves as markdown
npx wechat-article-extractor https://mp.weixin.qq.com/s/xxx

# Specify output path
npx wechat-article-extractor https://mp.weixin.qq.com/s/xxx --output ./articles/post.md

# Output JSON format
npx wechat-article-extractor https://mp.weixin.qq.com/s/xxx --json

# Show help
npx wechat-article-extractor --help

Programmatic Usage

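const { extract } = require('./scripts/extract.js');

// CommonJS modules do not allow top-level await, so wrap the call in an async function
(async () => {
  const result = await extract('https://mp.weixin.qq.com/s?__biz=...');
  if (result.done) {
    console.log(result.data.msg_title);   // article title
    console.log(result.data.msg_content); // article body (HTML)
  }
})();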

Capabilities

  • Parse WeChat article URLs (mp.weixin.qq.com)
  • Extract article metadata: title, author, description, publish time
  • Extract account info: name, avatar, alias, description
  • Get article content (HTML)
  • Get cover image URL
  • Support multiple article types: post, video, image, voice, text, repost
  • Handle various error cases: deleted content, expired links, access limits

CLI Options

| Option | Description | Default |
|---|---|---|
| <URL> | WeChat article URL | Required |
| --output <path> | Output file path | ./wechat-article.md |
| --format <format> | Output format (markdown / json / html) | markdown |
| --json | Output JSON format | false |
| -h, --help | Show help | - |

Examples

# Extract article and save to custom location
npx wechat-article-extractor https://mp.weixin.qq.com/s/xxx --output ./my-article.md

# Get JSON output for processing
npx wechat-article-extractor https://mp.weixin.qq.com/s/xxx --json > article.json

# From within the skill directory
npm run extract https://mp.weixin.qq.com/s/xxx

Usage in Scripts

Basic Extraction from URL

const { extract } = require('./scripts/extract.js');

// Call this from inside an async function; CommonJS modules do not allow top-level await
const result = await extract('https://mp.weixin.qq.com/s?__biz=...');
// Returns: { done: true, code: 0, data: {...} }

Extraction from HTML

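// Fetch the page yourself (global fetch needs Node.js 18+), then pass the HTML and its source URL
const sourceUrl = 'https://mp.weixin.qq.com/s/xxx';
const html = await fetch(sourceUrl).then(r => r.text());
const result = await extract(html, { url: sourceUrl });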

Advanced Options

const result = await extract(url, {
  shouldReturnContent: true,      // Return HTML content (default: true)
  shouldReturnRawMeta: false,     // Return raw metadata (default: false)
  shouldFollowTransferLink: true, // Follow migrated account links (default: true)
  shouldExtractMpLinks: false,    // Extract embedded mp.weixin links (default: false)
  shouldExtractTags: false,       // Extract article tags (default: false)
  shouldExtractRepostMeta: false  // Extract repost source info (default: false)
});

Response Format

Success Response

{
  done: true,
  code: 0,
  data: {
    // Account info
    account_name: "Official Account name",
    account_alias: "WeChat ID",
    account_avatar: "avatar URL",
    account_description: "account description",
    account_id: "original ID",
    account_biz: "biz parameter",
    account_biz_number: 1234567890,
    account_qr_code: "QR code URL",

    // Article info
    msg_title: "article title",
    msg_desc: "article summary",
    msg_content: "HTML content",
    msg_cover: "cover image URL",
    msg_author: "author",
    msg_type: "post", // post|video|image|voice|text|repost
    msg_has_copyright: true,
    msg_publish_time: Date, // a Date object
    msg_publish_time_str: "2024/01/15 10:30:00",

    // Link params
    msg_link: "article link",
    msg_source_url: "\"Read the original\" link",
    msg_sn: "sn parameter",
    msg_mid: 1234567890,
    msg_idx: 1
  }
}
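As an illustration of consuming this response shape, the sketch below builds a short markdown header from a few of the documented fields. `toMarkdownHeader` is a hypothetical helper, and the layout it produces is an assumption; the markdown written by the skill's own CLI may look different.

```javascript
// Hypothetical formatter: the skill's own markdown layout may differ.
function toMarkdownHeader(data) {
  const lines = ['# ' + (data.msg_title || 'Untitled')];
  // Fields may be null depending on article type, so skip empty ones.
  if (data.msg_author) lines.push('- Author: ' + data.msg_author);
  if (data.account_name) lines.push('- Account: ' + data.account_name);
  if (data.msg_publish_time_str) lines.push('- Published: ' + data.msg_publish_time_str);
  if (data.msg_link) lines.push('- Source: ' + data.msg_link);
  return lines.join('\n');
}

// Made-up sample data using the documented field names.
const example = {
  msg_title: 'Sample title',
  msg_author: null,
  account_name: 'Sample account',
  msg_publish_time_str: '2024/01/15 10:30:00',
  msg_link: 'https://mp.weixin.qq.com/s/xxx',
};
console.log(toMarkdownHeader(example));
```

Because msg_author is null in the sample, the printed header contains the title, account, publish time, and source lines only.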

Error Response

{
  done: false,
  code: 1001,
  msg: "无法获取文章信息" // "Unable to get article info"
}

Error Codes

| Code | Message | Description |
|---|---|---|
| 1000 | 文章获取失败 | General failure |
| 1001 | 无法获取文章信息 | Missing title or publish time |
| 1002 | 请求失败 | HTTP request failed |
| 1003 | 响应为空 | Empty response |
| 1004 | 访问过于频繁 | Rate limited |
| 1005 | 脚本解析失败 | Script parsing error |
| 1006 | 公众号已迁移 | Account migrated |
| 2001 | 请提供文章内容或链接 | Missing input |
| 2002 | 链接已过期 | Link expired |
| 2003 | 内容涉嫌侵权 | Content removed (copyright) |
| 2004 | 无法获取迁移后的链接 | Migration link failed |
| 2005 | 内容已被发布者删除 | Content deleted by author |
| 2006 | 内容因违规无法查看 | Content blocked |
| 2007 | 内容发送失败 | Failed to send |
| 2008 | 系统出错 | System error |
| 2009 | 不支持的链接 | Unsupported URL |
| 2010 | 内容获取失败 | Content fetch failed |
| 2011 | 涉嫌过度营销 | Marketing/spam content |
| 2012 | 账号已被屏蔽 | Account blocked |
| 2013 | 账号已自主注销 | Account deleted |
| 2014 | 内容被投诉 | Content reported |
| 2015 | 账号处于迁移流程中 | Account migrating |
| 2016 | 冒名侵权 | Impersonation |
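Since the msg field is returned in Chinese, a lookup keyed on the numeric code can help when logging in English. The sketch below copies a few descriptions from the error code table; `describeResult` and `ERROR_DESCRIPTIONS` are hypothetical names, not part of the skill, and the remaining codes can be added the same way.

```javascript
// English descriptions copied from the error code table (subset for brevity).
const ERROR_DESCRIPTIONS = {
  1000: 'General failure',
  1001: 'Missing title or publish time',
  1002: 'HTTP request failed',
  1003: 'Empty response',
  1004: 'Rate limited',
  2002: 'Link expired',
  2005: 'Content deleted by author',
  2009: 'Unsupported URL',
};

// Hypothetical helper: turn a result object into a one-line log message.
function describeResult(result) {
  if (result.done) return 'OK';
  const description = ERROR_DESCRIPTIONS[result.code] || 'Unknown error';
  return 'Error ' + result.code + ' (' + description + '): ' + result.msg;
}

console.log(describeResult({ done: false, code: 1001, msg: '无法获取文章信息' }));
// prints: Error 1001 (Missing title or publish time): 无法获取文章信息
```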

Dependencies

Required npm packages:

  • cheerio - HTML parsing
  • dayjs - Date formatting
  • request-promise - HTTP requests
  • qs - Query string parsing
  • lodash.unescape - HTML entities

Notes

  • Handles various WeChat page structures and anti-scraping measures
  • Automatically detects article type from page content
  • Supports extracting from Sogou WeChat search results (weixin.sogou.com)
  • Some fields may be null depending on article type and page structure
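Because the tool is meant for mp.weixin.qq.com and weixin.sogou.com pages, a caller may want to check the host before handing a URL to the extractor. The sketch below is one way to do that; `isSupportedUrl` is a hypothetical helper, and the host list is an assumption taken from this document rather than from the skill's source.

```javascript
// Hosts named in this document; treat the list as an assumption about
// what the skill accepts.
const SUPPORTED_HOSTS = new Set(['mp.weixin.qq.com', 'weixin.sogou.com']);

// Hypothetical pre-check: true only for http(s) URLs on a listed host.
function isSupportedUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch (err) {
    return false; // not a valid absolute URL
  }
  const isHttp = parsed.protocol === 'https:' || parsed.protocol === 'http:';
  return isHttp && SUPPORTED_HOSTS.has(parsed.hostname);
}

console.log(isSupportedUrl('https://mp.weixin.qq.com/s/xxx'));          // prints: true
console.log(isSupportedUrl('https://mp.weixin.qq.com.evil.example/s')); // prints: false
```

Comparing the parsed hostname exactly, instead of searching the raw string for "mp.weixin.qq.com", avoids accepting look-alike domains such as the second example.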

Troubleshooting

"MODULE_NOT_FOUND" error

Ensure you're running the command from within the skill directory or using npx:

# <skill-directory> is where the skill is installed, e.g. C:\Users\xsl\.agents\skills\wechat-article-extractor
cd <skill-directory>
npm run extract <URL>

"访问过于频繁" (access too frequent) error

This is WeChat's rate limiting (error code 1004). Wait a few minutes before trying again.
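The waiting can be automated with a retry wrapper along these lines. It is a sketch, not part of the skill: `extractWithRetry` and `backoffDelayMs` are hypothetical names, the extract function is passed in as a parameter, and the one-minute base delay with a ten-minute cap is an assumed reading of "a few minutes".

```javascript
const RATE_LIMITED = 1004; // "访问过于频繁" in the error code table

// Delay doubles with each attempt and is capped at maxMs.
function backoffDelayMs(attempt, baseMs, maxMs) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// extractFn is passed in so the sketch does not assume how the module is loaded.
async function extractWithRetry(extractFn, url, options = {}) {
  const { retries = 3, baseMs = 60000, maxMs = 600000 } = options;
  let result = await extractFn(url);
  for (
    let attempt = 0;
    attempt < retries && !result.done && result.code === RATE_LIMITED;
    attempt++
  ) {
    // Retry only when the failure is the rate limit; other errors return as-is.
    await sleep(backoffDelayMs(attempt, baseMs, maxMs));
    result = await extractFn(url);
  }
  return result;
}

// Usage (inside an async function), assuming `extract` was required as shown earlier:
//   const result = await extractWithRetry(extract, 'https://mp.weixin.qq.com/s/xxx');
```

With the defaults, the waits are 1, 2, and 4 minutes before the wrapper gives up and returns the last rate-limited result.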

Link expired

If you see "链接已过期" (link expired, error code 2002), the link is no longer valid, typically because the article has been removed by the author or the platform.
