CNBLOGS Featured Content Scraper

v1.0.0

Fetches article titles and bodies from the CNBlogs featured ('pick') section, with support for batch-downloading a specified number of pages and saving them as plain-text files.

MIT-0
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description ("fetch CNBlogs featured-section articles") matches the SKILL.md: it describes using curl/grep/sed to fetch list pages, extract links, download articles, strip HTML, and save them as text files. No unrelated services, credentials, or binaries are requested.
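As a rough sketch of that list-page-to-links step (the URL shape and HTML pattern below are assumptions for illustration, not taken from the SKILL.md):

```shell
# Hypothetical sketch of extracting article links from a CNBlogs list page.
# The link pattern is an assumption about CNBlogs markup, not the skill's own regex.
extract_article_links() {
  # grep -oE is used here instead of the skill's grep -oP for portability;
  # -P requires a grep build with PCRE support.
  grep -oE 'https://www\.cnblogs\.com/[^"]+/p/[0-9]+\.html' | sort -u
}

# Offline demonstration on a sample snippet; a live run would instead pipe
# `curl -s <list-page-url>` into extract_article_links.
sample='<a class="post-item-title" href="https://www.cnblogs.com/alice/p/123.html">Some title</a>'
printf '%s\n' "$sample" | extract_article_links
# prints: https://www.cnblogs.com/alice/p/123.html
```

The `sort -u` deduplicates links that appear more than once on a list page (e.g. title and thumbnail anchors pointing at the same post).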
Instruction Scope
Instructions stay within the stated scraping purpose (download list pages, parse links, fetch article bodies, save to the output directory). Caution: the skill prescribes HTML parsing via grep -oP/sed (regex-based parsing), which is brittle and may miss content or break when the site's markup changes. It will write files to the user's output directory (default under ~/.openclaw/workspace); make sure you don't point it at sensitive system directories. The SKILL.md indicates titles will be sanitized, but filename collision and overwrite handling are not fully specified.
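Because the filename handling is underspecified, a defensive sanitization step is worth having before a title is used as a filename. This is a sketch of one possible approach, not the skill's own logic:

```shell
# Hypothetical title sanitization (the SKILL.md does not spell out its method):
# replace path separators and other filesystem-hostile ASCII characters with '_'.
# tr pads the second set by repeating '_', so every listed character maps to it.
sanitize_title() {
  printf '%s' "$1" | tr '/\\:*?"<>|' '_'
}

sanitize_title 'C++ Tips: a/b testing?'
# prints: C++ Tips_ a_b testing_
```

Note this does nothing about collisions: two distinct titles can sanitize to the same name, so an overwrite check (or an appended counter) would still need to be specified separately.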
Install Mechanism
Instruction-only skill with no install spec and no code files — nothing is downloaded or written to disk by an installer. This is the lowest-risk install model.
Credentials
The skill requests no environment variables or credentials. Declared runtime dependencies (curl, grep -oP, sed) are reasonable for command-line scraping. No unrelated secrets or config paths are asked for.
Persistence & Privilege
The 'always' flag is false, and the skill does not request persistent or system-wide changes. It does not modify other skills or agent settings. Normal autonomous invocation is allowed (the default), which is not exceptional here.
Assessment
This is an instruction-only scraper that runs shell commands (curl, grep, sed) to download CNBlogs 'pick' articles and save them to a directory you choose. It does not ask for credentials or install code. Before running:
1. Test on a single page with a non-sensitive output directory to confirm behavior.
2. Avoid pointing output-dir at system or home configuration folders, to prevent accidental overwrites.
3. Be aware the parsing uses regex (grep -oP), which may be brittle or incompatible with some grep builds; commands may need adjustment on your system.
4. Consider the site's Terms of Service and rate limits; scraping can be blocked or disallowed.
5. For stronger safety, inspect or run the commands in a sandboxed environment first.
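The output-directory caution above can be enforced mechanically before any scraping runs. This is a hypothetical pre-flight guard, not part of the skill, and the blocked-path list is illustrative only:

```shell
# Hypothetical guard: refuse obviously dangerous output directories before
# the scraper writes any files. Extend the pattern list for your own system.
check_outdir() {
  case "$1" in
    ""|/|"$HOME"|/etc|/etc/*|/usr|/usr/*|/bin|/bin/*)
      echo "refusing unsafe output dir: '$1'" >&2
      return 1 ;;
    *)
      mkdir -p "$1" ;;
  esac
}

check_outdir /tmp/cnblogs-pick && echo "ready: /tmp/cnblogs-pick"
check_outdir /etc || echo "blocked /etc as expected"
```

Running the guard first, then pointing the skill's output-dir at the directory it created, keeps an agent's scraping writes away from home and system configuration paths.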

Like a lobster shell, security has layers — review code before you run it.

Version: latest (vk97dha7ntrcab121w47g9vmw1d84dn7c)

License

MIT-0
Free to use, modify, and redistribute. No attribution required.
