Young Post Downloader

v1.0.0

自动下载 Young Post Club Spark 网站文章并生成 PDF。使用场景：用户要求下载 Today's Read 文章、获取 Young Post 内容、批量抓取教育文章、生成学习材料 PDF、上传飞书云空间。支持：自动解析文章列表、批量下载、HTML 排版、PDF 转换、飞书上传。

⭐ 0· 109·0 current·0 all-time

by@cgxxxxxxxxxxxx

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for cgxxxxxxxxxxxx/young-post-downloader.

Previewing Install & Setup.

Prompt PreviewInstall & Setup

Install the skill "Young Post Downloader" (cgxxxxxxxxxxxx/young-post-downloader) from ClawHub.
Skill page: https://clawhub.ai/cgxxxxxxxxxxxx/young-post-downloader
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install young-post-downloader

ClawHub CLI

Package manager switcher

npx clawhub@latest install young-post-downloader

Security Scan

Capability signals

Requires OAuth token

These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.

VirusTotal

Suspicious

View report →

OpenClaw

Suspicious

medium confidence

ℹ

Purpose & Capability

Name/description, SKILL.md, and the two included scripts are coherent: they fetch Young Post Spark pages, parse article links, produce HTML/PDF, and upload to Feishu. The skill delegates web fetching and Feishu uploads to other platform skills (web_fetch, feishu_drive_file), which is expected. Minor issue: the skill relies on a local Chrome binary for PDF conversion and a workspace path (/home/admin/.openclaw/workspace) but does not declare the chrome binary or explain required Feishu credentials in the registry metadata.

✓

Instruction Scope

SKILL.md and the scripts restrict actions to the declared workflow: fetch Spark page(s), extract article links, download content, create HTML, convert to PDF, upload to Feishu. They do not instruct reading unrelated system files or contacting unknown endpoints. The web_fetch/feishu calls imply network I/O and uploading user data to Feishu, which is appropriate for the stated purpose.

✓

Install Mechanism

There is no install spec (instruction-only plus scripts bundled). No external downloads or archive extraction are used. This lowers installation risk; nothing is pulled from arbitrary URLs.

Credentials

The skill uses google-chrome to print to PDF and expects Feishu upload capability, but the registry metadata lists no required binaries or credentials. Specifically: (1) SKILL.md and the code call 'google-chrome' but 'required binaries' is empty — the runtime may fail if Chrome is absent. (2) The skill uploads to Feishu (feishu_drive_file) but declares no Feishu credentials; it appears to rely on another platform skill for auth, which is plausible, but you should confirm that Feishu authentication will be handled by the agent environment and that no undeclared secrets are needed. (3) It writes files to a default workspace (/home/admin/.openclaw/workspace) — verify this path is acceptable and writable and that it won't overwrite important data.

✓

Persistence & Privilege

The skill is not marked 'always: true' and does not request elevated or persistent system privileges. It does not modify other skills' configurations. Autonomous invocation is allowed (platform default) but that is expected for this kind of skill.

What to consider before installing

This skill appears to do what it says (download Spark articles, make HTML/PDF, upload to Feishu), but a few practical details are missing: 1) Confirm the agent environment provides the web_fetch and feishu_drive_file capabilities (those will handle network requests and Feishu authentication). 2) Ensure google-chrome (headless) is installed on the host — the skill runs 'google-chrome' but the registry didn't list any required binaries. 3) Check where files will be written (WORKSPACE defaults to /home/admin/.openclaw/workspace) and that you’re comfortable with that path and its permissions. 4) Verify Feishu credentials/permissions are available to whatever platform skill performs uploads; the skill itself does not declare or manage OAuth tokens. 5) Respect copyright and rate limits when scraping. If you need higher assurance, ask the author to (a) declare required binaries and credentials in the registry metadata, and (b) replace os.system calls with explicit, audited subprocess calls or platform-provided PDF conversion APIs.

Like a lobster shell, security has layers — review code before you run it.

latestvk97a375srd35prz8fqc8kyxv6s84bv6j

109downloads

0stars

1versions

Updated 3w ago

v1.0.0

MIT-0

Young Post Club Spark 文章下载器

自动化下载 Young Post Club Spark 网站的「Today's Read」文章，生成精美排版的 PDF 文档并上传到飞书云空间。

快速开始

当用户要求：

「下载 Young Post Spark 的文章」
「把 Today's Read 转成 PDF」
「获取这个网站的文章并上传飞书」
「批量下载教育文章」

触发此技能。

工作流程

1. 访问 Spark 页面 → 2. 解析文章列表 → 3. 批量下载 → 4. 生成 HTML → 5. 转换 PDF → 6. 上传飞书

使用脚本

完整流程

python scripts/download_articles.py

自定义参数

python scripts/download_articles.py \
  --url "https://www.youngpostclub.com/spark" \
  --output-dir "/path/to/workspace" \
  --upload-to-feishu true

详细步骤

步骤 1: 访问 Spark 页面

使用 web_fetch 工具获取 Spark 主页内容：

web_fetch(url="https://www.youngpostclub.com/spark", extractMode="markdown")

步骤 2: 解析文章列表

从页面提取所有文章链接，识别模式：

/yp/news/... - 新闻类
/spark/learning-zone/... - 学习区
/yp/being-well/... - 健康生活
/spark/share-us/... - 读者投稿

步骤 3: 批量下载文章

对每篇文章调用 web_fetch：

for article_url in article_urls:
    content = web_fetch(url=article_url, extractMode="markdown")
    articles.append({
        "title": content["title"],
        "content": content["text"],
        "url": article_url,
        "date": extract_date(content)
    })

步骤 4: 生成 HTML 文档

创建带目录、样式排版的 HTML：

添加可点击目录
按类别分组文章
保留原文引用
添加日期和分类标签

输出路径：{workspace}/young-post-spark-{date}.html

步骤 5: 转换为 PDF

使用 Chrome 无头模式：

google-chrome --headless --disable-gpu \
  --print-to-pdf={output}.pdf \
  --print-to-pdf-no-header \
  --paper-size=A4 \
  {input}.html

步骤 6: 上传飞书

使用 feishu_drive_file 上传：

feishu_drive_file(
    action="upload",
    file_path=pdf_path,
    file_name=f"Young Post Spark 文章合集 - {date}.pdf",
    size=file_size
)

输出文件

文件	说明	大小
`young-post-spark-{date}.html`	HTML 版本，可用 Word 打开	~40 KB
`young-post-spark-{date}.pdf`	PDF 版本，A4 打印优化	~300 KB

错误处理

常见问题

网页加载失败
- 检查网络连接
- 重试 web_fetch
- 使用备用 URL
PDF 转换失败
- 确认 Chrome 已安装：which google-chrome
- 检查 HTML 语法
- 增加超时时间
飞书上传失败
- 检查 OAuth 授权
- 确认文件大小限制 (<100MB)
- 验证文件名格式

自定义配置

修改输出目录

编辑脚本中的 WORKSPACE 变量：

WORKSPACE = "/your/custom/path"

添加文章过滤

只下载特定类别的文章：

CATEGORIES = ["news", "learning-zone"]  # 只下载这些类别

调整 PDF 样式

修改 HTML 中的 CSS：

body {
    font-family: "Calibri", "Arial", sans-serif;
    font-size: 11pt;
    line-height: 1.6;
}

使用示例

示例 1: 下载今日文章

用户：「下载 Young Post Spark 今天的文章」

1. 访问 spark 页面
2. 解析 17 篇文章
3. 批量下载
4. 生成 HTML + PDF
5. 上传飞书
→ 回复：✅ 已下载 17 篇文章并上传到飞书云空间

示例 2: 只要 PDF

用户：「把 Young Post 的文章转成 PDF」

跳过 HTML 预览，直接生成 PDF
→ 回复：📄 PDF 已生成并上传

示例 3: 指定类别

用户：「只下载学习区的文章」

过滤 URL 包含 learning-zone 的文章
→ 回复：📚 已下载 5 篇学习区文章

注意事项

版权说明: 在生成的文档中添加版权声明
使用限制: 仅供个人学习使用
频率限制: 避免短时间内大量请求
内容审核: 确保内容适合目标受众

更新日志

v1.0 (2026-04-05): 初始版本，支持完整流程

Comments

Loading comments...