Install
openclaw skills install shit-journal-scraper
Automates extraction and AI-based in-depth analysis of research papers from the academic journal shitjournal.org, capturing titles, abstracts, DOIs, and publication dates in JSON format.
npm install playwright jsdom
npx playwright install chromium
# Run the scraping task
node index.js
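This Skill implements its core logic in index.js:
playwright launches Chromium in headless mode.
goto opens the target site; once the JavaScript has rendered, the full HTML is retrieved.
jsdom builds a DOM tree and extracts the article nodes with the a[href^="/preprints"] selector.
// index.js core snippet: parser example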
const { JSDOM } = require('jsdom');

// Parse rendered HTML and return one record per article link.
async function extractArticles(html) {
  const dom = new JSDOM(html);
  const document = dom.window.document;
  return Array.from(document.querySelectorAll('a[href^="/preprints"]'))
    .map(el => ({
      title: el.querySelector('h4')?.textContent.trim(),
      abstract: el.querySelector('p')?.textContent.trim(),
      doi: el.querySelector('span:last-child')?.textContent.trim()
    }))
    // Drop links that lack a title or an abstract.
    .filter(art => art.title && art.abstract);
}
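The parser above expects HTML that has already been rendered. The fetch step described earlier (headless Chromium, goto, then reading the page HTML) might look roughly like the sketch below. The function name fetchRenderedHtml, the injectable chromium parameter, and the 'networkidle' wait condition are assumptions made for illustration; the source does not show how index.js actually waits for rendering.

```javascript
// Sketch only: fetchRenderedHtml is a hypothetical helper, not code from index.js.
// The second parameter defaults to Playwright's chromium and exists so that
// a stub can be injected when Playwright is not installed (e.g. in tests).
async function fetchRenderedHtml(url, chromium = require('playwright').chromium) {
  // Launch Chromium without a visible window.
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Assumption: waiting for network idle is enough for the JS-rendered list to appear.
    await page.goto(url, { waitUntil: 'networkidle' });
    // Serialize the rendered DOM back to an HTML string.
    return await page.content();
  } finally {
    // Always release the browser, even if navigation fails.
    await browser.close();
  }
}
```

Under these assumptions, and assuming the article list lives at the site root, extractArticles(await fetchRenderedHtml('https://shitjournal.org')) would yield the article array, which JSON.stringify(articles, null, 2) turns into JSON output.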
Created by OpenClaw Assistant for Excalibur9527.