Journal Deep Intel Extractor

v1.0.0

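A professional academic intelligence extraction tool. It supports major international journals such as Nature, Science, and Cell, automatically fetches Article or Review items added in the past N days, and extracts each item's PMID and full Abstract text, providing a core data source for AI-generated popular-science summaries.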

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description claim to collect PMIDs and abstracts from major journals; the code queries PubMed and visits PubMed detail pages to extract abstracts and titles. Requested resources (none) match the task. Minor wording mismatch: the README's “deep access” language could be read as retrieving full-text articles, but the implementation only fetches PubMed pages and abstracts.
Instruction Scope
SKILL.md instructs running the Python script with journal/type/days arguments. The runtime behavior is limited to HTTP GETs to pubmed.ncbi.nlm.nih.gov, HTML parsing, and writing a JSON file to ~/Documents/Journal_Intel. The script does not read other files, environment vars, or contact third-party endpoints beyond PubMed.
Install Mechanism
There is no install spec; the skill is instruction-only but ships a requirements.txt and a script. The dependencies (requests, beautifulsoup4, lxml) are reasonable for the task. The SKILL.md entry references venv/bin/python3, but no virtualenv creation step is provided. This is an operational mismatch, not a security issue, and you should be aware of it.
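One way to close the missing-virtualenv gap is sketched below using only Python's standard library. The `skill_dir` layout (a `venv` folder next to `requirements.txt`) is an assumption based on the paths mentioned above, and the helper names are hypothetical; none of this is part of the skill itself.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtualenv (Windows or POSIX layout)."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python3"


def ensure_venv(skill_dir: Path, with_pip: bool = True) -> Path:
    """Create skill_dir/venv if it is missing and return its interpreter path."""
    env_dir = skill_dir / "venv"  # assumed location, matching venv/bin/python3
    if not (env_dir / "pyvenv.cfg").exists():
        venv.create(env_dir, with_pip=with_pip)
    return venv_python(env_dir)


def install_requirements(skill_dir: Path) -> None:
    """Install requirements.txt into the virtualenv with the venv's own pip."""
    python = ensure_venv(skill_dir)
    requirements = skill_dir / "requirements.txt"
    subprocess.run(
        [str(python), "-m", "pip", "install", "-r", str(requirements)],
        check=True,  # raise if pip exits with an error
    )
```

Calling `install_requirements(Path("path/to/skill"))` once before the first run should leave an interpreter at the location SKILL.md expects.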
Credentials
The skill requires no environment variables, no credentials, and does not request unrelated secrets. Network access is only used for PubMed; User-Agent header is hard-coded in the script.
Persistence & Privilege
The skill writes output files under the user's home Documents folder (~/Documents/Journal_Intel). It is not always-enabled and does not modify other skills or system configuration. Autonomous invocation is allowed (platform default); combined with file writes, consider whether you want the agent to run this without manual review.
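If the default save location is a concern, the write step could be wrapped so the directory is explicit. The sketch below is illustrative only: `resolve_output_dir`, `save_records`, and the file-name pattern are hypothetical and not taken from the script; only the default path comes from the report above.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional

# Default location reported by the scan.
DEFAULT_DIR = Path.home() / "Documents" / "Journal_Intel"


def resolve_output_dir(override: Optional[str] = None) -> Path:
    """Use an explicit directory when given, otherwise the default one."""
    return Path(override).expanduser() if override else DEFAULT_DIR


def save_records(records: list, journal: str, out_dir: Path) -> Path:
    """Write records as UTF-8 JSON and return the path of the new file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: sanitized journal name plus today's date.
    safe = "".join(c if c.isalnum() else "_" for c in journal)
    path = out_dir / f"{safe}_{date.today().isoformat()}.json"
    with path.open("w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)
    return path
```

Passing a scratch directory to `resolve_output_dir` during review keeps test runs out of the Documents folder.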
Assessment
This skill appears to do what it says: it scrapes PubMed search results and article pages for PMIDs, titles, and abstracts and saves them as JSON in ~/Documents/Journal_Intel/. Before installing or running, consider:
(1) The description's phrase “deep access / full text (全文)” may imply retrieving paywalled full text, but this script only fetches PubMed pages and abstracts. If you expect full articles, you will need different code or credentials.
(2) Respect PubMed/NLM terms of use and robots.txt. If you plan frequent runs, increase the delay or use the official APIs (e.g., Entrez E-utilities) to avoid throttling.
(3) The SKILL.md assumes a virtualenv (venv/bin/python3) but provides no install step. Create a virtualenv and run pip install -r requirements.txt before running.
(4) The script writes files to your Documents folder. Confirm you are comfortable with that path and disk use, or modify the save location.
(5) No credentials or external endpoints are requested, and the code does not exfiltrate data beyond contacting PubMed.
If you need stronger guarantees, inspect and modify the script to use the Entrez APIs (with an API key) and add explicit error handling and rate limiting.
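The suggestion to move to Entrez E-utilities with rate limiting might look roughly like the sketch below. It only builds an ESearch URL and spaces out calls; it is not the skill's code. The helper names are hypothetical, and the choice of `edat` as the date type is an assumption about what "newly added" should mean.

```python
import time
from typing import Optional
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(journal: str, pub_type: str, days: int,
                      api_key: Optional[str] = None, retmax: int = 100) -> str:
    """Build an ESearch URL for one journal, publication type, and day window."""
    # PubMed's publication type for ordinary articles is "Journal Article".
    term = f'"{journal}"[Journal] AND "{pub_type}"[Publication Type]'
    params = {
        "db": "pubmed",
        "term": term,
        "reldate": days,      # only records from the last N days
        "datetype": "edat",   # assumption: filter on the date added to PubMed
        "retmax": retmax,
        "retmode": "json",
    }
    if api_key:
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"


class MinInterval:
    """Block so that consecutive wait() calls are at least 1/per_second apart."""

    def __init__(self, per_second: float):
        self.gap = 1.0 / per_second
        self.last = None

    def wait(self) -> None:
        if self.last is not None:
            remaining = self.gap - (time.monotonic() - self.last)
            if remaining > 0:
                time.sleep(remaining)
        self.last = time.monotonic()
```

NCBI's published guidance is at most 3 requests per second without an API key and 10 with one, so `MinInterval(3)` is a conservative default; call `wait()` before each request.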

Like a lobster shell, security has layers — review code before you run it.

latest vk97e5et986z4yx2xpzf3ra62tn8403ex
