论文关键词搜索和自动下载到指定目录 Keyword search for papers and automatic download to a specified directory

v1.0.1

Search and download related arXiv papers by topic plus date range, or from a seed paper title/id. Use when user asks to crawl related papers, collect arXiv a...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for ppingzhang/paper-search-and-download-automatically.

Prompt preview: Install & Setup
Install the skill "论文关键词搜索和自动下载到指定目录 Keyword search for papers and automatic download!" (ppingzhang/paper-search-and-download-automatically) from ClawHub.
Skill page: https://clawhub.ai/ppingzhang/paper-search-and-download-automatically
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install paper-search-and-download-automatically

ClawHub CLI


npx clawhub@latest install paper-search-and-download-automatically
Security Scan
VirusTotal: Pending
OpenClaw: Benign (high confidence)
Purpose & Capability
The name/description (search and download arXiv papers) matches the included script and SKILL.md. The script constructs arXiv API queries, parses the Atom feed, and downloads PDFs—exactly what the skill claims to do. No unrelated services, binaries, or credentials are requested.
Instruction Scope
SKILL.md instructs the agent to run the included Python script from the workspace root with clearly defined arguments and to report counts and output path. The script only reads its CLI args, performs HTTP(S) requests to arXiv endpoints, parses XML, and writes files to ./arxiv/. It does not attempt to read other files, environment variables, or external endpoints.
Install Mechanism
No install specification is provided (instruction-only + an included script). The script uses only Python standard library modules; nothing is downloaded or executed during install. This is low-risk and proportionate for the functionality.
Credentials
The skill requires no environment variables, credentials, or config paths. All network calls are to arXiv endpoints (export.arxiv.org and arxiv.org PDFs) which is appropriate for the described purpose.
Persistence & Privilege
The skill does not request always:true and does not modify other skills or system-wide config. It writes PDFs to a local ./arxiv/ directory (as expected) and has normal autonomous-invocation defaults.
Assessment
This skill appears to do what it says: query the arXiv API and download PDFs into ./arxiv/. Before installing/using it, consider: (1) run it in a workspace where writing ./arxiv/ is acceptable and you have disk space; (2) avoid very large max-results values to respect arXiv rate limits and site policies; (3) review/confirm any seed-title you pass (title matching can be imprecise); (4) verify you trust the skill source since the repository owner is unknown even though the code is small and uses only the Python standard library; and (5) if you need corporate/network auditability, run it where network egress to arxiv.org is allowed and logged.


latest: vk976zx73kkhjh1sswk84cyc2s984xwjw
75 downloads
1 star
2 versions
Updated 1w ago
v1.0.1
MIT-0

arXiv Related Papers Downloader

What this skill does

  • Accepts either:
    • topic + time range, or
    • seed paper (arXiv id or title)
  • Finds related papers from arXiv API.
  • Downloads PDF files into arxiv/.
  • Names files as vN_YYYYMMDD-Title.pdf, where vN is the arXiv version and YYYYMMDD is that version's date.

When to use

Use this skill when the user asks to:

  • crawl/search related papers by topic;
  • find related papers from one article;
  • download arXiv PDFs in batch;
  • save with a deterministic naming rule.

Required user input

The user must provide one of these modes:

  1. Topic mode

    • topic
    • start date (YYYY-MM-DD)
    • end date (YYYY-MM-DD)
  2. Seed paper mode

    • seed arXiv id (preferred) or seed title
    • optional start date / end date

Optional:

  • max results (default: 20)
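The two modes and the optional cap map naturally onto ordinary CLI flags. A minimal argparse sketch of the documented interface follows; the flag names are taken from the SKILL.md examples, but the real script's parser may differ in details:

```python
import argparse

def build_parser():
    """CLI mirroring the flags shown in the SKILL.md examples."""
    p = argparse.ArgumentParser(
        description="Search and download related arXiv papers.")
    p.add_argument("--topic", help="topic keywords (topic mode)")
    p.add_argument("--seed-id", help="seed arXiv id, e.g. 2401.12345v1 (seed mode)")
    p.add_argument("--seed-title", help="seed paper title (seed mode)")
    p.add_argument("--start-date", help="YYYY-MM-DD")
    p.add_argument("--end-date", help="YYYY-MM-DD")
    p.add_argument("--max-results", type=int, default=20)
    return p

args = build_parser().parse_args(
    ["--topic", "graph neural network", "--max-results", "5"])
```

The default of 20 for `--max-results` matches the documented default above.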

Execution steps

  1. Confirm missing parameters with the user.
  2. Run the script from workspace root:
# Topic mode
python ./scripts/download_arxiv.py \
  --topic "graph neural network" \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --max-results 20

# Seed mode by arXiv id
python ./scripts/download_arxiv.py \
  --seed-id "2401.12345v1" \
  --max-results 20

# Seed mode by title
python ./scripts/download_arxiv.py \
  --seed-title "Attention Is All You Need" \
  --start-date 2018-01-01 \
  --end-date 2024-12-31 \
  --max-results 20
  3. Report back:
    • how many papers were found;
    • how many PDFs were downloaded;
    • the output directory path.
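Under the hood, the steps above boil down to a single arXiv API request. A standard-library sketch of how the query URL might be built in topic mode; the exact `search_query` shape the script uses is an assumption, though the `submittedDate` range filter is standard arXiv API syntax:

```python
from urllib.parse import urlencode

def build_query_url(topic, start_date, end_date, max_results=20):
    """Build an arXiv API query URL for a topic plus a date range.

    start_date/end_date are YYYY-MM-DD strings; arXiv expects
    YYYYMMDDHHMM timestamps inside the submittedDate filter.
    """
    date_filter = "submittedDate:[{} TO {}]".format(
        start_date.replace("-", "") + "0000",
        end_date.replace("-", "") + "2359",
    )
    params = {
        "search_query": f'all:"{topic}" AND {date_filter}',
        "start": 0,
        "max_results": max_results,
    }
    return "https://export.arxiv.org/api/query?" + urlencode(params)

url = build_query_url("graph neural network", "2024-01-01", "2024-12-31")
```

The response is an Atom feed; the script then parses it (e.g. with `xml.etree.ElementTree`) to extract titles and PDF links.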

Output location and naming

  • Output dir: ./arxiv/ (auto-created if missing)
  • File naming rule:
    • v1_20240213-Your_Paper_Title.pdf
    • v3_20231105-Your_Paper_Title.pdf
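The naming rule above can be sketched as follows. The sanitization step (collapsing runs of non-alphanumeric characters into underscores) is an assumption, but it matches the example filenames:

```python
import re

def make_filename(version, date_yyyymmdd, title):
    """Build 'vN_YYYYMMDD-Title.pdf' from version, date, and raw title."""
    # Assumed sanitization: non-alphanumeric runs become single underscores.
    safe = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    return f"v{version}_{date_yyyymmdd}-{safe}.pdf"

name = make_filename(1, "20240213", "Your Paper Title")
# → "v1_20240213-Your_Paper_Title.pdf"
```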

Notes

  • HTTPS uses normal TLS verification only (no insecure certificate bypass).
  • The script only uses Python standard library.
  • If a paper has no PDF link or download fails, it is skipped with a warning.
  • Existing files are not downloaded again.
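The skip-and-warn behavior described in these notes can be sketched like this; `download_pdf` and its return values are hypothetical names for illustration, not the script's actual API:

```python
import os
from urllib.request import urlretrieve

def download_pdf(url, dest):
    """Fetch a PDF unless it already exists; warn and skip on failure."""
    if os.path.exists(dest):
        return "skipped"  # existing files are not downloaded again
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    try:
        # Plain stdlib fetch; TLS certificates are verified by default.
        urlretrieve(url, dest)
        return "downloaded"
    except OSError as e:  # URLError subclasses OSError
        print(f"warning: failed to download {url}: {e}")
        return "failed"
```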
