论文关键词搜索和自动下载到指定目录 Keyword search for papers and automatic download to a specified directory

v1.0.1

Search and download related arXiv papers by topic plus date range, or from a seed paper title/id. Use when user asks to crawl related papers, collect arXiv a...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for ppingzhang/paper-search-and-download-automatically.

Prompt preview: Install & Setup
Install the skill "论文关键词搜索和自动下载到指定目录 Keyword search for papers and automatic download!" (ppingzhang/paper-search-and-download-automatically) from ClawHub.
Skill page: https://clawhub.ai/ppingzhang/paper-search-and-download-automatically
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install paper-search-and-download-automatically

ClawHub CLI


npx clawhub@latest install paper-search-and-download-automatically
Security Scan
VirusTotal: Pending
OpenClaw: Benign (high confidence)
Purpose & Capability
The name/description (search and download arXiv papers) matches the included script and SKILL.md. The script constructs arXiv API queries, parses the Atom feed, and downloads PDFs—exactly what the skill claims to do. No unrelated services, binaries, or credentials are requested.
Instruction Scope
SKILL.md instructs the agent to run the included Python script from the workspace root with clearly defined arguments and to report counts and output path. The script only reads its CLI args, performs HTTP(S) requests to arXiv endpoints, parses XML, and writes files to ./arxiv/. It does not attempt to read other files, environment variables, or external endpoints.
Install Mechanism
No install specification is provided (instruction-only + an included script). The script uses only Python standard library modules; nothing is downloaded or executed during install. This is low-risk and proportionate for the functionality.
Credentials
The skill requires no environment variables, credentials, or config paths. All network calls are to arXiv endpoints (export.arxiv.org and arxiv.org PDFs) which is appropriate for the described purpose.
Persistence & Privilege
The skill does not request always:true and does not modify other skills or system-wide config. It writes PDFs to a local ./arxiv/ directory (as expected) and has normal autonomous-invocation defaults.
Assessment
This skill appears to do what it says: query the arXiv API and download PDFs into ./arxiv/. Before installing/using it, consider: (1) run it in a workspace where writing ./arxiv/ is acceptable and you have disk space; (2) avoid very large max-results values to respect arXiv rate limits and site policies; (3) review/confirm any seed-title you pass (title matching can be imprecise); (4) verify you trust the skill source since the repository owner is unknown even though the code is small and uses only the Python standard library; and (5) if you need corporate/network auditability, run it where network egress to arxiv.org is allowed and logged.


latest: vk976zx73kkhjh1sswk84cyc2s984xwjw
75 downloads
1 star
2 versions
Updated 1w ago
v1.0.1
MIT-0

arXiv Related Papers Downloader

What this skill does

  • Accepts either:
    • topic + time range, or
    • seed paper (arXiv id or title)
  • Finds related papers from arXiv API.
  • Downloads PDF files into arxiv/.
  • Names files as vN_YYYYMMDD-Title.pdf, where vN is the arXiv version and YYYYMMDD is that version's date.

When to use

Use this skill when the user asks to:

  • crawl/search related papers by topic;
  • find related papers from one article;
  • download arXiv PDFs in batch;
  • save with a deterministic naming rule.

Required user input

The user must provide one of these modes:

  1. Topic mode

    • topic
    • start date (YYYY-MM-DD)
    • end date (YYYY-MM-DD)
  2. Seed paper mode

    • seed arXiv id (preferred) or seed title
    • optional start date / end date

Optional:

  • max results (default: 20)
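The two modes and the optional cap map naturally onto ordinary CLI flags. A minimal argparse sketch of the documented interface follows; the flag names are taken from the SKILL.md examples, but the real script's parser may differ in details:

```python
import argparse

def build_parser():
    """CLI mirroring the flags shown in the SKILL.md examples."""
    p = argparse.ArgumentParser(
        description="Search and download related arXiv papers.")
    p.add_argument("--topic", help="topic keywords (topic mode)")
    p.add_argument("--seed-id", help="seed arXiv id, e.g. 2401.12345v1 (seed mode)")
    p.add_argument("--seed-title", help="seed paper title (seed mode)")
    p.add_argument("--start-date", help="YYYY-MM-DD")
    p.add_argument("--end-date", help="YYYY-MM-DD")
    p.add_argument("--max-results", type=int, default=20)
    return p

args = build_parser().parse_args(
    ["--topic", "graph neural network", "--max-results", "5"])
```

The default of 20 for `--max-results` matches the documented default above.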

Execution steps

  1. Confirm missing parameters with the user.
  2. Run the script from workspace root:
# Topic mode
python ./scripts/download_arxiv.py \
  --topic "graph neural network" \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --max-results 20

# Seed mode by arXiv id
python ./scripts/download_arxiv.py \
  --seed-id "2401.12345v1" \
  --max-results 20

# Seed mode by title
python ./scripts/download_arxiv.py \
  --seed-title "Attention Is All You Need" \
  --start-date 2018-01-01 \
  --end-date 2024-12-31 \
  --max-results 20
  3. Report back:
    • how many papers were found;
    • how many PDFs were downloaded;
    • the output directory path.
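Under the hood, the steps above boil down to a single arXiv API request. A standard-library sketch of how the query URL might be built in topic mode; the exact `search_query` shape the script uses is an assumption, though the `submittedDate` range filter is standard arXiv API syntax:

```python
from urllib.parse import urlencode

def build_query_url(topic, start_date, end_date, max_results=20):
    """Build an arXiv API query URL for a topic plus a date range.

    start_date/end_date are YYYY-MM-DD strings; arXiv expects
    YYYYMMDDHHMM timestamps inside the submittedDate filter.
    """
    date_filter = "submittedDate:[{} TO {}]".format(
        start_date.replace("-", "") + "0000",
        end_date.replace("-", "") + "2359",
    )
    params = {
        "search_query": f'all:"{topic}" AND {date_filter}',
        "start": 0,
        "max_results": max_results,
    }
    return "https://export.arxiv.org/api/query?" + urlencode(params)

url = build_query_url("graph neural network", "2024-01-01", "2024-12-31")
```

The response is an Atom feed; the script then parses it (e.g. with `xml.etree.ElementTree`) to extract titles and PDF links.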

Output location and naming

  • Output dir: ./arxiv/ (auto-created if missing)
  • File naming rule:
    • v1_20240213-Your_Paper_Title.pdf
    • v3_20231105-Your_Paper_Title.pdf
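The naming rule above can be sketched as follows. The sanitization step (collapsing runs of non-alphanumeric characters into underscores) is an assumption, but it matches the example filenames:

```python
import re

def make_filename(version, date_yyyymmdd, title):
    """Build 'vN_YYYYMMDD-Title.pdf' from version, date, and raw title."""
    # Assumed sanitization: non-alphanumeric runs become single underscores.
    safe = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    return f"v{version}_{date_yyyymmdd}-{safe}.pdf"

name = make_filename(1, "20240213", "Your Paper Title")
# → "v1_20240213-Your_Paper_Title.pdf"
```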

Notes

  • HTTPS uses normal TLS verification only (no insecure certificate bypass).
  • The script only uses Python standard library.
  • If a paper has no PDF link or download fails, it is skipped with a warning.
  • Existing files are not downloaded again.
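The skip-and-warn behavior described in these notes can be sketched like this; `download_pdf` and its return values are hypothetical names for illustration, not the script's actual API:

```python
import os
from urllib.request import urlretrieve

def download_pdf(url, dest):
    """Fetch a PDF unless it already exists; warn and skip on failure."""
    if os.path.exists(dest):
        return "skipped"  # existing files are not downloaded again
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    try:
        # Plain stdlib fetch; TLS certificates are verified by default.
        urlretrieve(url, dest)
        return "downloaded"
    except OSError as e:  # URLError subclasses OSError
        print(f"warning: failed to download {url}: {e}")
        return "failed"
```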
