Arxiv Paper Reader

v1.0.3

Search arXiv by keyword, filter by submitted date range, fetch arXiv papers from an arXiv ID or URL, convert papers into Markdown and PDF files in the worksp...

0· 175·1 current·1 all-time

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for elio040208/arxiv-paper-reader.

Previewing Install & Setup.
Prompt PreviewInstall & Setup
Install the skill "Arxiv Paper Reader" (elio040208/arxiv-paper-reader) from ClawHub.
Skill page: https://clawhub.ai/elio040208/arxiv-paper-reader
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install arxiv-paper-reader

ClawHub CLI

Package manager switcher

npx clawhub@latest install arxiv-paper-reader
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Benign
high confidence
Purpose & Capability
Name/description (search/fetch/convert arXiv papers) matches the code and declared requirements. The scripts only require a Python interpreter and interact with arXiv endpoints (export.arxiv.org and arxiv.org). No unrelated binaries, credentials, or config paths are requested.
Instruction Scope
SKILL.md instructs the agent to run the included Python scripts, read/write files under workspace directories (artifacts/, topics.json, runs/), and to only accept raw arXiv IDs or arXiv HTTPS URLs. The instructions emphasize safety (pass args as single CLI args, strict TLS) and tell the agent to read the generated search_results.md/json and produced paper files. There is no instruction to read unrelated system files or environment variables.
Install Mechanism
No install spec — instruction-only with bundled scripts. This is low-risk from an installer perspective because nothing is downloaded or installed automatically. The only runtime requirement is a local Python interpreter.
Credentials
No environment variables, credentials, or config paths are required. The scripts write outputs to workspace subdirectories (artifacts/arxiv* and a configurable root-dir) which is appropriate for the described functionality.
Persistence & Privilege
always is false and the skill does not request persistent platform privileges. It does create and update files under user-specified or default workspace directories (artifacts, runs, sync_state.json). That file I/O is expected for an archiving/syncing tool.
Assessment
This skill appears coherent and limited to arXiv interactions, but exercise normal caution: 1) run it in a controlled workspace (it will create artifacts/ and monitor topic folders), 2) inspect the bundled scripts yourself before running (they will execute arbitrary Python locally), 3) ensure your environment has a proper CA bundle (the code enforces TLS and references certifi), and 4) don't point it at non-arXiv URLs — the code enforces allowed hosts but follow the SKILL.md rule to only pass raw arXiv IDs or arXiv HTTPS URLs. If you need higher assurance, run the scripts in an isolated environment (container/VM) and review the remaining truncated code paths before granting broad autonomous invocation.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

OSWindows · Linux · macOS
Any binpython, python3
latestvk97e0egzgmmbxfdrasgjtx99k983qa5t
175downloads
0stars
4versions
Updated 1mo ago
v1.0.3
MIT-0
Windows, Linux, macOS

arXiv Paper Reader

Use the bundled Python scripts before reasoning about arXiv content. They handle:

  • searching arXiv by keyword
  • filtering keyword results by submitted date range
  • downloading arXiv metadata and paper content
  • converting papers to Markdown and PDF in the workspace
  • syncing configured topics into daily archive folders

Inputs

  • Accept raw arXiv IDs like 1706.03762 or URLs such as https://arxiv.org/abs/1706.03762.
  • Only accept raw IDs or HTTPS arXiv URLs on arxiv.org, www.arxiv.org, or export.arxiv.org.
  • Accept keyword searches such as transformer, diffusion, or computer vision.
  • Accept optional submitted-date windows using YYYY-MM-DD.
  • Do not use category filters or alias-based domain shortcuts; search is intentionally keyword-only.

Search workflow

  1. Pick a Python command:
    • Prefer python
    • Fall back to python3
  2. If the user wants search results or the latest papers for a topic, run:
python {baseDir}/scripts/search_arxiv.py --query "<keywords>" --limit <n>
  1. Read search_results.md and search_results.json.
  2. Use {baseDir}/references/search-usage.md to present the results.
  3. If the user asks for the latest papers matching a keyword, pass --sort submittedDate.
  4. If the user wants the default best-match ranking, omit --sort and let the script use relevance order.
  5. If the user gives a date window, add --start-date YYYY-MM-DD --end-date YYYY-MM-DD.

Topic sync workflow

  1. Tell the user to maintain {rootDir}/topics.json, or seed it from {baseDir}/references/topics.example.json.
  2. For recurring daily updates, run:
python {baseDir}/scripts/sync_arxiv_topics.py --daily --root-dir <root-dir>
  1. For manual backfill, run:
python {baseDir}/scripts/sync_arxiv_topics.py --start-date YYYY-MM-DD --end-date YYYY-MM-DD --root-dir <root-dir>
  1. Read <root-dir>/runs/<capture-date>/run_manifest.md first.
  2. Each captured paper lives at topics/<topic-slug>/<capture-date>/<paper-id>__<title-slug>/.
  3. Expect each paper directory to contain paper.pdf, paper.md, metadata.json, and summary.md.
  4. The batch summary is template-based and grounded in the abstract plus converted Markdown; treat it as a review aid, not a substitute for reading the paper.

Fetch workflow

  1. Choose an output directory:
    • If the user gives one, use it.
    • Otherwise write to ./artifacts/arxiv/<paper-id>/ in the current workspace.
  2. Run the converter:
python {baseDir}/scripts/arxiv_to_md.py <paper-id-or-url> --output-dir <target-dir>
  1. Read the generated paper.pdf, paper.md, and metadata.json.
  2. Summarize the paper in Markdown.
  3. Save the summary to <target-dir>/summary.md if the user asked for files. Otherwise return the summary directly in chat.

Summary format

Use the headings in {baseDir}/references/summary-format.md.

Keep the summary grounded in the generated Markdown. If the conversion falls back to abstract-only mode, say so explicitly in the summary.

Safety

  • Pass IDs, URLs, and keywords as single CLI arguments. Do not splice untrusted text into shell pipelines.
  • Only pass raw arXiv IDs or HTTPS arXiv URLs; reject arbitrary third-party URLs.
  • TLS verification is strict. If requests fail because your machine lacks a valid CA bundle, install certifi or fix the system trust store.
  • arXiv source archives are processed in-memory, only .tex members are read, and suspicious paths plus oversized payloads are rejected before parsing.
  • Date windows use arXiv submittedDate and inclusive YYYY-MM-DD boundaries.
  • Do not invent claims that are not supported by paper.md or search_results.md.
  • Do not reintroduce hardcoded category or alias mappings; keep search behavior keyword-only.

Comments

Loading comments...