Install
openclaw skills install arxiv-summarizer-orchestrator
Orchestrates end-to-end arXiv paper retrieval, processing, and batch reporting with language control and parallel or serial paper handling modes.
Run the full pipeline by composing three sub-skills.
- arxiv-search-collector
- arxiv-paper-processor
- arxiv-batch-reporter

- language: manual language parameter used by all stages. Default is English when omitted.
- paper_processing_mode: subagent_parallel or serial.
- max_parallel_papers: default 5 when paper_processing_mode=subagent_parallel.

- arxiv-search-collector/scripts/init_collection_run.py.
- query_plan.json (label + query only).
- arxiv-search-collector/scripts/fetch_queries_batch.py with the plan file (recommended).
- arxiv-search-collector/scripts/fetch_query_metadata.py manually for one-by-one fetch.
- arxiv-search-collector/scripts/merge_selected_papers.py.
- --incremental and updated selection-json.
- []) to explicitly drop them.

Pass --language <LANG> to collector scripts so all generated markdown files in Stage A follow the selected language.
Use serial query fetch in Stage A with conservative controls (for example --min-interval-sec 5, --retry-max 4).
Default collector settings already include retries/backoff and run-local throttle state (<run_dir>/.runtime/arxiv_api_state.json), so manual tuning is usually unnecessary.
Prefer cache reuse (no --force) unless query parameters changed or data refresh is required.
Output: one run directory with per-paper metadata subdirectories.
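The conservative fetch controls mentioned above (a minimum interval between requests plus bounded retries with backoff) can be sketched as follows. This is only an illustration of the idea, not the collector's actual implementation: the class name ThrottledFetcher, the exponential backoff schedule, and the reading of retry_max as "retries after the first attempt" are all assumptions.

```python
import time


class ThrottledFetcher:
    """Hypothetical serial fetcher: minimum gap between calls, bounded retries."""

    def __init__(self, min_interval_sec=5.0, retry_max=4,
                 sleep=time.sleep, clock=time.monotonic):
        self.min_interval_sec = min_interval_sec
        self.retry_max = retry_max      # assumed: retries after the first attempt
        self._sleep = sleep             # injectable so the logic can be tested
        self._clock = clock
        self._last_call = None

    def _wait_for_slot(self):
        # Sleep only for the part of the minimum interval that has not elapsed.
        if self._last_call is not None:
            remaining = self.min_interval_sec - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
        self._last_call = self._clock()

    def call(self, fetch):
        attempt = 0
        while True:
            self._wait_for_slot()
            try:
                return fetch()
            except Exception:
                if attempt >= self.retry_max:
                    raise
                # Assumed backoff schedule: interval * 1, 2, 4, ...
                self._sleep(self.min_interval_sec * (2 ** attempt))
                attempt += 1
```

Because requests go through a single fetcher one at a time, the interval is enforced across all queries in a run, which is the point of keeping Stage A serial.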
For each paper directory, invoke sub-skill arxiv-paper-processor once and let that skill produce <paper_dir>/summary.md.
Recommended pre-step for many papers:
python3 arxiv-paper-processor/scripts/download_papers_batch.py \
--run-dir /path/to/run \
--artifact source_then_pdf \
--max-workers 3 \
--min-interval-sec 5 \
--language <LANG>
Per-paper execution steps (inside arxiv-paper-processor):
- If <paper_dir>/summary.md already exists and is complete, skip this paper.
- If source (source/source_extract/*.tex) or PDF (source/paper.pdf) already exists, skip download.
- arxiv-paper-processor/scripts/download_arxiv_source.py.
- arxiv-paper-processor/scripts/download_arxiv_pdf.py.
- <paper_dir>/summary.md by reference format, in language.

Parallel strategy for many papers:
- paper_processing_mode=subagent_parallel with max_parallel_papers=5.
- paper_processing_mode=serial to process one paper at a time.
- arxiv-paper-processor instances in batches; concurrent papers must not exceed max_parallel_papers.
- arxiv-paper-processor instance at a time.

Output: all paper directories contain summary.md.
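A minimal sketch of the Stage B scheduling rules (skip papers that already have a summary, then run either serially or with a concurrency cap) might look like this. Several things here are assumptions made for illustration: process_one is a hypothetical callable standing in for one arxiv-paper-processor invocation, a thread pool stands in for sub-agent dispatch, and "complete" is approximated as "summary.md exists and is non-empty".

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def pending_papers(paper_dirs):
    """Return paper directories whose summary.md is missing or empty."""
    pending = []
    for d in paper_dirs:
        summary = Path(d) / "summary.md"
        # Assumed completeness check: the file exists and has content.
        if not (summary.is_file() and summary.stat().st_size > 0):
            pending.append(Path(d))
    return pending


def process_papers(paper_dirs, process_one,
                   mode="subagent_parallel", max_parallel_papers=5):
    """Run process_one on every pending paper; return the list processed."""
    todo = pending_papers(paper_dirs)
    if mode == "serial":
        for d in todo:
            process_one(d)
    elif mode == "subagent_parallel":
        # The pool size caps how many papers are in flight at once.
        with ThreadPoolExecutor(max_workers=max_parallel_papers) as pool:
            list(pool.map(process_one, todo))  # list() surfaces worker errors
    else:
        raise ValueError(f"unknown paper_processing_mode: {mode}")
    return todo
```

Re-running the function after a partial failure only touches papers that still lack a summary, which matches the skip rule in the per-paper steps.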
- arxiv-batch-reporter/scripts/collect_summaries_bundle.py --language <LANG>.
- summaries_bundle.md and writes collection_report_template.md in base dir.
- {{ARXIV_BRIEF:<arxiv_id>}}.
- arxiv-batch-reporter/scripts/render_collection_report.py to generate final collection_report.md.
- summary.md section 10 via script injection.

If language is non-English (for example Chinese), all intermediate markdown files and final reports should follow that language.
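To illustrate how {{ARXIV_BRIEF:<arxiv_id>}} placeholders in a template could be filled by script injection, here is a small sketch. It is not the code of render_collection_report.py: the function name render_report and the briefs mapping (arXiv ID to brief text, standing in for whatever the real script extracts from each summary.md) are assumptions, and leaving unknown placeholders untouched is a design choice of this sketch.

```python
import re

# Matches {{ARXIV_BRIEF:<arxiv_id>}} and captures the ID.
PLACEHOLDER = re.compile(r"\{\{ARXIV_BRIEF:([^}]+)\}\}")


def render_report(template, briefs):
    """Replace each placeholder with the brief for its arXiv ID."""
    def substitute(match):
        arxiv_id = match.group(1).strip()
        # Unknown IDs keep their placeholder so missing briefs stay visible.
        return briefs.get(arxiv_id, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # The IDs below are made up for the example.
    template = "Paper A: {{ARXIV_BRIEF:2401.00001}}\nPaper B: {{ARXIV_BRIEF:2401.00002}}"
    print(render_report(template, {"2401.00001": "Brief for paper A."}))
```

Using a replacement function rather than a replacement string means backslashes or group references inside a brief are inserted literally.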
This orchestrator is suitable for cron/scheduled execution in OpenClaw:
- 1d, 7d, 30d) when initializing runs.
- <output-root>/<topic>-<timestamp>-<range>/
- task_meta.json, task_meta.md
- query_results/, query_selection/
- <arxiv_id>/metadata.md + downloaded source/pdf + summary.md
- summaries_bundle.md
- collection_report_template.md
- collection_report.md)

Use references/workflow-checklist.md as execution checklist.
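For scheduled runs, a directory name following the <topic>-<timestamp>-<range> pattern could be built along these lines. This is a hedged sketch only: the helper name run_dir_name, the slug rule for the topic, and the UTC timestamp format are assumptions, since the source does not specify them.

```python
import re
from datetime import datetime, timezone


def run_dir_name(topic, lookback, now=None):
    """Build '<topic>-<timestamp>-<range>' for a run directory (assumed format)."""
    if not re.fullmatch(r"\d+d", lookback):
        raise ValueError(f"expected a range like 1d, 7d or 30d, got {lookback!r}")
    now = now or datetime.now(timezone.utc)
    # Assumed slug rule: lowercase, non-alphanumerics collapsed to '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    # Assumed timestamp format: YYYYMMDD-HHMMSS in UTC.
    return f"{slug}-{now.strftime('%Y%m%d-%H%M%S')}-{lookback}"
```

Including the timestamp keeps repeated cron runs for the same topic and range from overwriting each other.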
This is the top-level orchestration skill.
Before using it, install and enable these three sub-skills:
- arxiv-search-collector
- arxiv-paper-processor
- arxiv-batch-reporter

Execution order inside this orchestrator:
1. arxiv-search-collector (Stage A)
2. arxiv-paper-processor (Stage B)
3. arxiv-batch-reporter (Stage C)