Install
openclaw skills install agentic-paper-digest-skill

Fetches and summarizes recent arXiv and Hugging Face papers with Agentic Paper Digest. Use when the user wants a paper digest, a JSON feed of recent papers, or to run the arXiv/HF pipeline.

Requirements: OPENAI_API_KEY, or an OpenAI-compatible provider via LITELLM_API_BASE + LITELLM_API_KEY. git is optional for bootstrap; if it is missing, curl/wget (or Python) is used to download the repo.

Bootstrap the project:

bash "{baseDir}/scripts/bootstrap.sh"

To install somewhere other than the default, set PROJECT_DIR:

PROJECT_DIR="$HOME/agentic_paper_digest" bash "{baseDir}/scripts/bootstrap.sh"
Run the CLI for a one-off digest:

bash "{baseDir}/scripts/run_cli.sh"
bash "{baseDir}/scripts/run_cli.sh" --window-hours 24 --sources arxiv,hf

Run the API server and exercise it:

bash "{baseDir}/scripts/run_api.sh"
curl -X POST http://127.0.0.1:8000/api/run
curl http://127.0.0.1:8000/api/status
curl http://127.0.0.1:8000/api/papers

Stop the API server:

bash "{baseDir}/scripts/stop_api.sh"
Passing --json to the CLI prints run_id, seen, kept, window_start, and window_end. Results are stored in data/papers.sqlite3 (under PROJECT_DIR). The API exposes POST /api/run, GET /api/status, GET /api/papers, GET/POST /api/topics, and GET/POST /api/settings.

Config files live in PROJECT_DIR/config. Environment variables can be set in the shell or via a .env file. The wrappers here auto-load .env from PROJECT_DIR (override with ENV_FILE=/path/to/.env).
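Since the CLI's --json output is meant for programmatic use, a short sketch of consuming it may help; the field names come from this doc, while the payload values below are made up for illustration:

```python
import json

# Illustrative --json payload from run_cli.sh; field names are documented
# above, the values here are invented for the example.
raw = (
    '{"run_id": "r-001", "seen": 120, "kept": 8,'
    ' "window_start": "2024-05-01T00:00:00Z",'
    ' "window_end": "2024-05-02T00:00:00Z"}'
)
run = json.loads(raw)
print(f'{run["kept"]}/{run["seen"]} papers kept for run {run["run_id"]}')
```

In practice you would pipe the CLI's stdout into json.loads instead of a literal string.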
Environment (.env or exported vars)
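The variables below can be collected in PROJECT_DIR/.env; a minimal sketch, with placeholder values to replace:

```shell
# Minimal .env sketch -- every value here is a placeholder.
OPENAI_API_KEY=sk-...              # or use LITELLM_API_BASE + LITELLM_API_KEY
LITELLM_MODEL_RELEVANCE=gpt-4o-mini
LITELLM_MODEL_SUMMARY=gpt-4o-mini  # optional; defaults to the relevance model
WINDOW_HOURS=24
ARXIV_CATEGORIES=cs.CL,cs.AI,cs.LG
```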
OPENAI_API_KEY: required for OpenAI models (litellm reads this).
LITELLM_API_BASE, LITELLM_API_KEY: use an OpenAI-compatible proxy/provider.
LITELLM_MODEL_RELEVANCE, LITELLM_MODEL_SUMMARY: models for relevance and summarization (summary defaults to the relevance model if unset).
LITELLM_TEMPERATURE_RELEVANCE, LITELLM_TEMPERATURE_SUMMARY: lower for more deterministic output.
LITELLM_MAX_RETRIES: retry count for LLM calls.
LITELLM_DROP_PARAMS=1: drop unsupported params to avoid provider errors.
WINDOW_HOURS, APP_TZ: recency window and timezone.
ARXIV_CATEGORIES: comma-separated categories (default includes cs.CL, cs.AI, cs.LG, stat.ML, cs.CR).
ARXIV_API_BASE, HF_API_BASE: override source endpoints if needed.
ARXIV_MAX_RESULTS, ARXIV_PAGE_SIZE: arXiv paging limits.
MAX_CANDIDATES_PER_SOURCE: cap candidates per source before LLM filtering.
FETCH_TIMEOUT_S, REQUEST_TIMEOUT_S: source fetch and per-request timeouts.
ENABLE_PDF_TEXT=1: include first-page PDF text in summaries; requires PyMuPDF (pip install pymupdf).
DATA_DIR: location for papers.sqlite3.
CORS_ORIGINS: comma-separated origins allowed by the API server (UI use).
TOPICS_PATH, SETTINGS_PATH, AFFILIATION_BOOSTS_PATH: override the default config file locations.
Config files

config/topics.json: list of topics with id, label, description, max_per_topic, and keywords. The relevance classifier must output topic IDs exactly as defined here. max_per_topic also caps results in GET /api/papers when apply_topic_caps=1.
config/settings.json: overrides fetch limits (arxiv_max_results, arxiv_page_size, fetch_timeout_s, max_candidates_per_source). Updated via POST /api/settings.
config/affiliations.json: list of {pattern, weight} boosts applied by substring match over affiliations. Weights add up and are capped at 1.0. Invalid JSON disables boosts, so keep the file strict JSON (no trailing commas).

Customization

When customizing, review config/topics.json, config/settings.json, and config/affiliations.json (if present).
Topics: edit config/topics.json (topics[].id/label/description/keywords, max_per_topic).
Recency: change WINDOW_HOURS (or pass --window-hours to the CLI) only if the user cares; otherwise keep the default of 24h.
Sources and limits: ARXIV_CATEGORIES, ARXIV_MAX_RESULTS, ARXIV_PAGE_SIZE, MAX_CANDIDATES_PER_SOURCE.
Keys and models: OPENAI_API_KEY or LITELLM_API_KEY (+ LITELLM_API_BASE if using a proxy), and set LITELLM_MODEL_RELEVANCE/LITELLM_MODEL_SUMMARY.
Install location: use PROJECT_DIR="$HOME/agentic_paper_digest" if the user doesn't care. Never hardcode /Users/... paths.
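As an illustration, a config/topics.json with a single entry could look like this; the schema fields are the documented ones, while the topic itself and its values are hypothetical:

```json
{
  "topics": [
    {
      "id": "llm-safety",
      "label": "LLM Safety",
      "description": "Alignment, jailbreaks, and safety evaluations for large language models.",
      "max_per_topic": 5,
      "keywords": ["alignment", "jailbreak", "red teaming"]
    }
  ]
}
```

Remember that the relevance classifier must emit these id values verbatim, and that the file must stay strict JSON (no trailing commas).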
If .env is missing, create it from .env.example (in the repo), then ask the user to fill in keys and any requested preferences. Confirm OPENAI_API_KEY or LITELLM_API_KEY is set before running. Apply topic and settings changes via POST /api/topics and POST /api/settings if running the API.

Prefer scripts/run_cli.sh for one-off JSON output. Start scripts/run_api.sh only if the user explicitly asks for UI/API access or polling.

Tips

Raise WINDOW_HOURS or ARXIV_MAX_RESULTS (or broaden topics) when results are sparse; lower them if results are too noisy.
Restrict ARXIV_CATEGORIES to your research domains.
Enable first-page PDF text (ENABLE_PDF_TEXT=1) when abstracts are too thin.

Troubleshooting

Port already in use: run bash "{baseDir}/scripts/stop_api.sh" or pass --port to the API command.
Empty results: widen WINDOW_HOURS or verify the API key in .env.
Auth errors: export OPENAI_API_KEY or LITELLM_API_KEY in the shell before running.
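The key check above can be sketched as a small pre-flight snippet; require_llm_key is a hypothetical helper for illustration, not part of the skill's scripts:

```shell
# Pre-flight sketch: require_llm_key (hypothetical) succeeds when either
# supported key variable is set, mirroring the check described above.
require_llm_key() {
  [ -n "${OPENAI_API_KEY:-}" ] || [ -n "${LITELLM_API_KEY:-}" ]
}

if require_llm_key; then
  echo "LLM key found"
else
  echo "Set OPENAI_API_KEY or LITELLM_API_KEY (e.g. in PROJECT_DIR/.env)" >&2
fi
```

Running this before run_cli.sh turns the most common failure (auth errors) into an immediate, readable message.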