{"skill":{"slug":"paper-notion-summarizer","displayName":"paper-notion-summarizer","summary":"Fetch paper metadata by title or arXiv/DOI link, create a deep structured summary, and post it as a Notion page. The agent reads the paper and writes a semin...","description":"---\nname: paper-notion-summarizer\ndescription: Fetch paper metadata by title or arXiv/DOI link, create a deep structured summary, and post it as a Notion page. The agent reads the paper and writes a seminar-quality summary adapted to the user's language.\n---\n\n# Paper Notion Summarizer\n\n## Purpose\n\nGiven a paper title or an arXiv/DOI link, create a **seminar-quality, deeply structured summary** and upload it to Notion.\n\n> ⚠️ This is NOT an extractive summary. The **agent reads the full paper and writes an original analysis**.\n\n## Language Adaptation\n\n**Write the summary in the same language as the user's request.**\n- If the user writes in Korean → write in Korean (technical terms in English)\n- If the user writes in English → write in English\n- If the user writes in Japanese → write in Japanese\n- And so on for any language.\n\nSection headings in the summary JSON should match the user's language. 
The English template below is the canonical structure — adapt headings to the user's language.\n\n## Workflow (3 phases)\n\n### Phase 1: Extract paper content\n\n```bash\npython3 scripts/extract_paper.py \\\n  --output /tmp/paper_extract.json \\\n  \"https://arxiv.org/abs/2301.12345\"\n```\n\nOr by title:\n```bash\npython3 scripts/extract_paper.py \\\n  --output /tmp/paper_extract.json \\\n  --title \"Attention Is All You Need\"\n```\n\nOptions:\n- `--output`, `-o`: Output file path (defaults to stdout)\n- `--skip-fulltext`: Extract abstract only (fast mode, skip PDF)\n- `--doi`: Explicit DOI\n- `--arxiv-id`: Explicit arXiv ID\n\n### Phase 2: Agent reads and writes the summary\n\nRead the extracted JSON section by section (`read` tool with `offset`/`limit` for large files), then write a structured summary JSON to `/tmp/paper_summary.json`.\n\n#### Reading strategy (context management)\n- Read Abstract → Introduction → Method → Experiments → Conclusion in order\n- For long papers, read in chunks and accumulate understanding\n- Focus on: core idea, key equations, experimental setup, main results, ablations\n\n#### Summary JSON template\n\n```json\n{\n  \"title\": \"Paper Title (original language)\",\n  \"metadata\": {\n    \"authors\": \"Author list\",\n    \"year\": \"2024\",\n    \"venue\": \"NeurIPS 2024\",\n    \"doi\": \"10.xxxx/xxxxx\",\n    \"url\": \"https://arxiv.org/abs/xxxx.xxxxx\",\n    \"source\": \"arXiv\"\n  },\n  \"sections\": [\n    {\n      \"heading\": \"0. Metadata\",\n      \"content\": \"- Authors: ...\\n- Year: ...\\n- Venue: ...\\n- Code: ...\"\n    },\n    {\n      \"heading\": \"1. One-line Summary\",\n      \"content\": \"What this paper does in one sentence.\"\n    },\n    {\n      \"heading\": \"2. Problem & Motivation\",\n      \"content\": \"- What problem does it solve?\\n- Why are existing methods insufficient?\\n- Why is this research needed?\"\n    },\n    {\n      \"heading\": \"3. Key Contributions\",\n      \"content\": \"1. 
First contribution\\n2. Second contribution\\n3. Third contribution\"\n    },\n    {\n      \"heading\": \"4. Method\",\n      \"content\": \"Detailed pipeline/architecture description.\\nCore ideas, key equations included.\\n\\n### Core Idea\\n...\\n\\n### Architecture\\n...\\n\\n### Training\\n...\\n\\n### Key Equations\\n$$equation$$\"\n    },\n    {\n      \"heading\": \"5. Experiments\",\n      \"content\": \"### Setup\\n- Datasets: ...\\n- Baselines: ...\\n- Metrics: ...\\n\\n### Main Results\\n- Key numbers and comparisons\\n- Where it works and where it doesn't\"\n    },\n    {\n      \"heading\": \"6. Ablation & Analysis\",\n      \"content\": \"- Per-component contributions\\n- Interesting analysis results\\n- Hyperparameter sensitivity\"\n    },\n    {\n      \"heading\": \"7. Limitations & Future Work\",\n      \"content\": \"- Author-acknowledged limitations\\n- Additional limitations you identify\\n- Future research directions\"\n    },\n    {\n      \"heading\": \"8. Overall Assessment\",\n      \"content\": \"- Research significance\\n- Strengths and weaknesses\\n- Connections to related work\\n- Ideas applicable to user's research\"\n    }\n  ]\n}\n```\n\n#### Quality guidelines\n\n1. **Terminology**: Keep technical terms in their original language; explanations in the user's language.\n2. **Equations**: Include key equations in LaTeX (`$$ ... $$`).\n3. **Depth**: Seminar-presentation level understanding.\n   - Method: Not just \"they did X\" but \"why they designed it this way, what each component does\"\n   - Experiments: Not just \"it worked\" but \"X% improvement over Y baseline under Z conditions\"\n4. **Critical perspective**: Record limitations and open questions, not just strengths.\n5. **Connections**: If you know the user's research interests, connect the paper to them.\n6. **No programming code blocks**: Do NOT use fenced code blocks (``` ```) in `sections[*].content`. Math expressions (`$$ ... $$`, `` ```latex ``) are allowed.\n7. 
**No emoji in headings**: Use numbered prefixes: `0. Metadata`, `1. One-line Summary`, etc.\n\n### Phase 3: Push to Notion\n\n```bash\npython3 scripts/push_to_notion.py \\\n  /tmp/paper_summary.json \\\n  --parent-page-id YOUR_PAGE_ID\n```\n\nOptions:\n- `--parent-page-id`: Notion page ID to create the summary under\n- `--force-update`: Overwrite existing page with same title\n- `--dry-run`: Preview without uploading\n- `--notion-key`: Explicit Notion API token\n\n## Quick Start (full agent flow)\n\n```\n1. python3 scripts/extract_paper.py -o /tmp/paper_extract.json \"https://arxiv.org/abs/...\"\n2. read /tmp/paper_extract.json (section by section)\n3. Write summary → /tmp/paper_summary.json\n4. python3 scripts/push_to_notion.py /tmp/paper_summary.json --parent-page-id PAGE_ID\n```\n\n## Configuration\n\n| Config | Source | Description |\n|--------|--------|-------------|\n| Notion API key | `NOTION_API_KEY` env or `~/.config/notion/api_key` | Required for Notion upload |\n| Parent page | `NOTION_PARENT_PAGE_ID` env or `--parent-page-id` | Notion page to create summaries under |\n\n## Notes\n\n- arXiv papers use PDF extraction (requires `pypdf`). 
Install: `pip install pypdf`\n- For very long papers (>100 pages), use `--skip-fulltext` and read HTML via `web_fetch`.\n- Notion API version: `2025-09-03`\n- The `extract_paper.py` script does NOT require a Notion API key — it only fetches and extracts.\n","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":357,"installsAllTime":0,"installsCurrent":0,"stars":2,"versions":1},"createdAt":1771921722472,"updatedAt":1778992824008},"latestVersion":{"version":"1.0.0","createdAt":1771921722472,"changelog":"- Initial release of paper-notion-summarizer.\n- Fetches paper metadata by title, arXiv, or DOI and generates a deeply structured, seminar-quality summary.\n- Summaries are written in the user's language, with technical terms preserved in their original form.\n- Includes critical analysis, key equations in LaTeX, and explicit structure for method, experiments, limitations, and more.\n- Directly uploads summary as a well-formatted Notion page under a specified parent page.\n- Designed with a clear three-phase workflow: extract, summarize, and push to Notion.","license":null},"metadata":null,"owner":{"handle":"lococaeco","userId":"s17fkv7bvn0w54bzyh2je6zf89884n6g","displayName":"lococaeco","image":"https://avatars.githubusercontent.com/u/122510668?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1779956082461}}