PDF Extract MD + Figures

Split a PDF (especially a scientific paper) into clean body-text Markdown plus a folder of extracted figure images. The images are saved to disk but not read into context, so no tokens are spent on irrelevant figures. Text is converted with markitdown. Figures are extracted with PyMuPDF, automatically separated into real numbered figures and junk (logos, ads, TOC), indexed in a manifest that maps each page to its Figure number, and opened one at a time only on explicit request.

Use this skill whenever the user uploads a PDF and wants to read, analyze, process, compare, or organize it; says something like 'turn this paper into markdown', '把这篇PDF拆一下' (split this PDF up), '用这个skill' (use this skill), '提取图片但先别读图' (extract the images but don't read them yet), or 'process this PDF'; or hands over a paper expecting figure-aware analysis. Trigger even if the user only says '测试一下' (test it out) or 'analyze this PDF' after uploading. Default to this split-first, read-figures-on-demand workflow rather than loading the whole PDF, images included, into context.
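To illustrate the figure-sorting and manifest step described above, here is a minimal sketch. It is not the skill's actual code: it calls neither markitdown nor PyMuPDF and instead works on image metadata and page text that are assumed to be extracted already. The names `ImageRecord`, `caption_numbers`, and `build_manifest`, the `MIN_SIDE_PX` threshold, the caption regex, and the manifest layout are all hypothetical choices made for illustration.

```python
import json
import re
from dataclasses import dataclass, asdict

# Assumption: images with a side shorter than this are treated as junk
# (logos, icons). The skill's real rules are not documented here.
MIN_SIDE_PX = 120

# Assumption: a real figure has a caption that starts a line with
# "Figure N" or "Fig. N" on the same page.
CAPTION_RE = re.compile(r"^\s*(?:Figure|Fig\.?)\s*(\d+)", re.IGNORECASE | re.MULTILINE)


@dataclass
class ImageRecord:
    page: int    # 1-based page number the image was found on
    file: str    # path of the saved image file
    width: int   # pixel width
    height: int  # pixel height


def caption_numbers(page_text):
    """Return figure numbers whose captions start a line, in order of appearance."""
    seen = []
    for match in CAPTION_RE.finditer(page_text):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def build_manifest(images, page_texts):
    """Split images into numbered figures and junk, recording page and figure number."""
    manifest = {"figures": [], "junk": []}
    # Caption numbers still unassigned on each page.
    remaining = {page: caption_numbers(text) for page, text in page_texts.items()}
    for img in images:
        entry = asdict(img)
        numbers = remaining.get(img.page, [])
        too_small = min(img.width, img.height) < MIN_SIDE_PX
        if too_small or not numbers:
            manifest["junk"].append(entry)
        else:
            # Pair the image with the next unused caption on its page.
            entry["figure"] = numbers.pop(0)
            manifest["figures"].append(entry)
    return manifest


images = [
    ImageRecord(page=1, file="p1_img1.png", width=64, height=64),    # logo-sized
    ImageRecord(page=3, file="p3_img1.png", width=900, height=600),
]
page_texts = {1: "Journal of Examples", 3: "Figure 2. Accuracy versus depth."}
manifest = build_manifest(images, page_texts)
print(json.dumps(manifest, indent=2))
```

Only the manifest needs to enter context. An individual image file would be opened later, by its `file` path, when the user asks about a specific figure.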

Install

openclaw skills install @xjhveteran199-bit/pdf-extract-md-figs