Install
openclaw skills install paper-ingest-normalizer
Normalize papers, PDFs, URLs, and literature notes into structured research records for project memory and retrieval. Use when: (1) a new paper, PDF, DOI, or article enters the system, (2) literature formats are inconsistent, (3) a researcher needs standardized extraction, (4) project memory needs clean paper records. Triggered by requests such as "read this paper", "ingest this PDF", "normalize this literature", or "整理这篇文献" ("organize this paper"), or whenever raw literature needs to become structured project memory.
Convert raw literature inputs into standardized records safe for project memory, paper databases, and downstream synthesis pipelines.
One of the following is required:
- pdf_path: local path to a PDF file
- url: link to the paper or article
- raw_text: extracted or pasted text
- metadata_blob: existing metadata dict

Plus:
- project_id: required for any writeback
- source_type: one of pdf, doi, url, text, metadata
- tags (optional): list of strings for categorization

Return a structured object:
title: string
authors: string[] | null
year: number | null
source: string # journal, conference, preprint, etc.
doi_or_url: string | null
project_id: string
paper_type: string # experimental, theoretical, review, etc.
material_system: string | null # e.g. "perovskite solar cell", "graphene FET"
device_type: string | null # e.g. "FTO/glass", "flexible substrate"
key_variables: string[] | null # independent variables studied
key_metrics: string[] | null # measured outcomes (PCE, mobility, etc.)
core_findings: string # 2-3 sentence neutral summary
claimed_mechanism: string | null
limitations: string | null
normalized_summary: string # 1-2 paragraph structured summary
uncertain_fields: string[] | null # fields that could not be verified
writeback_ready: boolean # true only if key identity fields present
writeback_payload: object # the record to write into project memory
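
The input requirements that precede the record schema can be checked before any parsing starts. The following is a minimal sketch, not the skill's actual implementation: the `validate_request` helper, the plain-dict request shape, and the `tags` key name are assumptions made for illustration. It also assumes that "one of the following is required" means at least one source field, and that `source_type` is mandatory.

```python
from typing import Any

# Source inputs and source_type values as listed in the input section.
SOURCE_FIELDS = ("pdf_path", "url", "raw_text", "metadata_blob")
SOURCE_TYPES = {"pdf", "doi", "url", "text", "metadata"}


def validate_request(request: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means usable)."""
    problems: list[str] = []

    # Assumption: at least one source input must be present and non-empty.
    if not any(request.get(field) for field in SOURCE_FIELDS):
        problems.append("one of " + ", ".join(SOURCE_FIELDS) + " is required")

    # project_id is required for any writeback.
    if not request.get("project_id"):
        problems.append("project_id is required")

    # Assumption: source_type is mandatory and must be a listed value.
    if request.get("source_type") not in SOURCE_TYPES:
        problems.append(
            "source_type must be one of " + ", ".join(sorted(SOURCE_TYPES))
        )

    # tags are optional, but when given they must be a list of strings.
    tags = request.get("tags")
    if tags is not None and not (
        isinstance(tags, list) and all(isinstance(t, str) for t in tags)
    ):
        problems.append("tags must be a list of strings")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every missing input in a single pass.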
- Use null for missing fields; list them in uncertain_fields.
- Keep core_findings and normalized_summary grounded in what the text actually says.
- If writeback_ready = false, list explicitly which fields are missing and why.

For PDFs, use the summarize skill or pdfplumber/PyMuPDF to extract text before processing.
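
As an illustration of the PDF text-extraction step, here is a minimal sketch using pdfplumber. The function names `join_page_texts` and `extract_pdf_text` are hypothetical, and pdfplumber is a third-party package that must be installed separately; the skill itself may extract text differently.

```python
from typing import Iterable, Optional


def join_page_texts(page_texts: Iterable[Optional[str]]) -> str:
    """Join per-page text, skipping pages that produced no text."""
    cleaned = [text.strip() for text in page_texts if text and text.strip()]
    return "\n\n".join(cleaned)


def extract_pdf_text(pdf_path: str) -> str:
    """Hypothetical helper: extract plain text from a PDF with pdfplumber."""
    # Imported lazily so join_page_texts stays usable without pdfplumber.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None or "" for pages without a text layer.
        return join_page_texts(page.extract_text() for page in pdf.pages)
```

A scanned PDF without a text layer would yield an empty string here. In that case it seems reasonable to treat the text-derived fields as undetermined rather than guess at them.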
Set writeback_ready based on the presence of key identity fields.

If parsing is incomplete:
- Populate uncertain_fields with the list of fields that could not be determined
- Set writeback_ready = false when title, authors, or year is missing

For synthesis after normalization, see the research skill for paper synthesis workflows.
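
The incomplete-parsing rules (null for missing fields, a populated uncertain_fields list, and writeback_ready = false when title, authors, or year is missing) can be sketched as follows. The `finalize_record` and `is_missing` helpers are hypothetical. The sketch assumes the record is a plain dict and that empty strings and empty lists count as missing, and it only detects absent values, not values that are present but unverified.

```python
from typing import Any

# Identity fields named in the readiness rule.
IDENTITY_FIELDS = ("title", "authors", "year")
# Fields marked "| null" in the record schema.
NULLABLE_FIELDS = (
    "authors", "year", "doi_or_url", "material_system", "device_type",
    "key_variables", "key_metrics", "claimed_mechanism", "limitations",
)


def is_missing(value: Any) -> bool:
    """Assumption: None, empty strings, and empty lists all count as missing."""
    return value is None or value == "" or value == []


def finalize_record(record: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: fill uncertain_fields and writeback_ready."""
    out = dict(record)
    # dict.fromkeys keeps order and removes the overlap between the two tuples.
    tracked = dict.fromkeys(IDENTITY_FIELDS + NULLABLE_FIELDS)
    uncertain = [name for name in tracked if is_missing(out.get(name))]

    # Use null for missing nullable fields.
    for name in uncertain:
        if name in NULLABLE_FIELDS:
            out[name] = None

    out["uncertain_fields"] = uncertain or None
    out["writeback_ready"] = not any(
        name in uncertain for name in IDENTITY_FIELDS
    )
    return out
```

For example, a record with a title and year but an empty authors list comes back with `authors` set to null, `authors` listed in `uncertain_fields`, and `writeback_ready` set to false.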