Paper Analysis Evidence
Purpose
Run the Scheme A evidence-enhanced paper analysis workflow: prepare paper inputs, split the paper into key sections, generate structured extraction JSON, verify the extraction against the original text, and render final reports.
This skill is based on the uploaded Dify workflow 论文分析系统_方案A_结构化证据增强版 (Paper Analysis System, Scheme A, structured-evidence-enhanced edition).
Runtime file policy
Always save runtime downloads and generated outputs under the Ubuntu desktop unless the user explicitly requests another location:
~/Desktop/paper_analysis_results/<YYYYMMDD_HHMMSS>/
Do not modify the original local paper file. Copy it into the work directory before extraction. Download URL inputs into the same batch work directory.
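A minimal sketch of this policy, assuming a Python driver (the function names are illustrative, not part of the skill's scripts):

import shutil
from datetime import datetime
from pathlib import Path

def make_batch_dir() -> Path:
    # Timestamped batch directory on the Ubuntu desktop, per the policy above.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    batch = Path.home() / "Desktop" / "paper_analysis_results" / stamp
    batch.mkdir(parents=True, exist_ok=True)
    return batch

def stage_local_paper(src: str, work_dir: Path) -> Path:
    # Copy the paper into the work directory; the original file is never modified.
    dst = work_dir / Path(src).name
    shutil.copy2(src, dst)
    return dst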
Inputs
Accept:
language: 中文 (Chinese) or 英文 (English); default to 中文 when unspecified.
paper_files: one or more local paper files, preferably PDF, DOCX, TXT, MD, or HTML.
paper_urls: one or more PDF/direct paper URLs, comma-separated or repeated.
If both local files and URLs are empty, stop with this message:
上传的文件和论文URL不能同时为空。
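A minimal sketch of this guard, assuming a Python entry point (the message means "the uploaded files and paper URLs cannot both be empty"; the argument names are illustrative):

import sys

def check_inputs(paper_files: list, paper_urls: list) -> None:
    # Refuse to start when neither local files nor URLs were provided.
    if not paper_files and not paper_urls:
        sys.exit("上传的文件和论文URL不能同时为空。")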
Workflow
1. Prepare inputs and sections
Run:
python scripts/prepare_papers.py --language 中文 --files /path/to/paper.pdf --urls "https://example.com/paper.pdf"
Use only the relevant arguments. For URL-only runs, omit --files; for local-only runs, omit --urls.
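For example, a local-only run with the default language omits --urls entirely:

python scripts/prepare_papers.py --language 中文 --files /path/to/paper.pdf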
The script creates manifest.json and one work directory per paper. It performs:
- local file copy or URL download,
- raw text extraction,
- text cleaning,
- section splitting into abstract, intro, method, experiment, conclusion, and paper_body,
- prompt file generation.
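For orientation only, one plausible shape for manifest.json; the authoritative field names come from scripts/prepare_papers.py and may differ:

{
  "language": "中文",
  "papers": [
    {
      "id": "paper_001",
      "source": "/path/to/paper.pdf",
      "work_dir": "paper_001/",
      "sections": ["abstract", "intro", "method", "experiment", "conclusion", "paper_body"]
    }
  ]
}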
2. Generate structured extraction JSON
For each paper in manifest.json, read:
prompts/01_structured_extraction_prompt.md
Send that prompt to the model and save the response, as a bare JSON object with no surrounding prose or code fences, to:
generated/structured_result.json
Required JSON fields:
{
  "title": "",
  "task": "",
  "background": "",
  "problem_statement": "",
  "method_name": "",
  "method_core": "",
  "datasets": [],
  "baselines": [],
  "metrics": [],
  "main_results": [
    {"dataset": "", "metric": "", "value": "", "baseline": "", "improvement": ""}
  ],
  "ablations": [],
  "limitations": [],
  "claims": [],
  "contributions": [],
  "evidence_spans": [
    {"field": "", "claim": "", "evidence": ""}
  ]
}
Extraction rules:
- Only use information present in, or directly inferable from, the paper.
- Prefer the corresponding sections, but fall back to the full paper_body when a section is empty or insufficient.
- Do not leave datasets, baselines, or metrics empty just because the experiment section is weak; first check paper_body, result text, implementation details, and table-neighboring text.
- Use empty strings or arrays only when the full paper text truly lacks the information.
- Provide at least 6 evidence spans. Each evidence span must be a direct quote or a very close paraphrase from the source text.
- Prioritize numeric results from experiment, results, analysis, implementation details, or table-neighboring text.
- Keep JSON keys in English. Natural-language values must use the selected output language.
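A sketch of this step, assuming a Python driver; call_model stands in for whatever model-invocation helper the runtime actually provides:

import json
from pathlib import Path

def run_extraction(work_dir: Path) -> None:
    prompt = (work_dir / "prompts" / "01_structured_extraction_prompt.md").read_text(encoding="utf-8")
    raw = call_model(prompt).strip()  # placeholder helper, not a real API
    # Strip optional markdown fences so only bare JSON is saved.
    raw = raw.removeprefix("```json").removeprefix("```").removesuffix("```").strip()
    json.loads(raw)  # fail fast if the response is not valid JSON
    out = work_dir / "generated" / "structured_result.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(raw, encoding="utf-8")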
3. Run consistency verification
Open:
prompts/02_verification_prompt_template.md
Replace {{structured_json}} with the actual content of generated/structured_result.json. Send the complete verification prompt to the model and save JSON-only output to:
generated/verification_result.json
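The substitution itself is plain string replacement; a sketch under the same assumptions as step 2:

from pathlib import Path

def build_verification_prompt(work_dir: Path) -> str:
    # Inline the saved extraction so the verifier sees the exact JSON.
    template = (work_dir / "prompts" / "02_verification_prompt_template.md").read_text(encoding="utf-8")
    structured = (work_dir / "generated" / "structured_result.json").read_text(encoding="utf-8")
    return template.replace("{{structured_json}}", structured)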
Required verification JSON:
{
  "overall_score": 0,
  "hallucination_risk": "low/medium/high",
  "issues": [
    {"field": "", "problem": "", "severity": "low/medium/high"}
  ],
  "verified_claims": [
    {"claim": "", "status": "supported/weak/unsupported", "evidence": ""}
  ],
  "final_verdict": ""
}
Verification rules:
- Score 5: nearly no hallucination, strong evidence.
- Score 4: minor imprecision.
- Score 3: several claims lack evidence.
- Score 2: clear inconsistency exists.
- Score 1: substantial hallucination or misreading.
- Focus on omitted or incorrect datasets, baselines, metrics, and main results.
- If the structured extraction uses an empty array/string for information that exists in the original paper, explicitly list that in issues.
- Provide at least 4 verified claims.
4. Render reports
After structured_result.json and verification_result.json are saved for every paper, run:
python scripts/render_report.py --manifest ~/Desktop/paper_analysis_results/<YYYYMMDD_HHMMSS>/manifest.json
Outputs per paper:
report/final_report.md
report/final_report.html
report/final_report.docx
The .md file is the editable Markdown source, the .html file is the rendered visual version, and the .docx file is the Word-compatible report.
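A quick per-paper completeness check, assuming the step-1 manifest gives each paper's work directory:

from pathlib import Path

def reports_complete(work_dir: Path) -> bool:
    # All three report formats must exist for the paper to count as rendered.
    names = ("final_report.md", "final_report.html", "final_report.docx")
    return all((work_dir / "report" / n).is_file() for n in names)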
Report structure
Chinese report sections:
- 论文题目 (paper title)
- 任务与问题 (task and problem)
- 方法概述 (method overview)
- 实验要素:数据集、基线方法、评价指标 (experimental elements: datasets, baselines, metrics)
- 主要结果 (main results)
- 贡献提炼 (contributions)
- 消融与局限性 (ablations and limitations)
- 证据片段 (evidence spans)
- 一致性校验:总评分、幻觉风险、最终结论、已核验结论、发现的问题 (consistency verification: overall score, hallucination risk, final verdict, verified claims, issues found)
English reports mirror the same section structure, with the headings rendered in English.
References
- Use references/prompt_templates.md when prompt details are needed.
- Use references/workflow_mapping.md when checking how the Dify nodes map to this skill.
- references/dify_scheme_a_source.yml preserves the uploaded Dify DSL source for auditability.