Paper2diagram

v1.0.2

Paper PDF → method/structure extraction → academic-review-style summary → multiple paper-style figures (backed by the Gemini + nano_banana gateway).


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for qbc-oio/paper2diagram.

Install the skill "Paper2diagram" (qbc-oio/paper2diagram) from ClawHub.
Skill page: https://clawhub.ai/qbc-oio/paper2diagram
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: GEMINI_API_KEY, BANANA_PRO_API_KEY
Required binaries: python3
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install paper2diagram

ClawHub CLI

npx clawhub@latest install paper2diagram

Security Scan

VirusTotal: Pending
OpenClaw: Benign (medium confidence)

Purpose & Capability
Name/description match what the skill requests: it needs a Gemini API key and a Banana/nano_banana key to call LLM/image gateways and python3 to run a local workflow. Those credentials are proportional to producing summaries and images.
Instruction Scope
SKILL.md instructs the agent to read a local PDF and run a local Python module (python -m app.openclaw_main ...). That stays within the stated purpose, but the skill is instruction-only and expects you to clone a separate repo containing app.openclaw_main; without that repo the command will fail. The doc also suggests sending the PDF to the configured gateway(s) — you must trust those gateways before uploading sensitive PDFs.
Install Mechanism
No install spec (instruction-only) — nothing will be written by the skill itself. The README instructs you to clone an external repository and install dependencies locally; that is an explicit user action rather than an automated install.
Credentials
Declared required env vars are GEMINI_API_KEY and BANANA_PRO_API_KEY, which align with the stated integrations. SKILL.md additionally documents several optional env vars (GEMINI_BASE_URL, GEMINI_MODEL, BANANA_PRO_BASE_URL, BANANA_MODEL, ENABLE_BANANA) that are not listed in requires.env — a mild inconsistency. No unrelated credentials or config paths are requested.
Persistence & Privilege
always is false and the skill does not request persistent system-wide privileges. It does instruct running local Python code and writing images to an outputs/ directory, which is expected and limited to its own scope.
Assessment
This skill is an instruction-only wrapper that expects you to host or clone the actual workflow repository and to supply Gemini and Banana API keys. Before using it: (1) inspect and clone the referenced repo locally and review its code; (2) run it in an isolated environment (virtualenv/container); (3) only point GEMINI_BASE_URL / BANANA_PRO_BASE_URL to gateways you control or trust — the skill will send PDFs to whatever gateway you configure; (4) prefer local outputs/ copies of generated images rather than relying only on external temporary URLs; (5) limit API key permissions and rotate keys if you test with sensitive data. The skill appears coherent with its purpose, but you should verify the external repo and gateway behavior before running on confidential PDFs.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🧠 Clawdis
Bins: python3
Env: GEMINI_API_KEY, BANANA_PRO_API_KEY
Primary env: GEMINI_API_KEY
latest: vk975n1dzw4442fs8ztpzn1qm0984fksq
92 downloads
1 star
3 versions
Updated 2w ago
v1.0.2
MIT-0

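Introduction

Paper2diagram is a "paper analysis and visualization" skill for researchers. It suits method and architecture papers, especially in CV, medical imaging, and representation learning.

It helps the agent:

  • read a local paper PDF;
  • focus automatically on the Method / Architecture sections;
  • extract the backbone network structure and the training pipeline;
  • produce a structured summary in the voice of a senior academic reviewer (research background / core innovations / experimental conclusions / limitations, and so on);
  • call the nano_banana image model behind the gateway to generate several paper-style figures (background figure, method figure, innovation schematic, bar chart of experimental results, and so on) and save them in the outputs/ directory.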

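Quick start (example for end users)

Once your agent has this skill enabled, you can say directly in the conversation:

Please analyze this paper /absolute/path/to/paper.pdf, summarize the research background, method, and innovations in the style of an academic review, and automatically draw the architecture diagram and the experiment comparison chart in the style of a medical imaging paper.

The agent is expected to:

  1. call the local Python workflow on the specified PDF;
  2. output a structured summary: research background / core innovations / method and architecture / experimental conclusions / limitations;
  3. generate 3–5 figures (main method architecture, innovation callout, bar chart of experimental results, and so on) and return:
    • online image links (returned by your gateway);
    • local save paths (outputs/<paper name>__fig*.jpg), ready to drop into a report or slides.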
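
The local naming scheme mentioned above (outputs/<paper name>__fig*.jpg) can be collected with a short helper. This is a minimal sketch, not part of the skill: `find_figures` and its parameters are hypothetical names, and it assumes the generated files really follow that pattern.

```python
import glob
from pathlib import Path


def find_figures(outputs_dir: str, paper_name: str) -> list[str]:
    """Return the saved figures for one paper, sorted by file name.

    ASSUMPTION: figures are named "<paper_name>__fig*.jpg" inside
    outputs_dir, as described in the quick-start text.
    """
    # Escape glob metacharacters that may appear in a paper title.
    pattern = glob.escape(paper_name) + "__fig*.jpg"
    return sorted(str(p) for p in Path(outputs_dir).glob(pattern))
```

Sorting is lexicographic, which is enough for the 3–5 figures the skill is expected to produce.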

Environment and dependencies

  • Required: Python 3 (the command is python3)
  • Network access required: the skill connects to the LLM / image gateway you configure yourself (for example, the dongli gateway)
  • Environment variables (set them in OpenClaw under skills.entries.paper2diagram.env, or in your shell):
    • GEMINI_API_KEY
    • GEMINI_BASE_URL (example: https://api.dongli.work/v1beta)
    • GEMINI_MODEL (example: gemini-3-pro)
    • BANANA_PRO_API_KEY
    • BANANA_PRO_BASE_URL (example: https://api.dongli.work)
    • BANANA_MODEL (example: nano_banana_pro-1K)
    • ENABLE_BANANA=true
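
Before running the workflow, it can help to confirm that the variables above are present. The sketch below is illustrative and not part of the skill: the helper names are hypothetical, and the required/optional split follows the security scan on this page (GEMINI_API_KEY and BANANA_PRO_API_KEY are declared as required, the rest as optional).

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = ("GEMINI_API_KEY", "BANANA_PRO_API_KEY")
OPTIONAL_VARS = (
    "GEMINI_BASE_URL",
    "GEMINI_MODEL",
    "BANANA_PRO_BASE_URL",
    "BANANA_MODEL",
    "ENABLE_BANANA",
)


def missing_required(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """List required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def describe_env(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Report each variable as "set" or "unset" without exposing key values."""
    env = os.environ if env is None else env
    return {
        name: ("set" if env.get(name) else "unset")
        for name in REQUIRED_VARS + OPTIONAL_VARS
    }
```

Calling `missing_required()` with no argument checks the current process environment; passing a dict makes the check easy to test.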

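Security note: this skill sends the PDF you provide only through the gateway you configure yourself (Gemini + nano_banana) and does not upload it to any other third-party service.
Use this skill only with gateways you trust and in a private environment, and read the source code before using it.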
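
One way to act on this advice is to check the configured base URLs against a list of hosts you trust before any PDF is sent. The sketch below is an assumption-laden illustration, not part of the skill: `is_trusted_gateway` and the `trusted_hosts` argument are hypothetical, and requiring https is a design choice made here rather than something the skill enforces.

```python
from urllib.parse import urlparse


def is_trusted_gateway(base_url: str, trusted_hosts: set[str]) -> bool:
    """Return True only for https URLs whose host is explicitly trusted."""
    parsed = urlparse(base_url)
    # hostname is lower-cased by urlparse and is None for malformed URLs.
    return parsed.scheme == "https" and parsed.hostname in trusted_hosts
```

For example, with `trusted_hosts={"api.dongli.work"}`, the example value of GEMINI_BASE_URL from this page passes, while the same host over plain http or any other host is rejected.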

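Local deployment (developers / self-hosting)

ClawHub hosts only the skill description and metadata. The actual workflow logic is implemented in this repository, so you need to pull the code locally:

  1. Clone the project and install the dependencies:
    git clone <YOUR_REPO_URL> paper2diagram-agent
    cd paper2diagram-agent
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
    cp .env.example .env  # fill in the keys as needed
  2. Confirm that the keys and gateway addresses in .env are correct and working.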

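Typical call flow in OpenClaw

  1. The agent receives the user's instruction (see the example under "Quick start").
  2. Through this skill's tool, the agent calls:
    python -m app.openclaw_main local "<ABSOLUTE_PATH_TO_PDF>" 30
  3. The tool returns:
    • paper_analysis: the structured summary of the paper;
    • final_prompt: the English prompt generated for the image model;
    • render_results[]: the online link and local_image_path for each figure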
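
The call flow above could be wrapped as follows. This is a minimal sketch under stated assumptions: the source does not say how the tool returns its result, so reading one JSON object from stdout is a guess, and `build_command`, `run_workflow`, and `local_image_paths` are hypothetical names. The meaning of the trailing 30 is not documented on this page, so it is passed through unchanged.

```python
import json
import subprocess
import sys
from pathlib import Path


def build_command(pdf_path: str, extra_arg: int = 30) -> list[str]:
    """Build the argv for the documented invocation."""
    pdf = Path(pdf_path)
    if not pdf.is_absolute():
        raise ValueError("the workflow expects an absolute PDF path")
    # sys.executable stands in for "python" so the active venv is used.
    return [sys.executable, "-m", "app.openclaw_main", "local", str(pdf), str(extra_arg)]


def run_workflow(pdf_path: str, extra_arg: int = 30) -> dict:
    """Run the workflow from the repository root and parse its output.

    ASSUMPTION: the tool prints a single JSON object on stdout.
    """
    proc = subprocess.run(
        build_command(pdf_path, extra_arg),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)


def local_image_paths(result: dict) -> list[str]:
    """Collect local_image_path from each render_results entry that has one."""
    return [
        item["local_image_path"]
        for item in result.get("render_results", [])
        if item.get("local_image_path")
    ]
```

`run_workflow` only works inside the cloned repository with its dependencies installed; the other two helpers have no such requirement.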

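Other notes

  • Image links may be short-lived temporary URLs, so prefer the local images under the outputs/ directory.
  • Errors such as 403 or 503 are most likely related to gateway quota, permissions, or the model name configuration. The skill itself does not bypass the gateway's security policy.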
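
The first note above can be expressed as a small selection rule: use the local file when it exists, otherwise fall back to the online link. This is a sketch only. The key that holds the online link is not named in the source, so `url_key="url"` is a hypothetical default, as is the helper name `preferred_image`.

```python
from pathlib import Path
from typing import Optional


def preferred_image(item: dict, url_key: str = "url") -> Optional[str]:
    """Pick the local image if it exists on disk, else the online link."""
    local = item.get("local_image_path")
    if local and Path(local).is_file():
        return local
    # ASSUMPTION: the online link is stored under url_key.
    return item.get(url_key)
```

Each `item` here is one entry of render_results; the function returns None when neither a local file nor a link is available.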
