Paperbanana

v0.1.1

Generate publication-quality academic diagrams, methodology figures, architecture illustrations, and statistical plots from text descriptions using the Paper...

by Bennett (@goatinahat)
Security Scan
  • VirusTotal: Benign
  • OpenClaw: Benign (high confidence)

Purpose & Capability

Name and description match the included scripts and README. The skill requires an LLM/VLM provider API key (Gemini/OpenAI/OpenRouter) and the 'uv' binary to run the packaged Python scripts — both reasonable for an on-demand diagram/plot generation skill. The declared primary credential (GOOGLE_API_KEY) fits the documented auto-detection priority (Gemini → OpenAI → OpenRouter).

Instruction Scope

Runtime instructions and scripts explicitly read user-provided inputs (text files, CSV/JSON, image paths) and send them to external LLM/VLM providers for planning, image generation, and evaluation. Generated images may also be sent back to the provider for VLM-based evaluation. This is documented in SKILL.md and is coherent with the stated purpose, but it means any data you pass (including files you point to) will be transmitted to third-party APIs.

Install Mechanism

There is no registry install spec; the skill relies on 'uv' to create an isolated environment and install the PyPI package 'paperbanana[all-providers]'. Using PyPI for the package is expected. The README suggests installing 'uv' via a curl | sh one-liner (a remote install script) — that is common but carries the usual remote-install risks; verify the 'uv' install script and the PyPI project before running.

Credentials

The skill requests provider API keys (GOOGLE_API_KEY, OPENAI_API_KEY, OPENROUTER_API_KEY), which are necessary for the LLM/VLM and image-generation work it performs. No unrelated credentials, secrets, or system config paths are requested. Minor metadata mismatch: the registry lists 'Required env vars: none' while primaryEnv is set to GOOGLE_API_KEY and SKILL.md says at least one provider key is required — a documentation inconsistency, not a functional one.

Persistence & Privilege

The skill is not always-enabled and does not request elevated or persistent system privileges. It writes transient output under /tmp and does not modify other skills or system-wide configs. API keys are read from the environment/config and are not persisted by the skill.
Assessment

This skill appears internally consistent and implements the advertised workflow, but keep the following in mind:

  • It sends whatever text, CSV/JSON, and images you provide to external LLM/VLM/image APIs — don't use it with sensitive or proprietary data unless your policy allows it.
  • It relies on 'uv' and the PyPI package paperbanana[all-providers] — verify the PyPI project and the GitHub repos linked in SKILL.md/README before installing.
  • The README suggests installing 'uv' via a curl | sh command — review that script before running it.
  • Provide a provider API key with appropriate billing/permissions, and consider using a dedicated key/account for this skill to limit blast radius.

The small documentation inconsistency about 'required env vars' is minor but worth noting.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

  • Runtime: 🍌 Clawdis
  • Bins: uv
  • Primary env: GOOGLE_API_KEY
  • Latest: vk97exn4bafrqqahv5thyhahq7n8228b6
  • 317 downloads · 0 stars · 2 versions
  • Updated 1mo ago
  • v0.1.1 · MIT-0

PaperBanana — Academic Illustration Generator

Generate publication-quality academic diagrams and statistical plots from text descriptions. Uses a multi-agent pipeline (Retriever → Planner → Stylist → Visualizer → Critic) with iterative refinement.
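The agent loop described above can be sketched roughly as follows. This is an illustrative sketch only: the stage names come from the pipeline description, but the function shapes, dictionary layout, and early-stop behavior are assumptions, not the package's real API.

```python
# Hedged sketch of the documented pipeline
# (Retriever -> Planner -> Stylist -> Visualizer -> Critic)
# with iterative refinement. Not PaperBanana's actual code.
STAGES = ["retriever", "planner", "stylist", "visualizer"]

def run_pipeline(context, caption, critic, iterations=3):
    artifact = {"context": context, "caption": caption}
    for round_num in range(iterations):
        # Each stage enriches the working artifact in order.
        for stage in STAGES:
            artifact[stage] = f"{stage} output (round {round_num + 1})"
        feedback = critic(artifact)
        if feedback is None:  # critic satisfied: stop early
            break
        artifact["feedback"] = feedback  # feed critique into next round
    return artifact
```

The `--iterations` and `--auto-refine` options documented below map naturally onto the loop bound and the critic's early-stop decision in this sketch.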

Quick Reference

Generate a Diagram

uv run {baseDir}/scripts/generate.py \
  --context "Our framework consists of an encoder module that processes..." \
  --caption "Overview of the proposed encoder-decoder architecture"

Or from a file:

uv run {baseDir}/scripts/generate.py \
  --input /path/to/method_section.txt \
  --caption "Overview of the proposed method"

Options:

  • --iterations N — refinement rounds (default: 3)
  • --auto-refine — loop until critic is satisfied (use for final quality)
  • --aspect RATIO — aspect ratio: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9, 21:9
  • --provider gemini|openai|openrouter — override auto-detected provider
  • --format png|jpeg|webp — output format (default: png)
  • --no-optimize — disable input optimization (on by default)

Generate a Plot

uv run {baseDir}/scripts/plot.py \
  --data '{"model":["GPT-4","Claude","Gemini"],"accuracy":[92.1,94.3,91.8]}' \
  --intent "Bar chart comparing model accuracy across benchmarks"

Or from a CSV file:

uv run {baseDir}/scripts/plot.py \
  --data-file /path/to/results.csv \
  --intent "Line plot showing training loss over epochs"
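When the data lives in a Python session rather than a file, the inline `--data` argument can be built from a dict. The helper below is hypothetical (not part of PaperBanana); it simply produces compact JSON of the column-to-values shape shown in the example above.

```python
import json

# Hypothetical helper: serialize {column: values} into the compact JSON
# string passed to plot.py via --data. Compact separators avoid spaces
# that would otherwise need extra shell quoting care.
def data_arg(columns: dict) -> str:
    return json.dumps(columns, separators=(",", ":"))

arg = data_arg({"model": ["GPT-4", "Claude", "Gemini"],
                "accuracy": [92.1, 94.3, 91.8]})
```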

Evaluate a Diagram

uv run {baseDir}/scripts/evaluate.py \
  --generated /path/to/generated.png \
  --reference /path/to/human_drawn.png \
  --context "The methodology section text..." \
  --caption "Overview of the framework"

Returns scores on: Faithfulness, Readability, Conciseness, Aesthetics.
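If you want to post-process the four rubric dimensions, a small container like the one below can help. Treat it as an assumption: the field names mirror the rubric above, but evaluate.py's actual output format is not specified here.

```python
from dataclasses import dataclass

# Illustrative container for the four rubric dimensions; not the
# script's real output type.
@dataclass
class EvalScores:
    faithfulness: float
    readability: float
    conciseness: float
    aesthetics: float

    def mean(self) -> float:
        # Unweighted average across the four dimensions.
        return (self.faithfulness + self.readability
                + self.conciseness + self.aesthetics) / 4

scores = EvalScores(faithfulness=4.0, readability=3.5,
                    conciseness=4.5, aesthetics=4.0)
```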

Refine a Previous Diagram

uv run {baseDir}/scripts/generate.py \
  --continue \
  --feedback "Make the arrows thicker and use more distinct colors"

Or continue a specific run:

uv run {baseDir}/scripts/generate.py \
  --continue-run run_20260228_143022_a1b2c3 \
  --feedback "Add labels to each component box"

Setup

The skill auto-installs paperbanana on first use via uv (isolated, no global install). The package is published on PyPI by the llmsresearch team.

Required API keys: This skill requires at least one of the following API keys to function. Configure in ~/.openclaw/openclaw.json:

Env Variable         Provider        Cost                  Notes
GOOGLE_API_KEY       Google Gemini   Free tier available   Recommended starting point
OPENAI_API_KEY       OpenAI          Paid                  Best quality (gpt-5.2 + gpt-image-1.5)
OPENROUTER_API_KEY   OpenRouter      Paid                  Access to any model

{
  skills: {
    entries: {
      "paperbanana": {
        env: {
          // Option A: Google Gemini (free tier — recommended)
          GOOGLE_API_KEY: "AIza...",

          // Option B: OpenAI (paid, best quality)
          // OPENAI_API_KEY: "sk-...",

          // Option C: OpenRouter (paid, access to any model)
          // OPENROUTER_API_KEY: "sk-or-...",
        }
      }
    }
  }
}

Auto-detection priority: Gemini (free) → OpenAI → OpenRouter. The skill will exit with a clear error if no API key is found.
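The documented detection order can be sketched as a simple first-match scan over environment variables. This is a hedged illustration of the behavior described above, not the package's actual code:

```python
import os

# Documented priority: Gemini (free) -> OpenAI -> OpenRouter.
PRIORITY = [
    ("gemini", "GOOGLE_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]

def detect_provider(env=None):
    env = os.environ if env is None else env
    for provider, var in PRIORITY:
        if env.get(var):
            return provider
    # Mirrors the documented "clear error if no API key is found".
    raise SystemExit("No provider API key found: set GOOGLE_API_KEY, "
                     "OPENAI_API_KEY, or OPENROUTER_API_KEY.")
```

Note that in this scheme a Gemini key wins even when an OpenAI key is also set; use `--provider` to override.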

Provider Details

For provider comparison, model options, and advanced configuration: see {baseDir}/references/providers.md

Privacy & Data Handling

This skill sends user-provided data to external third-party APIs for diagram generation and evaluation:

  • Text content (context descriptions, captions, feedback) is sent to the configured LLM provider (Gemini, OpenAI, or OpenRouter) for planning and code generation.
  • Generated images may be sent back to the LLM provider for VLM-based evaluation and refinement.
  • CSV/JSON data provided for plot generation is sent to the LLM provider for Matplotlib code generation.

Do not use this skill with sensitive, confidential, or proprietary data unless your organization's data policies permit sending that data to the configured provider. All API calls go directly to the provider's endpoints — no intermediate servers are involved.

API keys are injected by OpenClaw from your local config (~/.openclaw/openclaw.json) and are never logged or transmitted beyond the provider's API.

Dependencies & Provenance

Behavior Notes

  • Input optimization is ON by default — enriches context and sharpens captions before generation. Disable with --no-optimize for speed.
  • Generation takes 1-5 minutes depending on iterations and provider. The script prints progress.
  • Output is delivered automatically via the MEDIA: protocol — no manual file handling needed.
  • Run continuation is the natural way to iterate: "make it better" → --continue --feedback "...".
  • Gemini free tier has rate limits (~15 RPM). Keep iterations ≤ 3 on free tier.
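Given the free-tier limit in the last note, a client-side throttle that spaces requests evenly is one way to stay under quota. This is an illustrative sketch, not part of PaperBanana; the clock and sleep hooks exist only to make the behavior testable.

```python
import time

class RateLimiter:
    """Allow at most `rpm` calls per minute by spacing calls evenly."""

    def __init__(self, rpm=15, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm  # seconds between calls
        self.clock = clock
        self.sleep = sleep
        self.last = None

    def wait(self):
        # Block until at least min_interval has passed since the last call.
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
        self.last = self.clock()
```

Calling `limiter.wait()` before each provider request keeps a 15 RPM budget without tracking quota windows explicitly.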
