Install
openclaw skills install @zdywrnm/paperbanana-dashscope
Generate academic figures and scientific diagrams from paper text using a multi-agent pipeline powered by Alibaba Cloud DashScope (Qwen-VL + Wanxiang/Qwen-Image). Use when the user wants to create figures for research papers, visualize methods sections, generate architecture diagrams, or produce illustrations for academic content. Supports diagram and plot tasks with multi-round critic refinement.
Native TypeScript CLI for generating academic figures from paper text. Zero Python dependencies. Powered by Alibaba Cloud DashScope.
npm install -g paperbanana-dashscope
paperbanana-dashscope --version
You must configure a DashScope API key before generating figures. Check the current status:
paperbanana-dashscope info
If no API key is configured, set one of:
# Option 1: Environment variable (simplest)
export OPENAI_API_KEY="sk-xxx"
# Option 2: Global config file
mkdir -p ~/.paperbanana-dashscope
cat > ~/.paperbanana-dashscope/config.yaml << 'YAML'
defaults:
  main_model_name: "qwen-vl-max"
  image_gen_model_name: "wanx2.1-t2i-turbo"
api_keys:
  openai_api_key: "sk-xxx"
YAML
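Before running a generation, it can help to confirm which of the two options above is actually in place. The following is a minimal sketch, not part of the CLI: `pb_key_source` is a hypothetical helper name, and it only assumes the environment variable and config path shown above. Which source the CLI prefers when both are set is not stated here, so the env-first order below is an assumption.

```shell
# Report where a DashScope key appears to be configured: env, config, or none.
# Assumption: the CLI reads OPENAI_API_KEY or ~/.paperbanana-dashscope/config.yaml;
# the env-first order used here is a guess, not documented behavior.
pb_key_source() {
  config_file="${HOME}/.paperbanana-dashscope/config.yaml"
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "env"
  elif [ -f "$config_file" ] &&
       grep -Eq '^[[:space:]]*openai_api_key:[[:space:]]*[^[:space:]]' "$config_file"; then
    # The file has an openai_api_key entry with something after the colon.
    echo "config"
  else
    echo "none"
  fi
}
```

Calling `pb_key_source` prints a single word, which makes it easy to use in a script guard. `paperbanana-dashscope info` remains the authoritative check.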
Generate a single figure from text:
paperbanana-dashscope generate \
--content "Method section text describing the architecture..." \
--caption "Figure 1: System overview" \
--output ~/Downloads/figure.png \
--num-candidates 1
| Option | Description | Default |
|---|---|---|
| --content <text> | Paper text describing the method | required |
| --caption <text> | Figure caption | required |
| --output <path> | Output PNG file path | required |
| --task <type> | diagram or plot | diagram |
| --num-candidates <n> | Number of candidates to generate | 1 |
| --max-critic-rounds <n> | Critic refinement iterations | 3 |
| --aspect-ratio <ratio> | 1:1, 16:9, 4:3, 21:9, etc. | 21:9 |
| --main-model-name <id> | VLM for planning/critic | qwen-vl-max |
| --image-gen-model-name <id> | Image generation model | wanx2.1-t2i-turbo |
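When the CLI is called from a script, a cheap pre-check of the values in the table above can catch typos before any API call is made. The sketch below is illustrative only: `pb_check_opts` is a hypothetical helper, and the W:H pattern is an assumption about what a ratio looks like, since the CLI may accept only a fixed set of ratios.

```shell
# Validate --task and --aspect-ratio values before calling the CLI.
# Returns 0 when both look valid, 1 otherwise.
pb_check_opts() {
  task="$1"
  ratio="$2"
  case "$task" in
    diagram|plot) ;;  # the two task types listed in the option table
    *) echo "unknown task: $task" >&2; return 1 ;;
  esac
  # Accept only W:H with positive integers (assumed format).
  if ! printf '%s\n' "$ratio" | grep -Eq '^[1-9][0-9]*:[1-9][0-9]*$'; then
    echo "bad aspect ratio: $ratio" >&2
    return 1
  fi
  return 0
}
```

For example, `pb_check_opts diagram 16:9 && paperbanana-dashscope generate ...` runs the generation only if the check passes.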
DashScope offers text-to-image models in the following groups:
Wanxiang legacy (fast, cheap):
- wanx2.1-t2i-turbo (default, fastest)
- wanx2.1-t2i-plus (better quality)
Wanxiang 2.7 (latest, highest quality):
- wan2.7-image-pro (professional, supports 4K output in text-to-image)
- wan2.7-image (standard, supports up to 2K, same pricing as wan2.6)
Wanxiang 2.x (previous generation):
- wan2.6-t2i (flagship of 2.6 series)
- wan2.5-t2i-preview
- wan2.2-t2i-flash / wan2.2-t2i-plus
Qwen-Image (best for figures with text labels):
- qwen-image-plus (recommended for diagrams with English/Chinese labels)
- qwen-image-max (top-tier text rendering)
Switch models inline:
paperbanana-dashscope generate \
--content "..." \
--caption "..." \
--image-gen-model-name wan2.6-t2i \
--output figure.png
Use --exp-mode to control which agents run:
| Mode | Agents | Use case |
|---|---|---|
| vanilla | Vanilla only | Fastest, no refinement |
| dev_planner | Planner only | Just generate description |
| dev_planner_critic | Planner + Critic | With refinement loop |
| dev_full | Planner + Stylist + Visualizer + Critic | Full pipeline |
| demo_full | Same as dev_full + retriever | Default, best quality |
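To see how much the extra agents change the result, one option is to run the same input through several modes and compare the outputs side by side. The sketch below assumes only the `generate` flags documented above. `pb_compare_modes` and the `PB_CMD` override are hypothetical names introduced for illustration. Modes that stop before image generation (such as dev_planner) may not write a PNG at all.

```shell
# Run the same input through several --exp-mode values, one output file each.
# Set PB_CMD=echo to print the commands instead of running them (dry run).
pb_compare_modes() {
  content="$1"; caption="$2"; outdir="$3"
  shift 3  # the remaining arguments are the mode names
  for mode in "$@"; do
    ${PB_CMD:-paperbanana-dashscope} generate \
      --content "$content" --caption "$caption" \
      --output "$outdir/$mode.png" --exp-mode "$mode"
  done
}
```

A dry run such as `PB_CMD=echo pb_compare_modes "..." "Figure 1" ./out vanilla demo_full` prints one command per mode without touching the API.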
Quick draft (fast, low cost):
paperbanana-dashscope generate \
--content "..." \
--caption "..." \
--output draft.png \
--exp-mode vanilla \
--image-gen-model-name wanx2.1-t2i-turbo
High-quality figure for paper submission:
paperbanana-dashscope generate \
--content "..." \
--caption "..." \
--output paper_fig.png \
--image-gen-model-name wan2.6-t2i \
--num-candidates 3 \
--max-critic-rounds 5
Figure with English/Chinese text labels:
paperbanana-dashscope generate \
--content "..." \
--caption "..." \
--output labeled.png \
--image-gen-model-name qwen-image-plus
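The three recipes above differ only in a few flags, so they can be folded into a small wrapper. This is a sketch under stated assumptions: the preset names (draft, paper, labels), the function names, and the `PB_CMD` override are all hypothetical and not part of the CLI. Only the flags themselves come from the recipes above.

```shell
# Print the extra flags for one of the three recipes.
# Preset names are illustrative; they are not CLI options.
pb_preset_flags() {
  case "$1" in
    draft)  echo "--exp-mode vanilla --image-gen-model-name wanx2.1-t2i-turbo" ;;
    paper)  echo "--image-gen-model-name wan2.6-t2i --num-candidates 3 --max-critic-rounds 5" ;;
    labels) echo "--image-gen-model-name qwen-image-plus" ;;
    *) echo "unknown preset: $1" >&2; return 1 ;;
  esac
}

# Run a generate call for a preset, or print it when PB_CMD=echo.
pb_generate() {
  preset="$1"; content="$2"; caption="$3"; output="$4"
  flags="$(pb_preset_flags "$preset")" || return 1
  # $flags is intentionally unquoted so it splits into separate arguments.
  ${PB_CMD:-paperbanana-dashscope} generate \
    --content "$content" --caption "$caption" --output "$output" $flags
}
```

For example, `pb_generate paper "Method section text..." "Figure 1: System overview" paper_fig.png` expands to the high-quality recipe.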
To check the configuration, run paperbanana-dashscope info and follow the configuration guide.
To update the CLI to the latest version, run npm update -g paperbanana-dashscope.