llmsresearch/paperbanana
Open-source implementation and extension of Google Research’s PaperBanana for automated academic figures, diagrams, and research visuals, expanded to new domains such as slide generation.
Uses a two-phase multi-agent pipeline with iterative refinement. An optional input optimization stage first enriches the methodology text and sharpens captions; a linear planning phase (retriever, planner, stylist) follows, and images are then generated iteratively with critic feedback. Supports multiple VLM and image generation providers (OpenAI GPT-5.2/gpt-image-1.5, Google Gemini free tier, Azure OpenAI, OpenRouter), configured through environment variables, plus batch generation from YAML/JSON manifests and optional PDF input for context. Exposes generation workflows through a CLI (Typer), a Python API, an MCP server for IDE integration, and a local Gradio studio web UI for interactive diagram/plot creation and run browsing.
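The control flow described above can be sketched in a few lines of Python. This is only an illustration of the general pattern (optional input optimization, linear planning, then a generate/critique loop), not the project's actual code: every name here (`FigureRequest`, `RunResult`, `run_pipeline`, the callable parameters, and `max_iterations`) is hypothetical, and the agents are passed in as plain callables so the sketch stays independent of any provider.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class FigureRequest:
    """Hypothetical input: the methodology text and the figure caption."""
    methodology: str
    caption: str


@dataclass
class RunResult:
    """Hypothetical output: final image reference plus the critique history."""
    image: str
    iterations: int
    critiques: List[str] = field(default_factory=list)


def run_pipeline(
    request: FigureRequest,
    *,
    retrieve: Callable[[FigureRequest], List[str]],
    plan: Callable[[FigureRequest, List[str]], str],
    style: Callable[[str], str],
    generate: Callable[[str], str],
    critique: Callable[[str, FigureRequest], Optional[str]],
    optimize_input: Optional[Callable[[FigureRequest], FigureRequest]] = None,
    max_iterations: int = 3,
) -> RunResult:
    # Optional stage: enrich the methodology text and sharpen the caption.
    if optimize_input is not None:
        request = optimize_input(request)

    # Linear planning: retriever -> planner -> stylist, each run once.
    examples = retrieve(request)
    description = style(plan(request, examples))

    # Iterative generation: regenerate until the critic has no feedback
    # (signalled here by None) or the iteration budget is exhausted.
    critiques: List[str] = []
    image = ""
    for iteration in range(1, max_iterations + 1):
        image = generate(description)
        feedback = critique(image, request)
        if feedback is None:
            return RunResult(image, iteration, critiques)
        critiques.append(feedback)
        description = f"{description}\nRevision note: {feedback}"
    return RunResult(image, max_iterations, critiques)
```

In this sketch the critic's feedback is simply appended to the description before the next generation attempt; a real system would likely revise the description in a more structured way.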
1,142 stars and 6,878 monthly downloads. Actively maintained with 49 commits in the last 30 days. Available on PyPI.
Stars: 1,142
Forks: 160
Language: Python
License: MIT
Category:
Last pushed: Mar 11, 2026
Monthly downloads: 6,878
Commits (30d): 49
Dependencies: 14
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/llmsresearch/paperbanana"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
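The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the path follows a `category/owner/repo` pattern (inferred from the single example URL above), and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the path layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint; return parsed JSON, or raw text if it is not JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body  # response format is unverified, so fall back to text


if __name__ == "__main__":
    print(fetch_quality("generative-ai", "llmsresearch", "paperbanana"))
```

With the keyless limit of 100 requests/day, it is worth caching the result locally rather than calling the endpoint on every run.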