T2I Evaluation Benchmarks Diffusion Models

Benchmarks, datasets, and metrics for evaluating text-to-image generation quality and alignment. Does NOT include tools for generating images, training models, or prompt optimization.

50 T2I evaluation benchmark projects are tracked. One scores above 70 (Verified tier). The highest-rated is Vchitect/VBench at 73/100, with 1,537 stars and 3,530 monthly downloads. One of the top 10 is actively maintained.

Get all 50 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=diffusion&subcategory=t2i-evaluation-benchmarks&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
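The same request can be made from a script. The following is a minimal Python sketch, not an official client: the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented on this page, so the code only decodes the response and does not assume any field names. The request above uses `limit=20`; whether the endpoint accepts a larger value to return all 50 projects in one call is an assumption to verify.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="diffusion", subcategory="t2i-evaluation-benchmarks", limit=20):
    """Build the query URL. Defaults mirror the curl example on this page."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    Hypothetical helper: the response structure is not documented here,
    so the decoded object is returned as-is.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_projects(build_url()) to issue the request.
    print(build_url())
```

Each call to `fetch_projects` counts against the daily request quota, so caching the decoded result locally is a reasonable choice when iterating on analysis code.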

#	Project and description	Score	Tier	Stars	Language
1	Vchitect/VBench [CVPR2024 Highlight] VBench - We Evaluate Video Generation	73	Verified	1,537	Python
2	VectorSpaceLab/OmniGen OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340	64	Established	4,313	Jupyter Notebook
3	EndlessSora/focal-frequency-loss [ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis	52	Established	706	Python
4	JIA-Lab-research/DreamOmni2 This project is the official implementation of 'DreamOmni2: Multimodal...	44	Emerging	2,273	Python
5	PKU-YuanGroup/ChronoMagic-Bench [NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic...	43	Emerging	210	Python
6	SkyworkAI/UniPic Open-source SOTA multi-image editing model	42	Emerging	863	Python
7	Amshaker/Mobile-O Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device	40	Emerging	123	Python
8	ViStoryBench/vistorybench [CVPR 2026] ViStoryBench: AI Story Visualization Benchmark	39	Emerging	139	Python
9	nupurkmr9/syncd SynCD: Generating Multi-Image Synthetic Data for Text-to-Image Customization...	38	Emerging	154	Python
10	uni-medical/UniMedVL Official implementation of "UniMedVL: Unifying Medical Multimodal...	38	Emerging	66	Python
11	zai-org/CogView2 official code repo for paper "CogView2: Faster and Better Text-to-Image...	37	Emerging	955	Python
12	Karine-Huang/T2I-CompBench [NeurIPS 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image...	37	Emerging	334	Python
13	zai-org/CogView4 CogView4, CogView3-Plus and CogView3(ECCV 2024)	36	Emerging	1,106	Python
14	tobran/GALIP [CVPR2023] A faster, smaller, and better text-to-image model for large-scale training	35	Emerging	247	Python
15	OpenGVLab/GenExam GenExam: A Multidisciplinary Text-to-Image Exam	35	Emerging	62	Python
16	AIDC-AI/Ovis-U1 An unified model that seamlessly integrates multimodal understanding,...	35	Emerging	452	Python
17	JustusThies/NeuralTexGen Image-space texture optimization of 3D meshes using PyTorch	33	Emerging	73	Python
18	humansensinglab/ITI-GEN [ICCV 2023 Oral, Best Paper Finalist] ITI-GEN: Inclusive Text-to-Image Generation	32	Emerging	69	Python
19	inclusionAI/Ming-UniVision Code release for Ming-UniVision: Joint Image Understanding and Generation...	32	Emerging	142	Python
20	360CVGroup/PlanGen Unified layout planning and image generation, ICCV2025	29	Experimental	41	Python
21	lxa9867/ImageFolder High-performance Image Tokenizers for VAR and AR	28	Experimental	303	Python
22	boomb0om/text2image-benchmark Benchmark for generative image models	27	Experimental	108	Jupyter Notebook
23	FoundationVision/OmniTokenizer [NeurIPS 2024]OmniTokenizer: one model and one weight for image-video joint...	27	Experimental	323	Python
24	KlingAIResearch/IMBA-Loss [ICCV 2025] Official Implementation of the Paper "Imbalance in Balance:...	27	Experimental	11	Python
25	migs2021/migs MIGS: Meta Image Generation from Scene Graphs (BMVC 2021)	27	Experimental	8	Python
26	microsoft/BizGenEval Bridging the gap between image generation and real-world design: a benchmark...	27	Experimental	10	Python
27	GordonChen19/STENCIL [ICIP2025 Spotlight] Efficient and High-Fidelity Image Generation	26	Experimental	3	JavaScript
28	EPFL-VILAB/search-over-tokens SoT is a framework for test-time search in autoregressive (AR) image...	26	Experimental	6	Jupyter Notebook
29	roeiherz/CanonicalSg2Im Code for "Learning Canonical Representations for Scene Graph to Image...	25	Experimental	30	Python
30	bcmi/F2GAN-Few-Shot-Image-Generation Fusing-and-Filling GAN (F2GAN) for few-shot image generation, ACM MM2020	24	Experimental	79	Python
31	yongchoooon/stellar [AAAI'26 Workshops Oral] STELLAR: Scene Text Editor for Low-Resource...	23	Experimental	6	C++
32	TIGER-AI-Lab/VIEScore Visual Instruction-guided Explainable Metric. Code for "Towards Explainable...	23	Experimental	67	Python
33	ali-vilab/IDEA-Bench Official repository of IDEA-Bench	22	Experimental	39	Python
34	yunqing-me/A-Closer-Look-at-FSIG The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2022	22	Experimental	9	Python
35	hysts/CogView2_demo Unofficial demo app for CogView2	20	Experimental	16	Python
36	1jsingh/Divide-Evaluate-and-Refine Repo for our NeurIPS 2023 paper on: Divide, Evaluate, and Refine: Evaluating...	20	Experimental	27	Jupyter Notebook
37	matsuolab/multibanana [CVPR 2026 Main] MultiBanana: A Challenging Benchmark for Multi-Reference...	20	Experimental	20	Python
38	zeyofu/Commonsense-T2I Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models...	19	Experimental	24	Python
39	wzhlearning/Tex2Sem Official Implementation of “Tex2Sem: Learning from Textures to Semantics...	19	Experimental	5	Jupyter Notebook
40	bowen-upenn/ControlText ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering...	19	Experimental	34	Python
41	FtmsdtHosseini/IDPL-PFOD An Image Dataset of Printed Farsi Text for OCR Research	18	Experimental	25	—
42	AIGCResearch/styleme3d Official repo for StyleMe3D	17	Experimental	28	—
43	yczhou001/LongBench-T2I Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex...	17	Experimental	23	Python
44	360CVGroup/HiCo_T2I Layout Conditioned Image Generation, NeurIPS2024	17	Experimental	65	Python
45	hadi-hosseini/T2I-FineEval [ECCV 2024 Workshop EVAL-FoMo] T2I-FineEval: Fine-Grained Compositional...	17	Experimental	6	Python
46	j-min/VPGen Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)	16	Experimental	57	Jupyter Notebook
47	K1nght/T2I-ConBench T2I-ConBench: Text-to-Image Benchmark for Continual Post-training	15	Experimental	5	Python
48	pmh9960/GCDP Official PyTorch implementation of "Learning to Generate Semantic Layouts...	14	Experimental	46	Python
49	AIGCResearch/Awesome-Story-Visualization A Survey of Story Visualization	12	Experimental	1	—
50	HaoyuanYang-2023/ImagineFSL Official implementation of "ImagineFSL: Self-Supervised Pretraining Matters...	10	Experimental	26	Python

Comparisons in this category

VBench and ChronoMagic-Bench (73 vs 43)
VBench and vistorybench (73 vs 39)