zai-org/CogView4

CogView4, CogView3-Plus and CogView3 (ECCV 2024)

/ 100

Emerging

Implements text-to-image generation with a Diffusion Transformer (DiT) architecture and cascading and relay diffusion frameworks, supporting variable resolutions up to 2048×2048 pixels. CogView4 (6B parameters) uses the GLM-4-9B text encoder with native bilingual (Chinese and English) support, while CogView3-Plus uses a T5-XXL encoder optimized for English. Integrates with the Hugging Face Diffusers pipeline and supports CPU offloading strategies that reduce the memory footprint from 35GB to 13GB on consumer hardware.
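The Diffusers integration and CPU offloading described above could be exercised roughly as in the sketch below. It is not the project's official example. The model ID `THUDM/CogView4-6B`, the step count and the guidance scale are assumptions, and the helper names `build_generation_kwargs` and `generate` are hypothetical. The only size check applied is the 2048-pixel upper bound stated in the description; the real pipeline may impose further constraints.

```python
MAX_SIDE = 2048  # upper bound on each side, taken from the description above


def build_generation_kwargs(prompt: str, width: int = 1024, height: int = 1024) -> dict:
    """Validate the requested size and collect the pipeline call arguments."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    for name, side in (("width", width), ("height", height)):
        if side <= 0 or side > MAX_SIDE:
            raise ValueError(f"{name}={side} is outside 1..{MAX_SIDE}")
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_inference_steps": 50,  # assumed value, tune as needed
        "guidance_scale": 3.5,      # assumed value, tune as needed
    }


def generate(prompt: str, width: int = 1024, height: int = 1024,
             model_id: str = "THUDM/CogView4-6B"):  # assumed Hugging Face model ID
    """Run CogView4 through Diffusers with model CPU offloading enabled."""
    # Imported lazily so the helper above works without torch/diffusers installed.
    import torch
    from diffusers import CogView4Pipeline

    pipe = CogView4Pipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    # Keep only the currently active sub-model on the GPU (slower, lower VRAM).
    pipe.enable_model_cpu_offload()
    # Decode latents in slices and tiles to lower peak memory in the VAE.
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()
    return pipe(**build_generation_kwargs(prompt, width, height)).images[0]
```

Calling `generate("a red panda on a bamboo bridge")` should return a PIL image, provided a CUDA GPU, a recent `diffusers` release and the downloaded weights are available. Actual memory use depends on resolution and precision, so it may differ from the figures quoted above.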

1,106 stars. No commits in the last 6 months.

Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 17 / 25


Stars

1,106

Forks

Language

Python

License

Apache-2.0

Higher-rated alternatives

Vchitect/VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

VectorSpaceLab/OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

EndlessSora/focal-frequency-loss

[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

JIA-Lab-research/DreamOmni2

This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing...

PKU-YuanGroup/ChronoMagic-Bench

[NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of...
