zai-org/CogView4
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Implements text-to-image generation using a Diffusion Transformer (DiT) architecture with cascading diffusion and relay diffusion frameworks, and supports variable resolutions up to 2048×2048 pixels. CogView4 (6B parameters) uses the GLM-4-9B text encoder with native bilingual support, while CogView3-Plus uses a T5-XXL encoder optimized for English. The project integrates with the Hugging Face Diffusers pipeline and supports CPU offloading strategies that reduce the memory footprint from 35 GB to 13 GB on consumer hardware.
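The Diffusers integration and CPU offloading described above can be sketched roughly as follows. This is a minimal sketch, not the project's reference script: the checkpoint ID `THUDM/CogView4-6B`, the helper names `check_resolution` and `load_cogview4`, and the particular combination of offloading calls are assumptions, and the listing does not say which settings produce the quoted 35 GB and 13 GB figures.

```python
MAX_SIDE = 2048  # upper bound taken from the description ("up to 2048×2048")


def check_resolution(width, height, max_side=MAX_SIDE):
    """Hypothetical helper: reject sizes beyond the documented maximum."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width > max_side or height > max_side:
        raise ValueError(f"each side must be at most {max_side} pixels")
    return width, height


def load_cogview4(model_id="THUDM/CogView4-6B", offload=True):
    """Load the pipeline, optionally trading speed for lower GPU memory.

    The model ID is an assumption; substitute the checkpoint you actually use.
    Heavy imports are kept inside the function so this module can be imported
    without torch or diffusers installed.
    """
    import torch
    from diffusers import CogView4Pipeline

    pipe = CogView4Pipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    if offload:
        # Keep submodels on the CPU and move each to the GPU only while it runs.
        pipe.enable_model_cpu_offload()
        # Decode latents in slices and tiles to lower peak VAE memory.
        pipe.vae.enable_slicing()
        pipe.vae.enable_tiling()
    else:
        pipe.to("cuda")
    return pipe
```

With a GPU available, `pipe = load_cogview4()` followed by `pipe(prompt="a red bicycle", width=1024, height=1024).images[0]` would be the expected way to generate an image; validating the size with `check_resolution` first avoids starting a long run with an unsupported resolution.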
1,106 stars. No commits in the last 6 months.
Stars: 1,106
Forks: 80
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 29, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/zai-org/CogView4"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
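The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the listing does not document the response format, so JSON is assumed, and the idea that other categories and repositories follow the same path pattern is inferred from the single URL shown above. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(repo, category="diffusion"):
    """Build the endpoint URL for an 'owner/name' repository string."""
    owner, name = repo.split("/", 1)
    return f"{BASE_URL}/{category}/{owner}/{name}"


def fetch_quality(repo, category="diffusion", timeout=10):
    """Fetch the quality record; assumes the endpoint returns JSON."""
    with urllib.request.urlopen(quality_url(repo, category), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("zai-org/CogView4")` issues one request, which counts against the daily limit mentioned above.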
Higher-rated alternatives
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
EndlessSora/focal-frequency-loss
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis
JIA-Lab-research/DreamOmni2
This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing...
PKU-YuanGroup/ChronoMagic-Bench
[NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of...