zai-org/CogView2

Official code repository for the paper "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers".

44 / 100 (Emerging)

Implements a three-stage hierarchical transformer (6B-9B-9B parameters) with custom local attention kernels for efficient token generation, featuring LoPAR acceleration and bidirectional completion via CogLM. Supports both text-to-image generation and text-guided inpainting, with style control (photo, sketch, watercolor, etc.). A100 GPUs are the recommended hardware, but reducing the inference batch size lets the models run on GPUs with less memory. Built on the SwissArmyTransformer framework, with hosted versions on Hugging Face Spaces and Replicate. Primarily trained for Chinese and English text inputs.

955 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25


Stars: 955
Forks: 86
Language: Python
License: Apache-2.0
Last pushed: Aug 03, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/zai-org/CogView2"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
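
For scripted access, the same request can be made from Python. The following is a minimal sketch rather than an official client. It assumes the path follows the `category/owner/repo` shape seen in the curl example and that the endpoint returns a JSON body; neither is confirmed by this page. The names `quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single curl example on this page.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{API_BASE}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the response (hypothetical helper).

    Assumption: the response body is JSON. Its schema is not documented
    here, so the decoded value is returned without further interpretation.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "zai-org", "CogView2")` issues the same request as the curl command. With the keyless limit of 100 requests/day, caching the result locally is advisable when polling many repositories.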