bytedance/UNO
[ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
Uses in-context generation with diffusion transformers to synthesize high-consistency multi-subject paired training data, then trains the model with progressive cross-modal alignment and universal rotary position embeddings. Builds on FLUX.1-dev as the base diffusion model and supports both single- and multi-image conditioning for subject-driven generation. Provides training and inference implementations with fp8 quantization support for consumer GPUs (~16 GB VRAM), plus a Hugging Face dataset (UNO-1M) and pre-trained weights.
1,353 stars. No commits in the last 6 months.
Stars
1,353
Forks
77
Language
Python
License
Apache-2.0
Category
diffusion
Last pushed
Sep 12, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/bytedance/UNO"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
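For scripted access, here is a minimal Python sketch of the same request as the curl command above. It assumes the endpoint returns JSON (the response schema is not documented here), so it simply prints whatever payload comes back.

import json
import urllib.request

# Same endpoint as the curl example above; "diffusion" is the
# repository's category segment in the API path.
URL = "https://pt-edge.onrender.com/api/v1/quality/diffusion/bytedance/UNO"

# Fetch and parse the JSON payload (free tier: 100 requests/day without a key).
with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)

# Print the raw payload; adapt field access once the response schema is known.
print(json.dumps(data, indent=2))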
Higher-rated alternatives
UCSC-VLAA/story-iter
[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks,...
keivalya/mini-vla
a minimal, beginner-friendly VLA to show how robot policies can fuse images, text, and states to...
adobe-research/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
byliutao/1Prompt1Story
🔥 ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation...