JIA-Lab-research/DreamOmni2
This project is the official implementation of "DreamOmni2: Multimodal Instruction-based Editing and Generation".
DreamOmni2 uses a unified diffusion-based architecture with separate LoRA modules for editing and generation, relying on multimodal encoders to process text instructions together with reference images that convey either concrete objects or abstract attributes. It supports subject-driven generation with identity/pose consistency as well as inpainting-aware editing that preserves non-edited regions while accepting visual references alongside natural-language prompts. The models are available on Hugging Face with web demo interfaces and are integrated with ComfyUI for production workflows.
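The split between an editing LoRA and a generation LoRA can be sketched as a simple dispatch. Everything below (class names, file paths, task labels) is a hypothetical illustration of the design described above, not the repository's actual API:

```python
from dataclasses import dataclass

# Hypothetical sketch of the two-LoRA dispatch; names and paths are
# illustrative and do not reflect DreamOmni2's real code.

@dataclass
class LoRAConfig:
    name: str          # adapter identifier, e.g. "edit_lora"
    weights_path: str  # path to the LoRA weight file

class DreamOmni2Sketch:
    """Routes each task to the matching LoRA adapter."""

    def __init__(self, edit_lora: LoRAConfig, gen_lora: LoRAConfig):
        # Inpainting-aware editing and subject-driven generation
        # use different adapters on the same base diffusion model.
        self.loras = {"edit": edit_lora, "generate": gen_lora}

    def select_lora(self, task: str) -> LoRAConfig:
        if task not in self.loras:
            raise ValueError(f"unknown task: {task!r}")
        return self.loras[task]

model = DreamOmni2Sketch(
    edit_lora=LoRAConfig("edit_lora", "loras/edit.safetensors"),
    gen_lora=LoRAConfig("gen_lora", "loras/gen.safetensors"),
)
print(model.select_lora("edit").name)      # edit_lora
print(model.select_lora("generate").name)  # gen_lora
```

In the actual project the selected adapter would be loaded into the base diffusion pipeline before inference; this sketch only captures the routing decision.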
Stars: 2,273
Forks: 191
Language: Python
License: Apache-2.0
Category:
Last pushed: Oct 20, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/JIA-Lab-research/DreamOmni2"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
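The same endpoint can be called from Python. The URL pattern follows the curl example above; the shape of the JSON response is not documented on this page, so the sketch below fetches it as raw JSON without assuming any fields:

```python
import json
import urllib.request

# Base URL taken from the curl example above; the "diffusion" category
# segment and the owner/repo path are the only documented parts.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    # Network call; counts against the 100 requests/day anonymous limit.
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

print(quality_url("diffusion", "JIA-Lab-research", "DreamOmni2"))
```

Calling `fetch_quality("diffusion", "JIA-Lab-research", "DreamOmni2")` would return the same data as the curl command.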
Related models
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
EndlessSora/focal-frequency-loss
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis
PKU-YuanGroup/ChronoMagic-Bench
[NeurIPS 2024 D&B Spotlight] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of...
SkyworkAI/UniPic
Open-source SOTA multi-image editing model