HorizonWind2004/reconstruction-alignment

[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.

48 / 100 (Emerging)

Reconstruction Alignment (RecA) uses self-supervised cross-modal reconstruction as an auxiliary task to align visual and semantic features in unified multimodal models, yielding zero-shot improvements without additional labeled data. The approach is architecture-agnostic: it has been validated on Show-o (VQGAN/CLIP), BAGEL, Harmon, OpenUni, and MMaDA. Its reported compute cost is modest (6 A100 GPUs for 4.5 hours), and it improves both generation quality and image editing. Models trained with RecA achieve competitive results on the GenEval and DPGBench benchmarks, with 1.5B-parameter variants outperforming larger 7B to 24B parameter baselines on zero-shot tasks.
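To make the idea of a self-supervised reconstruction objective concrete, here is a deliberately tiny sketch. It is not the repository's code and not RecA's actual architecture: the linear `encode` and `generate` functions, the list-based "image", and the learning rate are all hypothetical stand-ins chosen for illustration. The only point it shows is the general pattern of a frozen encoder producing embeddings, a generator reconstructing the input from them, and the input itself serving as the training target, so no labels are needed.

```python
# Toy illustration only: linear maps stand in for the real encoder/generator.

def encode(image, encoder):
    """Frozen 'understanding' encoder: one embedding value per encoder row."""
    return [sum(w * x for w, x in zip(row, image)) for row in encoder]

def generate(embedding, generator):
    """'Generator' conditioned on the embedding: one output value per row."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in generator]

def reconstruction_loss(image, encoder, generator):
    """Mean squared error between the image and its reconstruction."""
    recon = generate(encode(image, encoder), generator)
    return sum((r - x) ** 2 for r, x in zip(recon, image)) / len(image)

def train_step(image, encoder, generator, lr=0.5):
    """One gradient-descent step on the generator; the encoder stays frozen."""
    emb = encode(image, encoder)
    recon = generate(emb, generator)
    d = len(image)
    # d(loss)/d(generator[i][j]) = 2/d * (recon[i] - image[i]) * emb[j]
    return [
        [w - lr * 2.0 * (recon[i] - image[i]) * e / d
         for w, e in zip(generator[i], emb)]
        for i in range(d)
    ]

if __name__ == "__main__":
    image = [0.2, 0.8, 0.5, 0.1]                    # hypothetical 4-pixel image
    encoder = [[0.5, 0.5, -0.5, 0.5],               # hypothetical frozen weights
               [0.1, 0.3, -0.2, 0.4]]
    generator = [[0.0, 0.0] for _ in image]         # trainable weights
    for step in range(5):
        print(step, reconstruction_loss(image, encoder, generator))
        generator = train_step(image, encoder, generator)
```

The supervision signal comes entirely from the input image, which is what makes the objective self-supervised; how the real models condition on embeddings and which loss they use is defined by the paper and repository, not by this sketch.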

378 stars.

No package. No dependents.
Maintenance 13 / 25
Adoption 10 / 25
Maturity 15 / 25
Community 10 / 25


Stars: 378
Forks: 15
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/HorizonWind2004/reconstruction-alignment"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
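The same request can be made from Python. The sketch below uses only the standard library and reuses the URL from the curl command above; the helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else here is illustrative.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name'."""
    # quote() keeps '/' by default, so the owner/name separator is preserved.
    return f"{API_BASE}/{quote(category)}/{quote(repo)}"

def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, ASSUMING the body is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    data = fetch_quality("diffusion", "HorizonWind2004/reconstruction-alignment")
    print(data)
```

Without a key, the stated limit of 100 requests/day applies, so callers that poll should cache results rather than refetch on every run.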