Alpha-VLLM/Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

42 / 100 (Emerging)

Lumina-T2X uses flow-based diffusion transformers to generate images, audio, video, and other modalities from text prompts at variable resolutions and durations, all within a single unified architecture. It integrates with Hugging Face Diffusers and supports inference and training workflows, including DreamBooth, with pre-trained checkpoints available in multiple model sizes (2B to 5B parameters). Large transformer models serve as the diffusion backbone, which enables compositional generation and multi-modal control beyond standard text-to-image pipelines.

2,254 stars. No commits in the last 6 months.

Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 16 / 25

Stars: 2,254
Forks: 95
Language: Python
License: MIT
Last pushed: Feb 16, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Alpha-VLLM/Lumina-T2X"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
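
The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which this page does not document, and the helper names `quality_url` and `fetch_quality` are hypothetical, introduced here for illustration only.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot break the path.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository (hypothetical helper).

    Assumption: the response body is JSON. The schema is not described
    on this page, so the parsed value is returned without interpretation.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the same URL that the curl command requests.
    print(quality_url("transformers", "Alpha-VLLM", "Lumina-T2X"))
```

Calling `fetch_quality("transformers", "Alpha-VLLM", "Lumina-T2X")` performs the actual request. With the keyless limit of 100 requests/day, caching the result locally is advisable if the script runs repeatedly.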