donahowe/AutoStudio

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

/ 100

Emerging

AutoStudio employs a training-free multi-agent LLM framework that coordinates subject management, spatial layout generation, and refinement supervision, paired with a Parallel-UNet diffusion model whose dual cross-attention modules keep multiple subjects consistent. It integrates Stable Diffusion (v1.5 and SDXL), IP-Adapter, Grounding-DINO, and EfficientSAM for dialogue-driven image generation across multiple turns while maintaining subject identity and spatial coherence.
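The subject-management idea can be illustrated with a minimal sketch. This is not AutoStudio's code: `Subject`, `SubjectRegistry`, and `plan_turn` are hypothetical names, and the layout dictionaries stand in for what a layout agent might propose. The sketch only shows how a subject mentioned in several dialogue turns could keep one stable ID while its bounding box changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # normalized (x0, y0, x1, y1)


@dataclass
class Subject:
    subject_id: int
    description: str
    box: Box


@dataclass
class SubjectRegistry:
    """Hypothetical stand-in for a subject manager: maps descriptions to stable IDs."""
    _ids: Dict[str, int] = field(default_factory=dict)

    def resolve(self, description: str) -> int:
        # Normalize so "A red fox" and "a red fox" refer to the same subject.
        key = description.strip().lower()
        if key not in self._ids:
            self._ids[key] = len(self._ids)
        return self._ids[key]


def plan_turn(registry: SubjectRegistry, layout: Dict[str, Box]) -> List[Subject]:
    """Attach stable subject IDs to one turn's proposed layout (assumed input format)."""
    subjects = []
    for description, box in layout.items():
        x0, y0, x1, y1 = box
        # Reject boxes that are empty or fall outside the unit square.
        if not (0.0 <= x0 < x1 <= 1.0 and 0.0 <= y0 < y1 <= 1.0):
            raise ValueError(f"invalid box for {description!r}: {box}")
        subjects.append(Subject(registry.resolve(description), description, box))
    return subjects


registry = SubjectRegistry()
turn1 = plan_turn(registry, {
    "a red fox": (0.1, 0.4, 0.5, 0.9),
    "an old oak tree": (0.6, 0.0, 1.0, 1.0),
})
turn2 = plan_turn(registry, {
    "A red fox": (0.5, 0.5, 0.9, 0.95),   # same subject, new position
    "a small owl": (0.1, 0.1, 0.3, 0.3),  # new subject gets a fresh ID
})
print([s.subject_id for s in turn1], [s.subject_id for s in turn2])
```

In a real pipeline the stable ID would presumably select a stored reference image or embedding for the subject. Here it only demonstrates the bookkeeping.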

449 stars. No commits in the last 6 months.

No License, Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 13 / 25

Stars

449

Forks

Language

Jupyter Notebook

License

—

Higher-rated alternatives

open-mmlab/FoleyCrafter

[IJCV] FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds....

kyegomez/Sora

Implementation of the premier Text to Video model from OpenAI

ShubhamZade1997/Pixelle-Video

🎥 Create stunning short videos effortlessly with Pixelle-Video, an AI-driven engine designed for...

CharlieDreemur/AI-Video-Converter

AI Video Converter Based on ControlNet

abrahamjroy/ProcessedElectricSheepDreams

A native application, with agentic support (MCP), for ultra-fast AI image generation using a...
