PKU-YuanGroup/ConsisID

[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition

49 / 100

Emerging

Decomposes identity features across frequency components to maintain facial consistency without fine-tuning, operating as a plug-and-play module for DiT-based video diffusion models. The approach leverages frequency analysis insights from vision transformers to selectively preserve identity information while allowing stylistic variation. Integrates with Hugging Face Diffusers (v0.33.0+) and supports multiple backbone architectures including CogVideoX series, with optimized inference via TeaCache and xDiT frameworks.
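The idea of splitting a signal into frequency components can be illustrated with a small sketch. The code below is not ConsisID's implementation and does not use its API. It is a generic NumPy example that separates a 2-D array (a stand-in for a face image) into a low-frequency part, which carries coarse global structure, and a high-frequency part, which carries fine detail. The function name `split_frequencies` and the `cutoff` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np


def split_frequencies(image, cutoff=0.1):
    """Split a 2-D array into low- and high-frequency parts.

    Illustrative only: a circular low-pass mask in the Fourier domain.
    `cutoff` is a radius in normalized frequency units (cycles per sample).
    """
    h, w = image.shape
    # Centered 2-D spectrum of the input.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Normalized frequency coordinates matching the shifted spectrum.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    low_mask = np.sqrt(fx**2 + fy**2) <= cutoff
    # Keep only frequencies inside the mask, then return to pixel space.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    # Whatever the low-pass filter removed is the high-frequency residual.
    high = image - low
    return low, high


# Random array used as a placeholder for a real face crop.
rng = np.random.default_rng(0)
face = rng.random((64, 64))
low, high = split_frequencies(face)
```

By construction the two parts sum back to the input, so the split loses no information. A method built on this kind of decomposition can then treat the coarse and fine components differently.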

835 stars. Actively maintained with 3 commits in the last 30 days.

No Package · No Dependents

Maintenance 16 / 25

Adoption 10 / 25

Maturity 9 / 25

Community 14 / 25


Stars: 835

Forks

Language: Python

License: Apache-2.0

Higher-rated alternatives

hao-ai-lab/FastVideo

A unified inference and post-training framework for accelerated video generation.

thu-ml/TurboDiffusion

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

PKU-YuanGroup/Helios

Helios: Real Real-Time Long Video Generation Model

ModelTC/LightX2V

Light Image Video Generation Inference Framework

Lightricks/LTX-Video

Official repository for LTX-Video
