PKU-YuanGroup/ConsisID
[CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
ConsisID decomposes identity features into frequency components to keep faces consistent without fine-tuning, and works as a plug-and-play module for DiT-based video diffusion models. The approach draws on frequency-analysis findings about vision transformers to preserve identity information selectively while still allowing stylistic variation. It integrates with Hugging Face Diffusers (v0.33.0+) and supports multiple backbone architectures, including the CogVideoX series, with optimized inference through the TeaCache and xDiT frameworks.
835 stars. Actively maintained with 3 commits in the last 30 days.
Stars: 835
Forks: 44
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 08, 2026
Commits (30d): 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/PKU-YuanGroup/ConsisID"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
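The same request can be made from Python. The sketch below is a minimal illustration using only the standard library, not an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the `diffusion` path segment as a category is an assumption drawn from the URL above, and the page does not document the response format, so parsing the body as JSON is also an assumption.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name'.

    Treating the first path segment as a category is an assumption.
    """
    owner, name = repo.split("/", 1)
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(name)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> dict:
    """GET the endpoint and decode the body as JSON (assumed format)."""
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to send the request.
    print(build_quality_url("diffusion", "PKU-YuanGroup/ConsisID"))
```

Calling `fetch_quality("diffusion", "PKU-YuanGroup/ConsisID")` sends one request, which counts against the daily limit described above. Inspect the returned object before relying on any particular field, since its structure is not specified here.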
Higher-rated alternatives
hao-ai-lab/FastVideo
A unified inference and post-training framework for accelerated video generation.
thu-ml/TurboDiffusion
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
PKU-YuanGroup/Helios
Helios: Real Real-Time Long Video Generation Model
ModelTC/LightX2V
Light Image Video Generation Inference Framework
Lightricks/LTX-Video
Official repository for LTX-Video