Tencent-Hunyuan/HunyuanCustom
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Supports subject-consistent video generation from multimodal inputs—text, images, audio, and video—through specialized injection modules including a text-image fusion layer based on LLaVA, an AudioNet for hierarchical audio alignment, and a video-driven patchify-based feature encoder. Built on HunyuanVideo, it enables downstream applications like virtual avatars, singing synthesis, and video object replacement while maintaining identity consistency across frames. Integrates with ComfyUI and HuggingFace, with optimized inference available for 8GB single-GPU setups.
Stars: 1,211
Forks: 108
Language: Python
License: —
Category:
Last pushed: Oct 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/Tencent-Hunyuan/HunyuanCustom"
Open to everyone: 100 requests/day with no key required; a free API key raises the limit to 1,000 requests/day.
Higher-rated alternatives
hao-ai-lab/FastVideo
A unified inference and post-training framework for accelerated video generation.
thu-ml/TurboDiffusion
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
ModelTC/LightX2V
Light Image Video Generation Inference Framework
PKU-YuanGroup/Helios
Helios: Real-Time Long Video Generation Model
PKU-YuanGroup/MagicTime
[TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators