HunyuanVideo and HunyuanCustom

| | HunyuanVideo | HunyuanCustom |
|---|---|---|
| Overall score | 52 (Established) | 49 (Emerging) |
| Maintenance | 6/25 | 6/25 |
| Adoption | 10/25 | 10/25 |
| Maturity | 16/25 | 15/25 |
| Community | 20/25 | 18/25 |
| Stars | 11,847 | 1,211 |
| Forks | 1,209 | 108 |
| Commits (30d) | 0 | 0 |
| Language | Python | Python |
| Downloads | (not listed) | (not listed) |
| License | (not listed) | (not listed) |
| Package | None | None |
| Dependents | None | None |

About HunyuanVideo

Tencent-Hunyuan/HunyuanVideo

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Employs a unified diffusion architecture for both image and video generation, using a multimodal large language model as the text encoder and a 3D VAE for efficient spatiotemporal compression. Integrates with HuggingFace Diffusers and supports multi-GPU sequence-parallel inference via xDiT for accelerated generation, with quantized FP8 weights to reduce memory overhead. Includes a prompt-rewriting module to improve text-to-video quality, and extends to specialized variants for image-to-video, audio-driven animation, and customized video synthesis.
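
The Diffusers integration mentioned above can be sketched roughly as follows. This is a minimal, hedged example, assuming a recent `diffusers` release that ships `HunyuanVideoPipeline`, a CUDA GPU with sufficient memory, and the community-mirrored checkpoint id `hunyuanvideo-community/HunyuanVideo` (the exact model id and resolution limits may differ in your environment):

```python
# Sketch: text-to-video with HunyuanVideo via HuggingFace Diffusers.
# Assumes diffusers >= 0.32 (HunyuanVideoPipeline), torch with CUDA,
# and downloaded weights; not runnable without a capable GPU.
import torch
from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel
from diffusers.utils import export_to_video

model_id = "hunyuanvideo-community/HunyuanVideo"  # assumed checkpoint mirror

# Load the DiT backbone in bf16, the rest of the pipeline in fp16.
transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16
)
pipe = HunyuanVideoPipeline.from_pretrained(
    model_id, transformer=transformer, torch_dtype=torch.float16
)
pipe.vae.enable_tiling()  # tile the 3D VAE decode to lower peak memory
pipe.to("cuda")

# Generate a short low-resolution clip and write it to disk.
frames = pipe(
    prompt="A cat walks on the grass, realistic style.",
    height=320,
    width=512,
    num_frames=61,
    num_inference_steps=30,
).frames[0]
export_to_video(frames, "output.mp4", fps=15)
```

The FP8-quantized weights and xDiT sequence-parallel multi-GPU path noted in the blurb are provided by the upstream repository's own scripts rather than this Diffusers entry point.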

About HunyuanCustom

Tencent-Hunyuan/HunyuanCustom

HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation

Supports subject-consistent video generation from multimodal inputs (text, images, audio, and video) through specialized injection modules: a LLaVA-based text-image fusion layer, an AudioNet for hierarchical audio alignment, and a patchify-based feature encoder for video-driven conditioning. Built on HunyuanVideo, it enables downstream applications such as virtual avatars, singing synthesis, and video object replacement while maintaining identity consistency across frames. Integrates with ComfyUI and HuggingFace, with optimized inference available for single-GPU setups with 8 GB of VRAM.

Scores updated daily from GitHub, PyPI, and npm data.