Fantasy-AMAP/fantasy-talking
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Synthesizes audio-driven talking portraits by conditioning the Wan2.1 diffusion video model on wav2vec2 audio embeddings, with optional motion prompts for controllable gesture generation. Achieves efficient inference by limiting how many model parameters stay resident in GPU memory, reducing VRAM requirements from 40 GB to 5 GB, and supports both the Hugging Face and ModelScope ecosystems. Provides a Gradio interface and a ComfyUI node wrapper for accessible deployment.
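As a rough illustration of the audio-conditioning step described above, the sketch below extracts frame-level wav2vec2 embeddings with Hugging Face transformers. The checkpoint name and the untouched hidden-state output are assumptions for illustration, not code taken from the repository.

# Hypothetical sketch: extract wav2vec2 features that an audio-driven
# portrait pipeline could feed into the video diffusion model.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

AUDIO_MODEL = "facebook/wav2vec2-base-960h"  # assumed checkpoint, not the repo's

def extract_audio_embeddings(wav_path):
    waveform, sr = torchaudio.load(wav_path)
    waveform = waveform.mean(dim=0)  # collapse to mono
    if sr != 16_000:  # wav2vec2 expects 16 kHz input
        waveform = torchaudio.functional.resample(waveform, sr, 16_000)

    extractor = Wav2Vec2FeatureExtractor.from_pretrained(AUDIO_MODEL)
    model = Wav2Vec2Model.from_pretrained(AUDIO_MODEL).eval()

    inputs = extractor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, frames, 768)
    return hidden

A downstream pipeline would then align these per-frame features with the video frame rate before injecting them as conditioning.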
Stars: 1,622
Forks: 126
Language: Python
License: Apache-2.0
Category: diffusion
Last pushed: Jan 26, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/Fantasy-AMAP/fantasy-talking"
Open to everyone: 100 requests/day with no key needed. A free API key raises the limit to 1,000 requests/day.
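The same endpoint can also be queried from Python. The minimal sketch below assumes the endpoint returns JSON and that an optional key is sent as an X-API-Key header; the header name is a guess, not documented here.

# Minimal sketch of calling the quality endpoint shown in the curl example.
import requests

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def fetch_repo_quality(category, owner, name, api_key=None):
    headers = {"X-API-Key": api_key} if api_key else {}  # header name assumed
    resp = requests.get(f"{BASE}/{category}/{owner}/{name}", headers=headers, timeout=10)
    resp.raise_for_status()
    return resp.json()

print(fetch_repo_quality("diffusion", "Fantasy-AMAP", "fantasy-talking"))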
Related models
hao-ai-lab/FastVideo
A unified inference and post-training framework for accelerated video generation.
thu-ml/TurboDiffusion
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
ModelTC/LightX2V
Light Image/Video Generation Inference Framework
PKU-YuanGroup/Helios
Helios: Real-Time Long Video Generation Model
PKU-YuanGroup/MagicTime
[TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators