Fantasy-AMAP/fantasy-talking
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Synthesizes audio-driven talking portraits by conditioning the Wan2.1 diffusion video model on wav2vec2 audio embeddings, with optional motion prompts for controllable gesture generation. Achieves efficient inference by limiting how many model parameters stay resident in GPU memory, reducing VRAM requirements from 40 GB to 5 GB, and supports both the Hugging Face and ModelScope ecosystems. Provides a Gradio interface and a ComfyUI node wrapper for accessible deployment.
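As a rough illustration of the audio-conditioning step described above, the sketch below extracts frame-level wav2vec2 embeddings with Hugging Face transformers. The checkpoint name and the untouched hidden-state output are assumptions for illustration, not code taken from the repository.

# Hypothetical sketch: extract wav2vec2 features that an audio-driven
# portrait pipeline could feed into the video diffusion model.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

AUDIO_MODEL = "facebook/wav2vec2-base-960h"  # assumed checkpoint, not the repo's

def extract_audio_embeddings(wav_path):
    waveform, sr = torchaudio.load(wav_path)
    waveform = waveform.mean(dim=0)  # collapse to mono
    if sr != 16_000:  # wav2vec2 expects 16 kHz input
        waveform = torchaudio.functional.resample(waveform, sr, 16_000)

    extractor = Wav2Vec2FeatureExtractor.from_pretrained(AUDIO_MODEL)
    model = Wav2Vec2Model.from_pretrained(AUDIO_MODEL).eval()

    inputs = extractor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, frames, 768)
    return hidden

A downstream pipeline would then align these per-frame features with the video frame rate before injecting them as conditioning.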
Stars: 1,622
Forks: 126
Language: Python
License: Apache-2.0
Category: diffusion
Last pushed: Jan 26, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/Fantasy-AMAP/fantasy-talking"
Open to everyone: 100 requests/day with no key needed. A free API key raises the limit to 1,000 requests/day.
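The same endpoint can also be queried from Python. The minimal sketch below assumes the endpoint returns JSON and that an optional key is sent as an X-API-Key header; the header name is a guess, not documented here.

# Minimal sketch of calling the quality endpoint shown in the curl example.
import requests

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def fetch_repo_quality(category, owner, name, api_key=None):
    headers = {"X-API-Key": api_key} if api_key else {}  # header name assumed
    resp = requests.get(f"{BASE}/{category}/{owner}/{name}", headers=headers, timeout=10)
    resp.raise_for_status()
    return resp.json()

print(fetch_repo_quality("diffusion", "Fantasy-AMAP", "fantasy-talking"))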
Related models
hao-ai-lab/FastVideo
A unified inference and post-training framework for accelerated video generation.
thu-ml/TurboDiffusion
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
ModelTC/LightX2V
Light Image/Video Generation Inference Framework
PKU-YuanGroup/Helios
Helios: Real-Time Long Video Generation Model
PKU-YuanGroup/MagicTime
[TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators