DiffGesture and DiffuseStyleGesture
Both tools address audio-driven co-speech gesture generation using diffusion models, but their focus differs: DiffGesture emphasizes the core diffusion-based generation approach, while DiffuseStyleGesture extends it with explicit style control. This makes them complementary techniques that could be combined rather than direct competitors.
About DiffGesture
Advocate99/DiffGesture
[CVPR'2023] Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
Employs a Diffusion Audio-Gesture Transformer architecture to jointly model cross-modal audio-to-skeleton associations while preserving temporal coherence through an annealed noise sampling strategy. Integrates classifier-free guidance for diversity-quality trade-offs and uses pretrained autoencoders (from HA2G) for perceptual metrics on the TED Gesture and TED Expressive datasets. Supports both short and long video synthesis, generating skeleton sequences conditioned on audio input.
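To illustrate the classifier-free guidance mentioned above, here is a minimal sketch (all names are hypothetical, not DiffGesture's actual API): the denoiser is evaluated once with the audio condition and once with a null condition, and the two noise predictions are blended by a guidance scale that trades diversity against fidelity to the audio.

```python
# Sketch only: classifier-free guidance for an audio-conditioned gesture
# diffusion model. Names (cfg_denoise, ToyDenoiser) are illustrative,
# not taken from the DiffGesture codebase.
import torch

def cfg_denoise(model, x_t, t, audio_cond, null_cond, guidance_scale=2.5):
    """Blend conditional and unconditional noise predictions."""
    eps_cond = model(x_t, t, audio_cond)    # conditioned on audio features
    eps_uncond = model(x_t, t, null_cond)   # condition dropped (null token)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

class ToyDenoiser(torch.nn.Module):
    """Stand-in denoiser so the call pattern above is runnable."""
    def __init__(self, pose_dim=126, cond_dim=64):
        super().__init__()
        self.net = torch.nn.Linear(pose_dim + cond_dim + 1, pose_dim)

    def forward(self, x_t, t, cond):
        t_feat = t.float().view(-1, 1, 1).expand(-1, x_t.shape[1], 1)
        return self.net(torch.cat([x_t, cond, t_feat], dim=-1))

B, T, pose_dim, cond_dim = 2, 34, 126, 64    # batch, frames, skeleton dims, audio dims
model = ToyDenoiser(pose_dim, cond_dim)
x_t = torch.randn(B, T, pose_dim)            # noisy skeleton sequence
t = torch.full((B,), 50, dtype=torch.long)   # current diffusion timestep
audio_cond = torch.randn(B, T, cond_dim)     # per-frame audio features
null_cond = torch.zeros_like(audio_cond)     # unconditional placeholder

eps = cfg_denoise(model, x_t, t, audio_cond, null_cond)
print(eps.shape)  # torch.Size([2, 34, 126])
```

A higher guidance scale pulls samples closer to the audio-conditional prediction (better synchrony, less variety); a scale near 1.0 keeps more of the unconditional diversity.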
About DiffuseStyleGesture
YoungSeng/DiffuseStyleGesture
DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models (IJCAI 2023) | The DiffuseStyleGesture+ entry to the GENEA Challenge 2023 (ICMI 2023, Reproducibility Award)
Leverages diffusion models with WavLM audio embeddings to generate stylized full-body gestures conditioned on speech, supporting controllable style and intensity parameters. The architecture uses LMDB-based training pipelines on mocap datasets (ZEGGS, BEAT, TWH) and outputs motion in BVH format compatible with Blender visualization. Implements motion matching variants (QPGesture) and multi-dataset training (UnifiedGesture) as downstream extensions, with pre-trained checkpoints available for inference.
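To show what the controllable style and intensity parameters amount to in practice, here is an illustrative sketch (the class and function names are hypothetical, not DiffuseStyleGesture's interface): a discrete style label and an intensity scalar are fused with per-frame audio embeddings, such as WavLM hidden states, into a single conditioning tensor for the diffusion denoiser.

```python
# Sketch only: building a style- and intensity-aware condition vector.
# Dimensions and module names are assumptions for illustration.
import torch
import torch.nn as nn

class StyleCondEncoder(nn.Module):
    """Map (style id, intensity) plus audio features into one condition tensor."""
    def __init__(self, n_styles=6, style_dim=64, audio_dim=1024, out_dim=256):
        super().__init__()
        self.style_emb = nn.Embedding(n_styles, style_dim)  # e.g. happy, sad, old, ...
        self.proj = nn.Linear(style_dim + audio_dim, out_dim)

    def forward(self, style_id, intensity, audio_feats):
        # Scale the style embedding by the requested intensity, then fuse it
        # with per-frame audio features (e.g. WavLM hidden states).
        style = self.style_emb(style_id) * intensity.unsqueeze(-1)        # (B, style_dim)
        style = style.unsqueeze(1).expand(-1, audio_feats.shape[1], -1)   # broadcast over frames
        return self.proj(torch.cat([style, audio_feats], dim=-1))         # (B, T, out_dim)

B, T = 2, 88                              # batch, audio frames
audio_feats = torch.randn(B, T, 1024)     # stand-in for WavLM embeddings
style_id = torch.tensor([1, 3])           # chosen style per sample
intensity = torch.tensor([1.0, 0.5])      # 1.0 = full style, 0.5 = attenuated

encoder = StyleCondEncoder()
cond = encoder(style_id, intensity, audio_feats)
print(cond.shape)  # torch.Size([2, 88, 256])
# The resulting `cond` would then condition the denoiser at every reverse step.
```

Scaling the style embedding by an intensity scalar is one simple way to expose continuous control over how strongly a style shapes the generated BVH motion.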