vall-e and VALL-E-X

Both tools are independent open-source PyTorch reproductions of Microsoft's zero-shot text-to-speech research. vall-e reimplements the original VALL-E model, while VALL-E-X reimplements VALL-E X, its cross-lingual extension, so they are closely related alternatives rather than reproductions of exactly the same model.

                vall-e          VALL-E-X
Score           51              46
Status          Established     Emerging
Maintenance     2/25            0/25
Adoption        10/25           10/25
Maturity        16/25           16/25
Community       23/25           20/25
Stars           2,207           7,954
Forks           334             781
Downloads       not listed      not listed
Commits (30d)   0               0
Language        Python          Python
License         Apache-2.0      MIT

Flags (vall-e): Stale 6m, No Package, No Dependents
Flags (VALL-E-X): Archived, Stale 6m, No Package, No Dependents
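
The listing does not state how the overall score is computed, but the figures above are consistent with it being the plain sum of the four category scores (each out of 25, so 100 in total). The sketch below checks that reading; the `Scorecard` class and its field names are illustrative assumptions, not part of the scoring site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scorecard:
    """Four category scores, each out of 25 (hypothetical structure)."""
    maintenance: int
    adoption: int
    maturity: int
    community: int

    def total(self) -> int:
        # Assumption: the overall score is the unweighted sum of the categories.
        return self.maintenance + self.adoption + self.maturity + self.community


# Category scores copied from the listing.
vall_e = Scorecard(maintenance=2, adoption=10, maturity=16, community=23)
vall_e_x = Scorecard(maintenance=0, adoption=10, maturity=16, community=20)

print(vall_e.total())    # 51, matches the listed score for vall-e
print(vall_e_x.total())  # 46, matches the listed score for VALL-E-X
```

Under this reading, the five-point gap comes from Maintenance (2 vs 0) and Community (23 vs 20); Adoption and Maturity are identical.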

About vall-e

lifeiteng/vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

This project helps create realistic, human-like speech from text. You provide written text and a short audio sample of a person's voice, and it generates that text spoken in the provided voice. This is useful for content creators, audiobook producers, or anyone needing to generate custom speech with specific speaker identities.

Topics: speech-synthesis, voice-cloning, audiobook-production, content-creation, narration

About VALL-E-X

Plachtaa/VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Supports multilingual synthesis in English, Chinese, and Japanese, with emotion and accent control from short acoustic prompts. Uses autoregressive acoustic token prediction with Vocos neural vocoding for audio reconstruction. Uses OpenAI's Whisper to transcribe prompt audio when no transcript is supplied, and includes Python APIs compatible with PyTorch 2.0+ on CUDA platforms.
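
As a rough illustration of the Python API mentioned above, the sketch below wraps a text-to-WAV call in one function. The names `utils.generation`, `preload_models`, `generate_audio`, and `SAMPLE_RATE` are recalled from the project's README and should be treated as assumptions to verify against the repository. The `synthesize` wrapper itself is hypothetical. The imports are deferred into the function because they only resolve when the code runs from a checkout of the repository with its dependencies installed.

```python
def synthesize(text: str, out_path: str = "vallex_generation.wav") -> str:
    """Generate speech for `text` and write it to `out_path` (hypothetical helper)."""
    # Assumed module and function names from the VALL-E-X README; check the repo.
    from utils.generation import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()                     # downloads and loads model checkpoints
    audio_array = generate_audio(text)   # waveform as a NumPy array
    write_wav(out_path, SAMPLE_RATE, audio_array)
    return out_path
```

Calling `synthesize("Hello, this is a test.")` from the repository root would, under these assumptions, write `vallex_generation.wav` to the current directory.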

Scores updated daily from GitHub, PyPI, and npm data.