vall-e and VALL-E-X
Both tools are independent PyTorch implementations of Microsoft's VALL-E line of zero-shot text-to-speech research, making them direct alternatives: lifeiteng/vall-e reproduces the original VALL-E model, while Plachtaa/VALL-E-X reproduces its multilingual successor, VALL-E X.
About vall-e
lifeiteng/vall-e
PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
This project generates realistic, human-like speech from text. Given written text and a short audio sample of a speaker's voice, it synthesizes the text in that speaker's voice. This is useful for content creators, audiobook producers, or anyone who needs to generate speech with a specific speaker identity.
About VALL-E-X
Plachtaa/VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
Supports multilingual synthesis across English, Chinese, and Japanese, with emotion and accent control from short acoustic prompts. Uses an autoregressive architecture that predicts acoustic tokens and reconstructs high-quality audio with the Vocos neural vocoder. Integrates OpenAI's Whisper to transcribe acoustic prompts when enrolling new voices, and exposes a Python API compatible with PyTorch 2.0+ on CUDA platforms.
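The Python API mentioned above can be sketched roughly as follows. This is a hedged example assuming the module layout shown in the VALL-E-X README (a `utils.generation` module exposing `preload_models`, `generate_audio`, and `SAMPLE_RATE`); the `synthesize` wrapper is a hypothetical helper, not part of the library.

```python
def synthesize(text: str, out_path: str = "vallex_out.wav") -> bool:
    """Try to synthesize `text` with VALL-E-X; return True on success.

    Hypothetical wrapper: assumes the import paths from the VALL-E-X
    README. Returns False when the repo's modules are not importable
    (i.e. when run outside a cloned VALL-E-X checkout).
    """
    try:
        # These imports only resolve inside a VALL-E-X checkout with
        # its requirements installed.
        from utils.generation import SAMPLE_RATE, generate_audio, preload_models
        from scipy.io.wavfile import write as write_wav
    except ImportError:
        return False

    preload_models()                 # loads (and may download) model checkpoints
    audio = generate_audio(text)     # array of PCM samples at SAMPLE_RATE
    write_wav(out_path, SAMPLE_RATE, audio)
    return True
```

Run this from inside a cloned VALL-E-X checkout with its requirements installed; `preload_models()` fetches the checkpoints on first use.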