ChatTTS and SpeechGPT
ChatTTS is a foundational speech synthesis model, while SpeechGPT is a client-side integration of ChatGPT with voice interaction. The two are complementary: SpeechGPT could potentially use ChatTTS's speech generation for its spoken output.
About ChatTTS
2noise/ChatTTS
A generative speech model for daily dialogue.
According to its README, ChatTTS is built on a transformer architecture trained on 100,000+ hours of multilingual audio. It offers fine-grained prosodic control through special tokens for laughter, pauses, and interjections, and supports multiple speakers via speaker embeddings. The model includes a discrete VAE encoder for zero-shot speaker inference and can generate audio in a streaming fashion. English and Chinese are supported, with additional languages planned.
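To make the idea of prosody control tokens concrete, here is a minimal sketch of how a prompt with inline markers might be assembled before being handed to the model. The helper `with_prosody` is hypothetical and not part of ChatTTS. The token spellings `[laugh]` and `[uv_break]` are assumptions taken from the project's README examples and may differ between releases, so check them against the version you install.

```python
# Hypothetical helper: builds a prompt string with inline prosody markers.
# Token spellings are ASSUMED from ChatTTS README examples, not guaranteed.
LAUGH_TOKEN = "[laugh]"
PAUSE_TOKEN = "[uv_break]"


def with_prosody(parts):
    """Join (kind, value) pairs into one prompt string.

    kind is "text" (value is the fragment), or "laugh" / "pause"
    (value is ignored and the matching marker is inserted).
    """
    markers = {"laugh": LAUGH_TOKEN, "pause": PAUSE_TOKEN}
    out = []
    for kind, value in parts:
        if kind == "text":
            out.append(value.strip())
        elif kind in markers:
            out.append(markers[kind])
        else:
            raise ValueError(f"unknown part kind: {kind!r}")
    return " ".join(out)


prompt = with_prosody([
    ("text", "That is honestly"),
    ("pause", None),
    ("text", "the funniest thing I have heard"),
    ("laugh", None),
])
print(prompt)

# Passing the prompt to the model is left as a comment because it needs the
# ChatTTS package and model weights. The calls below are assumed from the
# README and should be verified before use:
#   import ChatTTS
#   chat = ChatTTS.Chat()
#   chat.load()
#   wavs = chat.infer([prompt])
```

Keeping marker insertion in a small helper like this means the token spellings live in one place and are easy to update if the model's vocabulary changes.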
About SpeechGPT
Jdka1/SpeechGPT
Free ChatGPT voice interaction and integration into Python workflows.