ChatTTS and xiaogpt

A generative speech model can complement an AI speaker interface by supplying the speech synthesis the speaker uses to vocalize responses from a large language model.

ChatTTS (Verified)
Maintenance: 10/25
Adoption: 21/25
Maturity: 25/25
Community: 20/25

xiaogpt (Established)
Maintenance: 10/25
Adoption: 10/25
Maturity: 25/25
Community: 22/25

ChatTTS
Stars: 38,924
Forks: 4,223
Downloads: 6,452
Commits (30d): 0
Language: Python
License: AGPL-3.0

xiaogpt
Stars: 6,796
Forks: 936
Downloads: not available
Commits (30d): 0
Language: Python
License: MIT

No risk flags

About ChatTTS

2noise/ChatTTS

A generative speech model for daily dialogue.

Built on a transformer architecture trained on 100,000+ hours of multilingual audio, ChatTTS enables fine-grained prosodic control through special tokens for laughter, pauses, and interjections, and supports multiple speakers via speaker embeddings. The model includes a discrete VAE encoder for zero-shot speaker inference and can generate audio in streaming mode. It supports English and Chinese, with additional languages planned.
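To make the special-token idea concrete, here is a minimal sketch of how a script might be annotated with inline prosody tokens before synthesis. The token strings ([laugh], [uv_break], [lbreak]) are assumptions recalled from the ChatTTS README and should be checked against the installed version; the annotate helper and its cue names are hypothetical and not part of the library.

```python
# Token strings are assumptions recalled from the ChatTTS README;
# verify them against the version you install.
LAUGH = "[laugh]"
SHORT_BREAK = "[uv_break]"
LONG_BREAK = "[lbreak]"

# Hypothetical cue names mapped to control tokens (None means no token).
CUES = {"laugh": LAUGH, "pause": SHORT_BREAK, "end": LONG_BREAK, None: ""}


def annotate(segments):
    """Join (text, cue) pairs into one string with inline control tokens."""
    parts = []
    for text, cue in segments:
        if cue not in CUES:
            raise ValueError(f"unknown cue: {cue!r}")
        # Append the token after the text; strip handles the no-token case.
        parts.append(f"{text.strip()} {CUES[cue]}".strip())
    return " ".join(parts)


script = annotate([
    ("So I tried the new speaker", "pause"),
    ("and it answered back", "laugh"),
    ("Not bad.", "end"),
])
print(script)
```

The resulting string would then be handed to the model's inference call. The README describes this as roughly chat.infer([script]) after loading the model, but the exact invocation and any option for skipping automatic text refinement should be confirmed there.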

About xiaogpt

yihong0618/xiaogpt

Play ChatGPT and other LLM with Xiaomi AI Speaker

xiaogpt supports multiple LLM backends (ChatGPT, Gemini, Claude, local Llama3, etc.) with pluggable TTS engines (Edge, OpenAI, Azure, Fish Audio). It integrates with Xiaomi's MiService SDK to authenticate and control speakers via the Mina protocol, with optional LangChain support for web search and advanced reasoning tasks. Configuration is done through YAML/JSON files or CLI arguments, and a streaming mode provides real-time conversational responsiveness.
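The file-plus-CLI configuration pattern described above can be sketched as follows. This is an illustration of the general technique only, not xiaogpt's actual loader: the key names (bot, tts, stream), the defaults, and the resolve_config function are all hypothetical, and only JSON is handled so the example stays within the standard library.

```python
import argparse
import json

# Hypothetical defaults; the real project's option names may differ.
DEFAULTS = {"bot": "chatgpt", "tts": "edge", "stream": False}


def build_parser():
    parser = argparse.ArgumentParser(description="illustrative config loader")
    parser.add_argument("--config", help="path to a JSON config file")
    parser.add_argument("--bot")
    parser.add_argument("--tts")
    # default=None lets us tell "flag absent" apart from "flag set".
    parser.add_argument("--stream", action="store_true", default=None)
    return parser


def resolve_config(argv, file_text=None):
    """Merge defaults, then file values, then explicit CLI overrides."""
    args = build_parser().parse_args(argv)
    config = dict(DEFAULTS)
    if file_text is None and args.config:
        with open(args.config, encoding="utf-8") as fh:
            file_text = fh.read()
    if file_text:
        config.update(json.loads(file_text))
    for key in ("bot", "tts", "stream"):
        value = getattr(args, key)
        if value is not None:  # only options given on the CLI override
            config[key] = value
    return config


example = resolve_config(
    ["--tts", "openai", "--stream"],
    file_text='{"bot": "gemini", "tts": "azure"}',
)
print(example)
```

Letting CLI arguments win over file values is one common precedence choice; whether xiaogpt resolves conflicts the same way should be checked in its README.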

Related comparisons

ChatTTS and SpeechGPT
ChatTTS and BanterBot
ChatTTS and NAOChat
ChatTTS and voice-chatgpt-python
ChatTTS and gpt-home

Scores updated daily from GitHub, PyPI, and npm data.