ChatTTS and xiaogpt

A generative speech model such as ChatTTS can complement an AI speaker interface such as xiaogpt: the speech model synthesizes the audio that the speaker uses to voice responses from a large language model.

ChatTTS: 76 (Verified)
Maintenance 10/25
Adoption 21/25
Maturity 25/25
Community 20/25
Stars: 38,924
Forks: 4,223
Downloads: 6,452
Commits (30d): 0
Language: Python
License: AGPL-3.0
No risk flags

xiaogpt: 67 (Established)
Maintenance 10/25
Adoption 10/25
Maturity 25/25
Community 22/25
Stars: 6,796
Forks: 936
Downloads: not listed
Commits (30d): 0
Language: Python
License: MIT
No risk flags

About ChatTTS

2noise/ChatTTS

A generative speech model for daily dialogue.

ChatTTS is built on a transformer architecture and trained on 100,000+ hours of multilingual audio. It offers fine-grained prosodic control through special tokens for laughter, pauses, and interjections, and supports multiple speakers via speaker embeddings. The model includes a discrete VAE encoder for zero-shot speaker inference and supports streaming audio generation. It currently supports English and Chinese, with additional languages planned.
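
The token-based prosody control described above can be sketched in Python. This is a minimal illustration, not the project's own code: `mark_prosody` is a hypothetical helper, and the token spellings (`[laugh]`, `[uv_break]`, `[lbreak]`) and the `ChatTTS.Chat().load(...)` / `infer(...)` calls follow the examples in the ChatTTS README as best recalled, so check them against the version you install.

```python
import re

# Control tokens as used in the ChatTTS README examples (assumed spellings;
# verify against the installed version).
LAUGH = "[laugh]"
SHORT_PAUSE = "[uv_break]"
LINE_BREAK = "[lbreak]"


def mark_prosody(text, laugh_after=()):
    """Hypothetical helper: add a short pause after each comma, a laugh
    token after each word in `laugh_after`, and a closing break token."""
    marked = re.sub(r",\s*", f", {SHORT_PAUSE} ", text)
    for word in laugh_after:
        marked = re.sub(rf"\b{re.escape(word)}\b", f"{word} {LAUGH}", marked)
    return marked.strip() + LINE_BREAK


def synthesize(texts):
    """Sketch of inference; requires the ChatTTS package and model weights."""
    import ChatTTS  # imported lazily so mark_prosody works without it

    chat = ChatTTS.Chat()
    chat.load(compile=False)  # assumed signature, taken from the README
    return chat.infer(texts)  # one waveform per input text


if __name__ == "__main__":
    print(mark_prosody("Well, that was funny", laugh_after=("funny",)))
```

Keeping the text markup separate from `synthesize` lets the token placement be checked without downloading the model.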

About xiaogpt

yihong0618/xiaogpt

Play ChatGPT and other LLM with Xiaomi AI Speaker

xiaogpt supports multiple LLM backends (ChatGPT, Gemini, Claude, local Llama3, etc.) with pluggable TTS engines (Edge, OpenAI, Azure, Fish Audio) and a streaming mode for real-time conversational responsiveness. It integrates with Xiaomi's MiService SDK to authenticate and control speakers via the Mina protocol, with optional LangChain support for web search and advanced reasoning tasks. It is configured through YAML/JSON files or CLI arguments.
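
As a rough illustration of file-based configuration, the sketch below builds a JSON config in Python. The key names (`hardware`, `account`, `password`, `openai_key`, `tts`, `stream`, `mute_xiaoai`) and the example values are assumptions modelled on the project's example config and may not match the current release; `build_config` and `write_config` are hypothetical helpers, not part of xiaogpt.

```python
import json


def build_config(hardware, account, password, openai_key,
                 tts="edge", stream=True):
    """Hypothetical helper returning a config dict.
    All key names are assumptions; compare with the project's example file."""
    return {
        "hardware": hardware,      # speaker model identifier
        "account": account,        # Xiaomi account used by MiService
        "password": password,
        "openai_key": openai_key,  # credential for the LLM backend
        "tts": tts,                # which TTS engine to use
        "stream": stream,          # enable streaming responses
        "mute_xiaoai": True,       # silence the speaker's built-in reply
    }


def write_config(config, path):
    """Serialize the config as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    cfg = build_config("LX06", "user@example.com", "secret", "sk-placeholder")
    write_config(cfg, "xiao_config.json")
```

The resulting file would then be passed to the tool on the command line; consult the xiaogpt README for the exact flag and for the full list of supported keys.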

Scores updated daily from GitHub, PyPI, and npm data.