FlashLabs-AI-Corp/FlashLabs-Chroma
World's first open-source, real-time, end-to-end spoken dialogue model with personalized voice cloning.
Built on a hybrid architecture that combines Qwen2.5-Omni for reasoning with a Llama3-based backbone and decoder layers, Chroma processes raw audio directly and generates synchronized text and speech outputs using the Mimi codec at a 24 kHz sampling rate. The model supports zero-shot voice cloning by conditioning generation on reference audio prompts, which enables style transfer without task-specific fine-tuning. It integrates with Hugging Face transformers for PyTorch inference, with support for bfloat16 precision.
Stars: 545
Forks: 59
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Jan 28, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/FlashLabs-AI-Corp/FlashLabs-Chroma"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
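The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the URL path follows a category/owner/repo pattern (inferred from the single example above) and that the endpoint returns JSON; neither is confirmed by this page. The names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path layout is <category>/<owner>/<repo>,
    inferred from the one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. The schema is not
    documented here, so the parsed object is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("voice-ai", "FlashLabs-AI-Corp", "FlashLabs-Chroma"))
```

Because the keyless tier is limited to 100 requests/day, a script that polls many repositories would likely want to cache responses locally rather than refetch on every run.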
Related tools
OpenBMB/VoxCPM
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
IAHispano/Applio
A simple, high-quality voice conversion tool focused on ease of use and performance.
JackismyShephard/ultimate-rvc
An app for creating audio-based content such as song covers and speech using Retrieval-based...
codename0og/codename-rvc-fork-4
Codename's rvc fork version 4, based on Applio.
ArkanDash/Advanced-RVC-Inference
Advanced RVC Inference for quicker and effortless model downloads