Blaizzy/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Supports multilingual voice cloning and style transfer through models such as CSM and Ming Omni, with adjustable quantization (3-8 bit) for memory-efficient inference. Provides streaming audio generation, an OpenAI-compatible REST API, and a Swift package for native iOS/macOS integration. Also includes speaker diarization, forced alignment, and multimodal audio understanding across 10+ supported TTS/STT architectures.
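To give a rough sense of why the 3-8 bit quantization range matters for memory, the arithmetic can be sketched as parameters times bits per weight, divided by eight. This is a back-of-the-envelope estimate only: the function name and the 1-billion-parameter model are hypothetical, and real quantized checkpoints also store per-group scales and offsets, so actual files are somewhat larger.

```python
def approx_weight_bytes(num_params: int, bits_per_weight: int) -> float:
    """Rough size of the quantized weights alone, in bytes.

    Assumption: every weight is stored at the same bit width and the
    per-group scale/offset overhead is ignored.
    """
    if not 3 <= bits_per_weight <= 8:
        raise ValueError("this sketch only covers the 3-8 bit range")
    return num_params * bits_per_weight / 8


if __name__ == "__main__":
    params = 1_000_000_000  # hypothetical model size, not a real mlx-audio model
    fp16_bytes = params * 16 / 8  # 16-bit baseline for comparison
    for bits in range(3, 9):
        size = approx_weight_bytes(params, bits)
        print(f"{bits}-bit: {size / 1e9:.3f} GB ({size / fp16_bytes:.0%} of 16-bit)")
```

Under these assumptions, a 4-bit model needs about a quarter of the memory of its 16-bit counterpart for the weights themselves.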
6,227 stars and 79,203 monthly downloads. Used by 4 other packages. Actively maintained with 157 commits in the last 30 days. Available on PyPI.
Stars: 6,227
Forks: 486
Language: Python
License: MIT
Category:
Last pushed: Mar 12, 2026
Monthly downloads: 79,203
Commits (30d): 157
Dependencies: 12
Reverse dependents: 4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Blaizzy/mlx-audio"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
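The same request can be made from Python. The sketch below uses only the standard library and rests on two assumptions that the page does not confirm: that other projects follow the same `category/owner/repo` URL pattern as the curl example, and that the response body is JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a category and an owner/name repo slug.

    Assumption: other projects use the same category/owner/repo layout.
    """
    # Keep the "/" between owner and name; percent-encode anything unusual.
    return f"{BASE_URL}/{quote(category, safe='')}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON (assumed response format)."""
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to send the request.
    print(build_quality_url("voice-ai", "Blaizzy/mlx-audio"))
```

Calling `fetch_quality("voice-ai", "Blaizzy/mlx-audio")` sends the request. With the keyless limit of 100 requests per day, it is worth caching the result rather than fetching on every run.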
Related tools
fishaudio/fish-speech
SOTA Open Source TTS
lenML/Speech-AI-Forge
🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server...
mlalma/kokoro-ios
Kokoro TTS for iOS and macOSX
sidharthrajaram/StyleTTS2
🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning
mlalma/KokoroTestApp
Test application for Kokoro TTS model