Blaizzy/mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

93 / 100

Verified

Supports multilingual voice cloning and style transfer through models such as CSM and Ming Omni, with adjustable quantization (3-8 bit) for memory-efficient inference. Provides streaming audio generation, an OpenAI-compatible REST API, and a Swift package for native iOS/macOS integration. Also includes speaker diarization, forced alignment, and multimodal audio understanding across 10+ supported TTS/STT architectures.
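As a rough illustration of what "OpenAI-compatible" implies for a client, the sketch below builds (but does not send) a request in the shape of OpenAI's `/v1/audio/speech` endpoint. The host, port, model name, and voice name are placeholders chosen for illustration, not values documented by mlx-audio, and whether the server accepts exactly these fields is an assumption to check against the project's own documentation.

```python
import json
from urllib.request import Request

# Hypothetical base URL: host and port are placeholders, not documented defaults.
BASE_URL = "http://localhost:8000"


def build_speech_request(text, model, voice, base_url=BASE_URL):
    """Build (without sending) a POST request shaped like OpenAI's /v1/audio/speech."""
    # Field names follow the OpenAI speech API; support for them is assumed here.
    payload = {"model": model, "input": text, "voice": voice}
    return Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Model and voice names below are illustrative placeholders.
req = build_speech_request(
    "Hello from Apple Silicon.",
    model="example-tts-model",
    voice="example-voice",
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Passing the request to `urllib.request.urlopen` would send it to a running server. Keeping construction separate from sending makes the payload easy to inspect first.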

6,227 stars and 79,203 monthly downloads. Used by 4 other packages. Actively maintained with 157 commits in the last 30 days. Available on PyPI.

Maintenance 25 / 25

Adoption 24 / 25

Maturity 25 / 25

Community 19 / 25


Stars 6,227

Forks 486

Language Python

License MIT

Featured in

Things AI Won't Tell You About Building a Voice App

Choosing a Voice AI Library in 2026: What's Actually Worth Building On

Related tools

fishaudio/fish-speech

SOTA Open Source TTS

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server...

mlalma/kokoro-ios

Kokoro TTS for iOS and macOSX

sidharthrajaram/StyleTTS2

🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning

mlalma/KokoroTestApp

Test application for Kokoro TTS model
