Blaizzy/mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

93 / 100 (Verified)

Supports multilingual voice cloning and style transfer through models such as CSM and Ming Omni, with adjustable quantization (3- to 8-bit) for memory-efficient inference on Apple Silicon. Built on the MLX framework, it offers streaming audio generation, an OpenAI-compatible REST API, and a Swift package for native iOS/macOS integration. It also includes speaker diarization, forced alignment, and multimodal audio understanding across 10+ supported TTS/STT architectures.
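As a rough illustration of what "OpenAI-compatible" could mean in practice, the sketch below builds a text-to-speech request in the shape of OpenAI's `/v1/audio/speech` route (a JSON body with `model`, `input`, and `voice`). It assumes a locally running server exposes that same route. The base URL, model id, and voice name are placeholders, not values confirmed by this listing, and `build_speech_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_speech_request(base_url: str, model: str, text: str, voice: str) -> urllib.request.Request:
    """Build (but do not send) a POST request shaped like OpenAI's speech API.

    Assumption: the server mirrors OpenAI's /v1/audio/speech route and
    accepts the same JSON fields. Check the project's docs before relying on it.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_speech_request(
    base_url="http://localhost:8000",
    model="example-tts-model",
    text="Hello from Apple Silicon.",
    voice="example-voice",
)
```

Passing `request` to `urllib.request.urlopen` would send it. If the server behaves like OpenAI's endpoint, the response body would be the audio bytes, which can be written straight to a file.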

6,227 stars and 79,203 monthly downloads. Used by 4 other packages. Actively maintained with 157 commits in the last 30 days. Available on PyPI.

Maintenance 25 / 25
Adoption 24 / 25
Maturity 25 / 25
Community 19 / 25


Stars: 6,227
Forks: 486
Language: Python
License: MIT
Last pushed: Mar 12, 2026
Monthly downloads: 79,203
Commits (30d): 157
Dependencies: 12
Reverse dependents: 4

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Blaizzy/mlx-audio"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
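The curl request above can also be issued from Python using only the standard library. This is a minimal sketch: the URL pattern is inferred from the single example shown, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical. How an API key is supplied is not stated here, so the sketch covers keyless access only.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; treating the last two segments as
# "category" and "owner/repo" is an assumption drawn from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a category and an owner/name repository."""
    # Keep the slash between owner and repository name; escape anything else.
    return f"{BASE_URL}/{quote(category, safe='')}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Request the endpoint and decode the body, assuming it is JSON."""
    request = urllib.request.Request(
        build_quality_url(category, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "Blaizzy/mlx-audio")` sends the same GET request as the curl command. With the keyless limit of 100 requests/day, caching the result locally is a sensible choice for anything that polls.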