Blaizzy/mlx-audio-swift
A modular Swift SDK for audio processing with MLX on Apple Silicon
Provides audio AI capabilities spanning text-to-speech, speech-to-text, voice activity detection, speaker diarization, and speech enhancement, all running via MLX inference on Apple Silicon. It is built as composable Swift packages with streaming support and automatic HuggingFace model loading, integrates codecs (SNAC, Encodec, Vocos), and supports multiple model families (Qwen3, Fish Audio, Soprano, Voxtral, Sortformer) with native async/await APIs.
Stars: 446
Forks: 56
Language: Swift
License: MIT
Category:
Last pushed: Mar 17, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Blaizzy/mlx-audio-swift"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Related tools
k2-fsa/sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn...
FluidInference/FluidAudio
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity...
phuc-nt/my-translator
Real-time speech translation — macOS & Windows, free TTS, no server, your API keys only
pot-app/pot-desktop
🌈 A cross-platform application for selection-based text translation and OCR.
soniqo/speech-swift
AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered...