FluidInference/FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.


Established

Inference offloads to the Apple Neural Engine (ANE), which keeps CPU and GPU usage low and conserves battery on always-on workloads. The library includes streaming ASR with end-of-utterance detection, inverse text normalization for post-processing, and both online and offline speaker diarization pipelines with advanced clustering. All models are open source (MIT or Apache 2.0) and hosted on Hugging Face. They support 25 languages for transcription and 9 for TTS, with straightforward Swift integration.

1,689 stars. Actively maintained with 90 commits in the last 30 days.

No Package

No Dependents

Maintenance 25 / 25

Adoption 10 / 25

Maturity 9 / 25

Community 21 / 25


Stars 1,689

Forks 214

Language Swift

License Apache-2.0


Related tools

k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn...

phuc-nt/my-translator

Real-time speech translation — macOS & Windows, free TTS, no server, your API keys only

pot-app/pot-desktop

🌈 A cross-platform text-selection translation and OCR application.

Blaizzy/mlx-audio-swift

A modular Swift SDK for audio processing with MLX on Apple Silicon

soniqo/speech-swift

AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered...
