FluidInference/FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.

Score: 65 / 100 (Established)

Inference is offloaded to the Apple Neural Engine (ANE), which keeps CPU and GPU usage low and improves battery life on always-on workloads. The library includes streaming ASR with end-of-utterance detection, inverse text normalization for post-processing, and both online and offline speaker diarization pipelines with advanced clustering. All models are open source (MIT/Apache 2.0) and hosted on HuggingFace, supporting 25 languages for transcription and 9 for TTS, with straightforward Swift integration.

1,689 stars. Actively maintained with 90 commits in the last 30 days.

No package, no dependents
Maintenance 25 / 25
Adoption 10 / 25
Maturity 9 / 25
Community 21 / 25
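
The four sub-scores above add up to the headline score. The page does not state the formula, so treating the total as a plain sum of four equally weighted categories is an assumption that happens to match the numbers shown. A minimal sketch of that arithmetic:

```python
# Sub-scores as listed on the page, each out of 25.
# Assumption: the overall score is the unweighted sum of the four categories.
subscores = {"Maintenance": 25, "Adoption": 10, "Maturity": 9, "Community": 21}
MAX_PER_CATEGORY = 25

total = sum(subscores.values())                 # 25 + 10 + 9 + 21
maximum = MAX_PER_CATEGORY * len(subscores)     # 4 categories x 25
weakest = min(subscores, key=subscores.get)     # category with the lowest score

print(f"{total} / {maximum}")  # prints "65 / 100"
print(f"Lowest category: {weakest}")
```

Under this reading, Maturity and Adoption account for most of the gap to 100, while Maintenance is already at its maximum.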

Stars: 1,689
Forks: 214
Language: Swift
License: Apache-2.0
Last pushed: Mar 18, 2026
Commits (30d): 90

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/FluidInference/FluidAudio"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
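
For scripted use, the same request can be made from Python's standard library. This is a sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single curl example above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical. The page does not say how an API key is passed, so the sketch covers only the keyless case.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path layout inferred from one example)."""
    # Percent-encode each segment so unusual names cannot break the path.
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON (assumed response format)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "FluidInference", "FluidAudio")` issues the same request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result locally rather than fetching on every run.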