Mobile-Artificial-Intelligence/babylon
Babylon.cpp is a C and C++ library for grapheme-to-phoneme conversion and text-to-speech synthesis. Phonemization uses an ONNX Runtime port of the DeepPhonemizer model, and speech synthesis uses VITS models. Piper models are compatible once they have been run through a conversion script.
Babylon.cpp supports Linux, macOS, Windows, and Android, and runs all inference through ONNX Runtime so that it works offline. It provides three access layers: C/C++ library APIs, a CLI tool with phonemization and synthesis subcommands, and an embeddable REST server with a web UI. Together these cover integration targets from embedded systems to web services. Alongside VITS, it includes the Kokoro engine (24 kHz output, 54+ voices), and pronunciation is backed by a 130k-entry dictionary.
Stars: 30
Forks: 3
Language: Python
License: MIT
Category:
Last pushed: Mar 09, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Mobile-Artificial-Intelligence/babylon"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
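The curl request above can also be issued from a script. The following is a minimal Python sketch, not an official client. It assumes the endpoint returns JSON (the response format is not documented on this page) and that the URL path follows the category/owner/repo pattern seen in the curl example. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment.

    Assumption: the path layout is <category>/<owner>/<repo>, as in the
    curl example (voice-ai/Mobile-Artificial-Intelligence/babylon).
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the response body is UTF-8 encoded JSON. If it is not,
    json.loads raises ValueError and the caller should handle it.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "Mobile-Artificial-Intelligence", "babylon")` sends the same request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than calling the endpoint repeatedly.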
Higher-rated alternatives
pnnbao97/VieNeu-TTS
Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio...
r9y9/nnmnkwii
Library to build speech synthesis systems designed for easy and fast prototyping.
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Softcatala/open-dubbing
Open dubbing is an AI dubbing system which uses machine learning models to automatically...
babysor/MockingBird
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time