holgern/kokorog2p
A unified multi-language G2P (Grapheme-to-Phoneme) library for Kokoro TTS.
Supports 11 languages through a hybrid approach combining large dictionary lookups (179k+ entries for English) with rule-based phonemization and espeak-ng fallback for out-of-vocabulary words. Includes optional language detection for mixed-language text preprocessing, configurable memory footprint via selective dictionary loading, and automatic punctuation normalization with context-aware abbreviation expansion using spaCy POS tagging.
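The lookup order described above (dictionary first, then rules, then a fallback for out-of-vocabulary words) can be illustrated with a small sketch. This is not kokorog2p's API: the names `LEXICON`, `LETTER_RULES`, `rule_based`, and `phonemize_word` are hypothetical, the lexicon and rules are toy data, and the `fallback` callable merely stands in for an external engine such as espeak-ng.

```python
from typing import Callable, Optional, Tuple

# Toy lexicon (hypothetical data); a real dictionary would hold many thousands of entries.
LEXICON = {"hello": "həlˈoʊ", "world": "wˈɜːld"}

# Toy letter-to-sound rules (hypothetical); they cover only a few letters on purpose.
LETTER_RULES = {"b": "b", "a": "æ", "t": "t", "s": "s"}


def rule_based(word: str) -> Optional[str]:
    """Apply the toy rules; return None if any letter has no rule."""
    phonemes = []
    for ch in word:
        if ch not in LETTER_RULES:
            return None
        phonemes.append(LETTER_RULES[ch])
    return "".join(phonemes)


def phonemize_word(word: str, fallback: Callable[[str], str]) -> Tuple[str, str]:
    """Return (phonemes, source), trying dictionary, then rules, then fallback."""
    key = word.lower()
    if key in LEXICON:
        return LEXICON[key], "dictionary"
    ruled = rule_based(key)
    if ruled is not None:
        return ruled, "rules"
    # Out-of-vocabulary and not covered by rules: defer to the external engine.
    return fallback(key), "fallback"


def fake_engine(word: str) -> str:
    """Placeholder for an external phonemizer; it only marks the word."""
    return f"<{word}>"


if __name__ == "__main__":
    for w in ["Hello", "bat", "xyz"]:
        print(w, phonemize_word(w, fake_engine))
```

Returning the source alongside the phonemes makes it easy to see how often the fallback is hit for a given text, which is one way to judge whether a larger dictionary is worth its memory cost.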
3 stars and 1,700 monthly downloads. Used by 2 other packages. Available on PyPI.
Stars: 3
Forks: —
Language: Python
License: Apache-2.0
Category: (none listed)
Last pushed: Feb 12, 2026
Monthly downloads: 1,700
Commits (30d): 0
Reverse dependents: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/holgern/kokorog2p"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
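The same request can be made from Python with the standard library. This is a minimal sketch under assumptions: the reading of the path segments as category, owner, and repository is inferred from the URL above, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, since its schema is not documented here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the segment meanings are an assumption."""
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network call; keyless access is limited to 100 requests/day.
    print(fetch_quality("voice-ai", "holgern", "kokorog2p"))
```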
Higher-rated alternatives
thewh1teagle/phonikud
Hebrew grapheme to phoneme (G2P)
GitYCC/g2pW
Chinese Mandarin Grapheme-to-Phoneme Converter. 中文轉注音或拼音 (INTERSPEECH 2022)
Wikidepia/g2p-id
Indonesian Grapheme-to-Phoneme (IPA notation)
stefantaubert/pinyin-to-ipa
Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to...
pnnbao97/sea-g2p
Fast multilingual text-to-phoneme converter for South East Asian languages.