NickZaitsev/ru-normalizr
ru-normalizr is described by its author as the best open-source normalizer for Russian text. It converts numbers, dates, times, abbreviations, Roman numerals, symbols, and Latin-script words into Russian letters for use in TTS and NLP.
The library implements rule-based morphological transformation in a modular pipeline, so normalization of numerals, dates, abbreviations, and transliteration can be enabled selectively. It preserves grammatical correctness rather than relying on simple dictionary substitution. It ships as a PyPI package with CLI and GUI interfaces and optional IPA-backed latinization with stress marking, and it is designed for TTS and NLP preprocessing with no GPU dependency.
Available on PyPI.
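To make the idea of a modular, grammar-aware pipeline concrete, here is a minimal Python sketch. It is not ru-normalizr's API: every name in it (`STAGES`, `normalize`, `expand_kilograms`, `spell_numbers`) is hypothetical, and it covers only one abbreviation and masculine nominative numerals below 100. It shows why plain dictionary substitution is not enough: the expanded noun has to agree with the number in front of it.

```python
import re

# Hypothetical toy pipeline; none of these names come from ru-normalizr.
UNITS = ["ноль", "один", "два", "три", "четыре",
         "пять", "шесть", "семь", "восемь", "девять"]
TEENS = ["десять", "одиннадцать", "двенадцать", "тринадцать", "четырнадцать",
         "пятнадцать", "шестнадцать", "семнадцать", "восемнадцать", "девятнадцать"]
TENS = ["двадцать", "тридцать", "сорок", "пятьдесят",
        "шестьдесят", "семьдесят", "восемьдесят", "девяносто"]

# Forms of "килограмм" used after 1, after 2-4, and after 5 or more.
KG_FORMS = ("килограмм", "килограмма", "килограммов")


def plural_form(n, forms):
    """Pick the Russian noun form that agrees with the integer n."""
    if n % 10 == 1 and n % 100 != 11:
        return forms[0]
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return forms[1]
    return forms[2]


def spell_under_100(n):
    """Spell 0..99 in masculine nominative."""
    if n < 10:
        return UNITS[n]
    if n < 20:
        return TEENS[n - 10]
    tens, unit = divmod(n, 10)
    word = TENS[tens - 2]
    return word if unit == 0 else f"{word} {UNITS[unit]}"


def expand_kilograms(text):
    """Stage: expand 'кг' into the form that agrees with the preceding number."""
    def repl(match):
        n = int(match.group(1))
        return f"{n} {plural_form(n, KG_FORMS)}"
    return re.sub(r"(\d+)\s*кг\b", repl, text)


def spell_numbers(text):
    """Stage: spell out integers below 100; larger numbers are left untouched."""
    def repl(match):
        n = int(match.group(0))
        return spell_under_100(n) if n < 100 else match.group(0)
    return re.sub(r"\d+", repl, text)


# Registry of stages; callers choose which ones to run, in order.
STAGES = {"units": expand_kilograms, "numbers": spell_numbers}


def normalize(text, stages=("units", "numbers")):
    for name in stages:
        text = STAGES[name](text)
    return text
```

Under these assumptions, `normalize("Вес 21 кг")` yields `"Вес двадцать один килограмм"`, while `normalize("2 кг", stages=("units",))` runs only the abbreviation stage and yields `"2 килограмма"`.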
Stars: 8
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Mar 16, 2026
Commits (30d): 0
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/NickZaitsev/ru-normalizr"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
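The same request can be made from Python using only the standard library. This is a sketch under assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the URL layout is inferred from the single curl example above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the path layout is inferred from it.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, repo):
    """Build the endpoint URL for a repository given as 'owner/name'."""
    # quote() leaves "/" unescaped by default, so "owner/name" stays two path segments.
    return f"{API_BASE}/{quote(category)}/{quote(repo)}"


def fetch_quality(category, repo, timeout=10.0):
    """Fetch the record for one repository (assumes the body is JSON)."""
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("voice-ai", "NickZaitsev/ru-normalizr")` issues the same GET request as the curl command and counts against the daily request limit.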
Higher-rated alternatives
speechio/chinese_text_normalization
Chinese text normalization for speech processing
repodiac/german_transliterate
Python module to clean and transliterate (i.e. normalize) German text including abbreviations,...
gladiaio/normalization
A lightweight library for normalizing speech transcripts before computing WER
google-research-datasets/TextNormalizationCoveringGrammars
Covering grammars for English and Russian text normalization
34j/mecab-text-cleaner
Simple Python package (CLI/Python API) for getting Japanese readings (yomigana) and accents using MeCab.