kurianbenoy/whisper_normalizer
A Python package implementing the Whisper text normalizer
Implements OpenAI's Whisper normalization algorithm for standardizing ASR transcription output, reducing spurious WER/CER penalties from formatting differences. Provides specialized normalizers for English and Indic languages (Malayalam, Hindi, Tamil, etc.), addressing limitations where basic normalization degrades low-resource language text. Derives Indic logic from indic-nlp-library with extended Malayalam support for script-specific canonicalization.
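To make the point about spurious WER penalties concrete, here is a minimal, self-contained sketch. It does not use this package or reproduce Whisper's actual rules: `simple_normalize` and `word_error_rate` are hypothetical helpers written for illustration, and the normalizer only lowercases, strips punctuation, and collapses whitespace.

```python
import re
import unicodedata


def simple_normalize(text: str) -> str:
    """Hypothetical stand-in for a basic normalizer (NOT the package's code)."""
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace anything that is neither a word character nor whitespace.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)


reference = "Hello, world! How are you?"
hypothesis = "hello world how are you"

raw_wer = word_error_rate(reference, hypothesis)
normalized_wer = word_error_rate(simple_normalize(reference),
                                 simple_normalize(hypothesis))
print(raw_wer)         # 0.8: four of five words differ only in case or punctuation
print(normalized_wer)  # 0.0
```

A character filter this naive is also a plausible illustration of the low-resource problem mentioned above: in Python, `\w` does not match combining vowel signs, so Indic words would be split apart by the same regex. That is the kind of degradation language-specific normalizers are meant to avoid.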
76 stars and 414,981 monthly downloads. Used by 2 other packages. No commits in the last 6 months. Available on PyPI.
Stars: 76
Forks: 17
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Oct 06, 2025
Monthly downloads: 414,981
Commits (30d): 0
Dependencies: 3
Reverse dependents: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/kurianbenoy/whisper_normalizer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
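The same request can be sketched in Python using only the standard library. The URL is the one shown above; the response format is not documented here, so the sketch assumes JSON and falls back to raw text. `fetch_quality_data` is a hypothetical helper name, not part of any published client.

```python
import json
import urllib.request

# URL copied from the curl example above.
API_URL = ("https://pt-edge.onrender.com/api/v1/quality/voice-ai/"
           "kurianbenoy/whisper_normalizer")


def fetch_quality_data(url: str = API_URL, timeout: float = 10.0):
    """Fetch the endpoint; return parsed JSON if possible, else the raw body.

    Assumption: the endpoint returns JSON. That is not confirmed by the page.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body
```

Calling `fetch_quality_data()` performs one unauthenticated request, which counts against the 100 requests/day limit noted above.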
Related tools
collabora/WhisperLive
A nearly-live implementation of OpenAI's Whisper.
Softcatala/whisper-ctranslate2
Whisper command line client compatible with original OpenAI client based on CTranslate2.
Kieirra/murmure
Fully local, private and cross platform Speech-to-Text with LLM Post-processing
ahmetoner/whisper-asr-webservice
OpenAI Whisper ASR Webservice API
pavelzbornik/whisperX-FastAPI
FastAPI service on top of WhisperX