gladiaio/normalization

A lightweight library for normalizing speech transcripts before computing WER

Score: 41 / 100 (Emerging)

Implements a **three-stage deterministic pipeline** (text pre-processing → word processing → text post-processing) where steps are declaratively composed in YAML presets, with built-in language packs for English and French. Handles domain-specific transformations like number-to-word conversion, contraction expansion, and time/currency formatting through registered step classes that protect and restore placeholders across stages. Designed for ASR benchmarking workflows where WER computation requires canonical forms of semantically equivalent transcriptions.
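The three-stage flow and the protect/restore placeholder idea described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual API: the step names, the `TEXT_STEPS`/`WORD_STEPS` registries, the `PRESET` dict (standing in for what a YAML preset might declare), and the tiny lookup tables are all assumptions made for the example. A small word-level WER function is included only to show why canonical forms matter.

```python
import re
from typing import Callable, Dict, List

# Hypothetical registries mapping step names to functions (illustrative only).
TEXT_STEPS: Dict[str, Callable[[str], str]] = {}
WORD_STEPS: Dict[str, Callable[[str], str]] = {}

def register(registry: Dict[str, Callable[[str], str]], name: str):
    """Decorator that stores a step function under a name a preset can reference."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        registry[name] = fn
        return fn
    return wrap

# Assumed placeholder token: alphanumeric so punctuation stripping leaves it intact.
TIME_COLON = "xtimecolonx"

@register(TEXT_STEPS, "lowercase")
def lowercase(text: str) -> str:
    return text.lower()

@register(TEXT_STEPS, "protect_times")
def protect_times(text: str) -> str:
    # Swap the colon in times such as 10:30 for a placeholder.
    return re.sub(r"(\d{1,2}):(\d{2})", r"\1" + TIME_COLON + r"\2", text)

@register(TEXT_STEPS, "strip_punctuation")
def strip_punctuation(text: str) -> str:
    # Keep word characters, whitespace, and apostrophes (needed for contractions).
    return re.sub(r"[^\w\s']", " ", text)

@register(TEXT_STEPS, "restore_times")
def restore_times(text: str) -> str:
    return text.replace(TIME_COLON, ":")

@register(TEXT_STEPS, "collapse_whitespace")
def collapse_whitespace(text: str) -> str:
    return " ".join(text.split())

# Toy lookup tables; a real language pack would be far larger.
CONTRACTIONS = {"it's": "it is", "don't": "do not", "i'm": "i am"}
DIGITS = {"1": "one", "2": "two", "3": "three"}

@register(WORD_STEPS, "expand_contractions")
def expand_contractions(word: str) -> str:
    return CONTRACTIONS.get(word, word)

@register(WORD_STEPS, "digits_to_words")
def digits_to_words(word: str) -> str:
    return DIGITS.get(word, word)

# Stand-in for a YAML preset: an ordered list of step names per stage.
PRESET = {
    "text_pre": ["lowercase", "protect_times", "strip_punctuation"],
    "word": ["expand_contractions", "digits_to_words"],
    "text_post": ["restore_times", "collapse_whitespace"],
}

def normalize(text: str, preset: Dict[str, List[str]] = PRESET) -> str:
    """Run text pre-processing, per-word processing, then text post-processing."""
    for name in preset["text_pre"]:
        text = TEXT_STEPS[name](text)
    words: List[str] = []
    for word in text.split():
        for name in preset["word"]:
            word = WORD_STEPS[name](word)
        words.extend(word.split())  # a word step may expand one word into several
    text = " ".join(words)
    for name in preset["text_post"]:
        text = TEXT_STEPS[name](text)
    return text

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = cur
    return prev[-1] / max(len(ref), 1)

reference = "It's 10:30, and I have 2 cats."
hypothesis = "it is 10:30 and i have two cats"
print(normalize(reference))  # it is 10:30 and i have two cats
print(wer(reference, hypothesis), wer(normalize(reference), normalize(hypothesis)))
```

In this sketch the two transcripts differ on the raw strings but normalize to the same canonical form, so the WER after normalization is zero. Because each stage is just an ordered list of step names, reordering or swapping steps only requires changing the preset, which is the property a declarative YAML preset would provide.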

No Package · No Dependents
Maintenance 13 / 25
Adoption 5 / 25
Maturity 9 / 25
Community 14 / 25


Stars: 10
Forks: 3
Language: Python
License: MIT
Last pushed: Mar 23, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/gladiaio/normalization"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
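For scripted use, the same request can be made from Python with only the standard library. The sketch below rests on two assumptions not stated on this page: that the path follows a `category/owner/repo` pattern (inferred from the single URL above), and that the endpoint returns JSON. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; the path layout beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *segments])

def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and decode it, assuming the response body is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "gladiaio", "normalization")` would issue the same request as the curl command. Since the keyless limit is 100 requests/day, caching the result locally is worth considering for repeated runs.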