freewym/espresso
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Built on PyTorch and fairseq, Espresso provides modular architectures including Transformer, Conformer, and Transducer models with support for both end-to-end and hybrid ASR training. It features distributed multi-GPU/node training, parallelized beam search decoding with integrated language model fusion, and on-the-fly feature extraction from raw waveforms via torchaudio. Recipes for WSJ, LibriSpeech, and Switchboard datasets demonstrate support for CTC, LF-MMI, and cross-entropy training objectives.
940 stars. No commits in the last 6 months.
Stars: 940
Forks: 116
Language: Python
License: —
Category:
Last pushed: Sep 04, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/freewym/espresso"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
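The same request can be made from a script. The following is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout is inferred only from the curl example above, and it assumes the endpoint returns a JSON body, which this page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example above; everything after it is
# assumed to be "<category>/<owner>/<repo>".
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the request URL for a category and an 'owner/name' repo slug."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch one record without an API key.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (needs network access and counts against the daily quota):
# record = fetch_quality("voice-ai", "freewym/espresso")
```

Keyless access is limited to 100 requests per day, so a script that polls many repositories may want to cache responses locally.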
Higher-rated alternatives
TensorSpeech/TensorFlowASR
TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....
xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
dangvansam/viet-asr
VietASR - Vietnamese Automatic Speech Recognition
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
srvk/eesen
The official repository of the Eesen project