gentaiscool/end2end-asr-pytorch

End-to-End Automatic Speech Recognition on PyTorch

/ 100

Emerging

Implements a low-rank Transformer encoder-decoder architecture with optional CNN feature extractors (VGG or embedding-based) for sequence-to-sequence ASR. Supports multi-GPU batch parallelism and training across multiple datasets, with configurable beam-search decoding; reports 13.5% character error rate (CER) on Mandarin Chinese with beam width 8. Built on PyTorch and torchaudio, and accepts custom datasets via CSV manifests with character-level labels.
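For readers unfamiliar with the CER figure quoted above, here is a minimal sketch of how character error rate is commonly computed: total character-level edit distance divided by total reference length. The function names `edit_distance` and `cer` are illustrative and are not taken from the repository, whose own scoring script may normalize text differently.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Corpus-level CER: summed edit distance over summed reference length."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    return total_edits / total_chars
```

Under this definition, a CER of 13.5% corresponds to roughly 13.5 character edits per 100 reference characters, aggregated over the whole test set rather than averaged per utterance.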

304 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 22 / 25


Stars: 304
Forks:
Language: Python
License: MIT


Higher-rated alternatives

TensorSpeech/TensorFlowASR

:zap: TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....

xinjli/allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

srvk/eesen

The official repository of the Eesen project
