hirofumi0810/neural_sp

End-to-end ASR/LM implementation with PyTorch

/ 100

Established

Supports multiple encoder architectures (RNN, Transformer, Conformer, TDS convolution), with streaming variants that use monotonic attention mechanisms (MoChA, MMA) for low-latency decoding. Integrates with Kaldi for feature extraction and implements hybrid training strategies, including joint CTC/attention training, hierarchical objectives, and language model fusion (shallow, cold, and deep fusion). Handles diverse output units (phoneme, grapheme, BPE, word) via multi-task learning, and covers 10+ standard ASR corpora including LibriSpeech, Switchboard, and AISHELL.
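
As a rough illustration of two of the strategies named above, the sketch below shows the arithmetic commonly used for joint CTC/attention training (a weighted sum of the two losses) and for shallow fusion (adding a scaled language model log-probability to the ASR log-probability at decoding time). It is not neural_sp's API: the function names, the default weights, and the toy probabilities are all assumptions made for this example.

```python
import math


def joint_ctc_attention_loss(ctc_loss, att_loss, ctc_weight=0.3):
    """Interpolate CTC and attention losses (hypothetical helper, not neural_sp code)."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * att_loss


def shallow_fusion(asr_log_probs, lm_log_probs, lm_weight=0.3):
    """Combine per-token ASR and LM log-probabilities for one decoding step.

    Both arguments map candidate tokens to log-probabilities. Returns the
    best token and the dictionary of fused scores.
    """
    fused = {
        token: asr_lp + lm_weight * lm_log_probs[token]
        for token, asr_lp in asr_log_probs.items()
    }
    best = max(fused, key=fused.get)
    return best, fused


# Toy values (assumed): the acoustic model slightly prefers "their",
# while the language model strongly prefers "there" in this context.
asr = {"their": math.log(0.5), "there": math.log(0.4), "they're": math.log(0.1)}
lm = {"their": math.log(0.1), "there": math.log(0.8), "they're": math.log(0.1)}

best, fused = shallow_fusion(asr, lm, lm_weight=0.5)
loss = joint_ctc_attention_loss(ctc_loss=2.0, att_loss=1.0, ctc_weight=0.3)
```

With these toy numbers the fused score favors "there", even though the acoustic scores alone would pick "their". Cold and deep fusion differ in that the language model is integrated into the network during training rather than only rescoring at decoding time, so they are not captured by this sketch.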

594 stars. No commits in the last 6 months.

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 25 / 25

Stars

594

Forks

136

Language

Python

License

Apache-2.0

Related tools

TensorSpeech/TensorFlowASR

⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....

xinjli/allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

srvk/eesen

The official repository of the Eesen project
