hirofumi0810/neural_sp
End-to-end ASR/LM implementation with PyTorch
Supports multiple encoder architectures (RNN, Transformer, Conformer, TDS convolution), with streaming variants that use monotonic attention mechanisms (MoChA, MMA) for low-latency decoding. Integrates with Kaldi for feature extraction and implements hybrid training objectives (joint CTC/attention, hierarchical objectives) as well as language model fusion (shallow, cold, and deep fusion). Handles several output units (phoneme, grapheme, BPE, word) through multi-task learning, and covers 10+ standard ASR corpora, including Librispeech, Switchboard, and AISHELL.
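As a rough illustration of two of the ideas named above, the sketch below shows the usual textbook form of a joint CTC/attention objective (a weighted interpolation of the two losses) and of shallow fusion (adding a weighted language-model log-probability to the ASR log-probability at decode time). This is not neural_sp's code or API: the function names, the default weights, and the toy candidate list are all assumptions made for the example.

```python
import math


def joint_ctc_attention_loss(ctc_loss, att_loss, ctc_weight=0.3):
    """Interpolate CTC and attention losses (hypothetical helper, not neural_sp API)."""
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * att_loss


def shallow_fusion_score(asr_logp, lm_logp, lm_weight=0.3):
    """Combine ASR and external LM log-probabilities at decode time."""
    return asr_logp + lm_weight * lm_logp


def pick_best(candidates, lm_weight=0.3):
    """Return the token with the highest fused score.

    candidates: list of (token, asr_logp, lm_logp) tuples (toy data format,
    assumed for this example only).
    """
    best = max(candidates, key=lambda c: shallow_fusion_score(c[1], c[2], lm_weight))
    return best[0]


# Toy example: the acoustic model slightly prefers "their",
# while the language model strongly prefers "there".
candidates = [
    ("their", math.log(0.5), math.log(0.1)),
    ("there", math.log(0.4), math.log(0.6)),
]
```

With `lm_weight=0.0` the acoustic preference wins; raising the weight lets the language model override it, which is the trade-off the fusion weight controls.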
594 stars. No commits in the last 6 months.
Stars: 594
Forks: 136
Language: Python
License: Apache-2.0
Category:
Last pushed: Aug 30, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hirofumi0810/neural_sp"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
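The same request can be made from Python using only the standard library. This is a minimal sketch: the URL layout is taken from the curl command above, but the helper names are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state. How an API key would be supplied is not documented here, so the sketch covers keyless access only.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything below it is
# filled in from the category and the "owner/name" repository slug.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, repo, base_url=BASE_URL):
    """Build the request URL for a category and an 'owner/name' repo slug."""
    return f"{base_url}/{quote(category, safe='')}/{quote(repo, safe='/')}"


def fetch_quality(category, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumes the response body is JSON (not confirmed by the page).
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("voice-ai", "hirofumi0810/neural_sp")
```

Keeping URL construction separate from the network call makes the former easy to check offline and keeps the keyless limit of 100 requests/day in mind when looping over many repositories.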
Related tools
TensorSpeech/TensorFlowASR
TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....
xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
dangvansam/viet-asr
VietASR - Vietnamese Automatic Speech Recognition
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
srvk/eesen
The official repository of the Eesen project