hirofumi0810/neural_sp

End-to-end ASR/LM implementation with PyTorch

51 / 100 (Established)

Supports multiple encoder architectures (RNN, Transformer, Conformer, and time-depth separable (TDS) convolution), with streaming variants that use monotonic attention mechanisms (MoChA, MMA) for low-latency decoding. Integrates with Kaldi for feature extraction and supports hybrid CTC/attention training, hierarchical objectives, and language model fusion (shallow, cold, and deep fusion). Handles several output units (phoneme, grapheme, BPE, word), which can be combined via multi-task learning, and covers 10+ standard ASR corpora, including LibriSpeech, Switchboard, and AISHELL.

594 stars. No commits in the last 6 months.

Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 25 / 25
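
The four category scores above add up to the overall score shown at the top of the page. The short check below assumes the overall score is a plain sum of the four categories; that rule is inferred from the numbers on this page, not from a documented formula, and the variable names are illustrative.

```python
# Category scores as listed on this page (each is out of 25).
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 25,
}

# Assumption: the overall score is the unweighted sum of the categories.
overall = sum(category_scores.values())
max_score = 25 * len(category_scores)

print(f"{overall} / {max_score}")  # prints "51 / 100"
```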


Stars: 594
Forks: 136
Language: Python
License: Apache-2.0
Last pushed: Aug 30, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hirofumi0810/neural_sp"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
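
The same request can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the URL path follows a `category/owner/repo` layout (inferred from the curl example) and that the endpoint returns JSON; neither is confirmed here. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path layout is category/owner/repo, as the curl
    example suggests.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record (hypothetical helper).

    Assumption: the response body is UTF-8 encoded JSON.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("voice-ai", "hirofumi0810", "neural_sp")
# print(data)
```

Since the response schema is not described on this page, inspect the returned object before relying on any particular field, and keep the daily request limit in mind when calling this in a loop.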