End-to-End ASR Frameworks Voice AI Tools
PyTorch-based implementations of complete automatic speech recognition systems with integrated acoustic modeling, feature extraction, and decoding. This category does not include ASR evaluation metrics, language models, individual components (vocoder, G2P), or non-PyTorch frameworks such as Kaldi-only solutions.
109 end-to-end ASR framework tools are tracked. Seven score above 50 (Established tier). The highest-rated is TensorSpeech/TensorFlowASR at 69/100 with 1,005 stars and 930 monthly downloads.
Get all 109 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=voice-ai&subcategory=end-to-end-asr-frameworks&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
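For scripted access, the same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not documented here, the result is treated as opaque JSON rather than parsed into fields. The query parameters and their default values are copied from the curl command above.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="voice-ai",
              subcategory="end-to-end-asr-frameworks",
              limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The structure of the returned object is an assumption left to the caller:
    inspect it before relying on any particular key.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request, so it is left commented out):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the request before spending any of the daily request quota.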
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | TensorSpeech/TensorFlowASR - :zap: TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in... | | Established |
| 2 | xinjli/allosaurus - Allosaurus is a pretrained universal phone recognizer for more than 2000 languages | | Established |
| 3 | dangvansam/viet-asr - VietASR - Vietnamese Automatic Speech Recognition | | Established |
| 4 | wenet-e2e/wenet - Production First and Production Ready End-to-End Speech Recognition Toolkit | | Established |
| 5 | srvk/eesen - The official repository of the Eesen project | | Established |
| 6 | sooftware/kospeech - Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition... | | Established |
| 7 | hirofumi0810/neural_sp - End-to-end ASR/LM implementation with PyTorch | | Established |
| 8 | Audio-WestlakeU/VINP - Official PyTorch implementation of 'VINP: Variational Bayesian Inference... | | Emerging |
| 9 | yl4579/AuxiliaryASR - Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) | | Emerging |
| 10 | openspeech-team/openspeech - Open-Source Toolkit for End-to-End Speech Recognition leveraging... | | Emerging |
| 11 | gentaiscool/end2end-asr-pytorch - End-to-End Automatic Speech Recognition on PyTorch | | Emerging |
| 12 | clovaai/ClovaCall - ClovaCall dataset and Pytorch LAS baseline code (Interspeech 2020) | | Emerging |
| 13 | iamjanvijay/rnnt_decoder_cuda - An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA. | | Emerging |
| 14 | voicekit-team/T-one - T-one is a high-performance streaming ASR pipeline for Russian, specialized... | | Emerging |
| 15 | freewym/espresso - Espresso: A Fast End-to-End Neural Speech Recognition Toolkit | | Emerging |
| 16 | George0828Zhang/torch_cif - A fast parallel PyTorch implementation of the "CIF: Continuous... | | Emerging |
| 17 | by2101/OpenASR - A pytorch based end2end speech recognition system. | | Emerging |
| 18 | theblackcat102/edgedict - Working online speech recognition based on RNN Transducer. (Trained model... | | Emerging |
| 19 | hirofumi0810/asr_preprocessing - Python implementation of pre-processing for End-to-End speech recognition | | Emerging |
| 20 | upskyy/Transformer-Transducer - PyTorch implementation of "Transformer Transducer: A Streamable Speech... | | Emerging |
| 21 | R1ckShi/AESRC2020 - [ICASSP2021] Data preparation scripts, training pipeline and baseline... | | Emerging |
| 22 | ryanleary/patter - speech-to-text in pytorch | | Emerging |
| 23 | kaituoxu/Speech-Transformer - A PyTorch implementation of Speech Transformer, an End-to-End ASR with... | | Emerging |
| 24 | nobody132/masr - Chinese speech recognition; Mandarin Automatic Speech Recognition | | Emerging |
| 25 | jinserk/pytorch-asr - ASR with PyTorch | | Emerging |
| 26 | charlesliucn/awesome-end2end-asr - 💬 A list of End-to-End speech recognition, including papers, codes and other... | | Emerging |
| 27 | pika-online/AESRC2020 - a deep accent recognition network | | Emerging |
| 28 | declare-lab/speech-adapters - Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient... | | Emerging |
| 29 | awslabs/speech-representations - Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020) | | Emerging |
| 30 | zh217/torch-asg - Auto Segmentation Criterion (ASG) implemented in pytorch | | Emerging |
| 31 | tugstugi/mongolian-speech-recognition - Mongolian speech recognition with PyTorch | | Emerging |
| 32 | 1ytic/pytorch-edit-distance - Levenshtein edit-distance on PyTorch and CUDA | | Emerging |
| 33 | sooftware/speech-transformer - Transformer implementation specialized in speech recognition tasks using Pytorch. | | Emerging |
| 34 | tabahi/contexless-phonemes-CUPE - pytorch model for contexless-phoneme prediction from speech audio | | Emerging |
| 35 | 1ytic/open_stt_e2e - PyTorch end-to-end speech recognition | | Emerging |
| 36 | VITA-Group/Audio-Lottery - [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight,... | | Emerging |
| 37 | xingchensong/Speech-Transformer-tf2.0 - transformer for ASR-system (via tensorflow2.0) | | Emerging |
| 38 | manhph2211/ViSR - This repo builds an end-to-end deep learning application that supports... | | Emerging |
| 39 | HawkAaron/E2E-ASR - PyTorch Implementations for End-to-End Automatic Speech Recognition | | Emerging |
| 40 | HawkAaron/RNN-Transducer - MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction... | | Emerging |
| 41 | stevenhillis/awesome-asr-contextualization - A curated list of awesome papers on contextualizing E2E ASR outputs | | Emerging |
| 42 | audioku/cross-accent-maml-asr - Meta-learning model agnostic (MAML) implementation for cross-accented ASR | | Emerging |
| 43 | Sundy1219/eesen-for-thchs30 - ASR for Chinese Mandarin | | Emerging |
| 44 | GinoShun/Accent-Activation-Steering - Official code for "Activation Steering for Accent Adaptation in Speech... | | Emerging |
| 45 | sooftware/lightning-asr - Modular and extensible speech recognition library leveraging... | | Emerging |
| 46 | vectominist/MiniASR - A mini, simple, and fast end-to-end automatic speech recognition toolkit. | | Emerging |
| 47 | vectominist/spin - Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for... | | Emerging |
| 48 | MingLunHan/CIF-PyTorch - [ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech... | | Emerging |
| 49 | sooftware/End-to-End-Speech-Recognition-Models - PyTorch implementation of automatic speech recognition models. | | Emerging |
| 50 | cdyangbo/end2endASR - implement end-to-end asr algorithm with tensorflow | | Emerging |
| 51 | jiwidi/DeepSpeech-pytorch - Pytorch implementation for DeepSpeech 2.0 | | Emerging |
| 52 | jindongwang/EasyEspnet - Making Espnet easier to use | | Emerging |
| 53 | RF5/transfusion-asr - Transcribing Speech with Multinomial Diffusion, training code and models. | | Emerging |
| 54 | mravanelli/pytorch_MLP_for_ASR - This code implements a basic MLP for speech recognition. The MLP is trained... | | Emerging |
| 55 | biyoml/End-to-End-Mandarin-ASR - End-to-end speech recognition on AISHELL dataset. | | Emerging |
| 56 | DataXujing/ASR-paper - :fire: ASR tutorial: https://dataxujing.github.io/ASR-paper/ | | Emerging |
| 57 | oleges1/quartznet-pytorch - Quartznet implementation on pytorch [https://arxiv.org/abs/1910.10261] | | Emerging |
| 58 | ondrejklejch/learning_to_adapt - Coordinate-wise meta-learner for speaker adaptation of ASR models. | | Emerging |
| 59 | upskyy/ContextNet - PyTorch implementation of "ContextNet: Improving Convolutional Neural... | | Emerging |
| 60 | vectominist/End-to-end-ASR-Pytorch-DLHLP - Joint CTC-Attention End-to-end Speech Recognition - PyTorch Implementation... | | Emerging |
| 61 | clarinsi/Slovene_ASR_e2e - Automatic Speech Recognition tool | | Emerging |
| 62 | nemoramo/acoustic_model - This is a sub-repository in building to create acoustic model in Mandarin... | | Emerging |
| 63 | ThetaOne-AI/HiKE - Hierarchical Korean-English Code-Switching Speech Recognition Benchmark... | | Experimental |
| 64 | daveshap/keras_asr - ASR experiment using Google's Universal Sentence Encoder | | Experimental |
| 65 | teamtee/LLM-ASR-Error-Correction - This is a framework for using large language models to improve ASR... | | Experimental |
| 66 | emonosuke/emoASR - End-to-end MOdeling of ASR (Automatic Speech Recognition) | | Experimental |
| 67 | aws-samples/seq2seq-asr-misbehaves - Artifacts for the paper "Attentional Speech Recognition Models Misbehave on... | | Experimental |
| 68 | aalto-speech/speechbrain-cl - Implementation of different curriculum learning (CL) methods for... | | Experimental |
| 69 | PigeonDan1/ps-slm - TASU: A New Style of Alignment of Speech LLM with only Text Training Data,... | | Experimental |
| 70 | kouyt5/lightning-asr - End-to-end speech recognition model built on the pytorch-lightning framework; still experimental, with performance being continually optimized | | Experimental |
| 71 | viig99/esolafast - Fast C++ implementation of ESOLA using KFRLib, can be used for online... | | Experimental |
| 72 | tongjinle123/speech-transformer-pytorch_lightning - ASR project with pytorch-lightning | | Experimental |
| 73 | vectominist/rspin - Official inference code for NAACL 2024 paper "R-Spin: Efficient Speaker and... | | Experimental |
| 74 | DanielLin94144/Test-time-adaptation-ASR-SUTA - Test-time adaptation for speech recognition model by single utterance. The... | | Experimental |
| 75 | biyoml/PyTorch-End-to-End-ASR-on-TIMIT - Attention-based end-to-end ASR on TIMIT in PyTorch | | Experimental |
| 76 | shockless/asr-transformer - Transformer for Automatic Speech Recognition | | Experimental |
| 77 | lucadellalib/ts-asr - Target speaker automatic speech recognition (TS-ASR) | | Experimental |
| 78 | nttcslab-sp/torchain - WIP: pytorch FFI wrapper for Kaldi chain loss (a.k.a. Lattice Free MMI) | | Experimental |
| 79 | dobby-seo/kosr - Korean speech recognition based on transformer | | Experimental |
| 80 | 1ytic/edit-distance-papers - A curated list of papers dedicated to edit-distance as objective function | | Experimental |
| 81 | SpringerNLP/Chapter12 - Chapter 12: End-to-end Speech Recognition | | Experimental |
| 82 | upskyy/Automatic-Speech-Recognition-Models - End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra. | | Experimental |
| 83 | yinruiqing/tiny-transducer - Tiny Transducer: A Highly-Efficient Speech Recognition Model on Edge Devices | | Experimental |
| 84 | sunprinceS/MetaASR-CrossAccent - Meta-Learning for End-to-End ASR | | Experimental |
| 85 | Kirili4ik/QuartzNet-ASR-pytorch - Automatic Speech Recognition (ASR) model QuartzNet trained on English... | | Experimental |
| 86 | andybi7676/reborn-uasr - REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training... | | Experimental |
| 87 | awasthiabhijeet/Error-Driven-ASR-Personalization - Code for "Error-driven Fixed-Budget ASR Personalization for Accented... | | Experimental |
| 88 | erasedwalt/CTC-ASR - An implementation of Jasper, QuartzNet, Citrinet and pipeline for training... | | Experimental |
| 89 | TeaPoly/AIF-PyTorch - (NOT Official) Implementation Auto-regressive Integrate-and-Fire (AIF) | | Experimental |
| 90 | tuanio/deepspeech-ctc - Deepspeech with ctc loss on Vivos Vietnamese Dataset | | Experimental |
| 91 | umitkacar/transformer-asr-transcription - Real-time transformer-based ASR supporting 100+ languages - Google Cloud... | | Experimental |
| 92 | msalhab96/RNN-Transducer - PyTorch implementation of Sequence Transduction with Recurrent Neural... | | Experimental |
| 93 | tuanio/e2e-asr-toolkit - E2E Speech Recognition Toolkit with Hydra and Pytorch Lightning | | Experimental |
| 94 | gheyret/uyghur-asr-transformer - Speech Recognition for Uyghur using Speech transformer | | Experimental |
| 95 | DuyguA/TSD2025-Mind-the-Gap - Innovative ASR model to keep named entities intact, offered as a conference paper. | | Experimental |
| 96 | mict-zhaw/chall_e2e_stt - End-to-end ASR experiments for language learning, focusing on... | | Experimental |
| 97 | AssemblyAI-Community/intro-to-espnet - Getting Started with ESPnet \| AssemblyAI | | Experimental |
| 98 | Lakshmi-bashyam/NeuralLM2Arpa - Implementation of conversion system: Neural Language models to backing off... | | Experimental |
| 99 | pragyak412/Improving-Voice-Separation-by-Incorporating-End-To-End-Speech-Recognition - Implementing the paper - | | Experimental |
| 100 | chrarvi/automatic-speech-recognition - An automatic speech recognition transformer for converting Swedish voice to text. | | Experimental |
| 101 | AppleHolic/2020AIChallengeSpeechRecognition - 2020 AI Challenge speech recognition code | | Experimental |
| 102 | xingchensong/ASR-Wavnet - some ASR-system implementations (via tensorflow 1.x) | | Experimental |
| 103 | MorrisXu-Driving/Improving_DeepSpeech_2_by_RNN_Transducer_Pytorch_Implementation - In this repository, based on Deep Speech 2, two losses, CTC and RNN-T are compared. | | Experimental |
| 104 | shahad-mahmud/incremental_learning_for_asr - Incremental learning for automatic speech recognition (ASR) | | Experimental |
| 105 | zyascend/End-to-End-Speech-Recognition-Learning - ASR, End-to-End, end2end, Speech Recognition, end-to-end speech recognition | | Experimental |
| 106 | upskyy/RNN-Transducer - PyTorch Implementation of RNN-Transducer | | Experimental |
| 107 | khaykingleb/automatic-speech-recognition - QuartzNet and DeepSpeech implementation for ASR | | Experimental |
| 108 | avrtt/MoE-speech-recognition - Mixture of experts architecture for speech-to-text and language... | | Experimental |
| 109 | zw76859420/ASR_Transformer - A Pytorch implementation of Speech Transformer, an End-to-End Automatic... | | Experimental |