mdangschat/ctc-asr
End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.
The model combines bidirectional RNN layers with dense layers and is trained on 900+ hours of audio from multiple corpora (LibriSpeech, Common Voice, TEDLIUM, Tatoeba), reaching a 12.6% word error rate (WER) without an external language model. It is built on TensorFlow, exposes configurable architecture parameters, supports GPU acceleration, and organizes training and evaluation as modular workflows driven by CSV-based corpus definitions. It also includes utilities for multi-corpus preparation, checkpoint management, and real-time training visualization through TensorBoard.
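For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical and is not taken from the ctc-asr code base; the project's own evaluation code may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper only; not part of ctc-asr.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 12.6% corresponds to roughly one word-level error for every eight reference words.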
123 stars. No commits in the last 6 months.
Stars: 123
Forks: 36
Language: Python
License: MIT
Category:
Last pushed: Apr 15, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/mdangschat/ctc-asr"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
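The same request can be sketched in Python using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that other repositories follow the same `category/owner/repo` path pattern is inferred from the single URL shown above. The response format is not documented on this page, so the sketch returns the raw response text rather than parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Perform the GET request and return the response body as text."""
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "mdangschat", "ctc-asr")` issues the same request as the curl command and counts against the daily request limit.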
Higher-rated alternatives
githubharald/CTCWordBeamSearch
Connectionist Temporal Classification (CTC) decoder with dictionary and language model.
githubharald/CTCDecoder
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon...
nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System
athena-team/athena
An open-source implementation of a sequence-to-sequence based speech processing engine
hirofumi0810/tensorflow_end2end_speech_recognition
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
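Several of the projects listed above are CTC decoders. As a rough illustration of the simplest strategy they mention, best path decoding, the sketch below picks the highest-scoring symbol in each frame, merges consecutive repeats, and then drops the blank symbol. The function name `ctc_best_path`, the toy alphabet, and the input layout (one list of per-symbol scores per frame) are assumptions made for this example and are not taken from any of the listed repositories.

```python
from itertools import groupby
from typing import Sequence


def ctc_best_path(frame_scores: Sequence[Sequence[float]],
                  alphabet: str,
                  blank_index: int) -> str:
    """Greedy (best path) CTC decoding over per-frame symbol scores."""
    # 1. Take the most likely symbol index in every frame.
    best = [max(range(len(frame)), key=frame.__getitem__)
            for frame in frame_scores]
    # 2. Merge runs of identical consecutive indices into one.
    collapsed = [index for index, _ in groupby(best)]
    # 3. Remove blanks; a blank between two equal symbols keeps both.
    return "".join(alphabet[i] for i in collapsed if i != blank_index)


# Toy example: symbols "a", "b" and a blank written as "_" at index 2.
frames = [
    [0.90, 0.05, 0.05],  # a
    [0.80, 0.10, 0.10],  # a (repeat, merged)
    [0.10, 0.10, 0.80],  # blank
    [0.70, 0.20, 0.10],  # a (kept, separated by the blank)
    [0.10, 0.80, 0.10],  # b
    [0.20, 0.70, 0.10],  # b (repeat, merged)
]
decoded = ctc_best_path(frames, "ab_", blank_index=2)
```

Beam search and lexicon-constrained decoders explore more than one candidate path per frame, which is where they differ from this greedy approach.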