hirofumi0810/tensorflow_end2end_speech_recognition

End-to-end speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

/ 100

Established

Implements multiple encoder architectures (BLSTM, GRU, VGG variants) with diverse decoding strategies including CTC with beam search and attention mechanisms (Bahdanau, Luong, location-based). Supports joint CTC-Attention training, multi-GPU synchronous training, and multi-task learning with auxiliary CTC losses at arbitrary encoder layers. Handles multiple datasets (TIMIT, LibriSpeech, CSJ) across various output units (phonemes, characters, words, kanji) with preprocessing via a separate dedicated repository.
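To make the CTC decoding step concrete, here is a minimal sketch of best-path (greedy) decoding in plain Python. It is a simpler stand-in for the beam search mentioned above, not code from this repository: the function name `ctc_best_path`, the toy score matrix, and the choice of index 0 for the blank label are all assumptions made for illustration.

```python
def ctc_best_path(frame_scores, blank=0):
    """Greedy (best-path) CTC decoding for a single utterance.

    frame_scores: list of per-frame score lists, one score per label.
    blank: index of the CTC blank label (assumed to be 0 in this sketch).
    """
    # Pick the highest-scoring label at each frame.
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]

    decoded = []
    previous = None
    for label in best:
        # Keep a label only if it differs from the previous frame's label
        # (collapsing repeats) and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical 3-label vocabulary: 0 = blank, 1 = "a", 2 = "b".
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a again (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.70, 0.10],  # a (new emission because a blank intervened)
    [0.10, 0.20, 0.70],  # b
]
print(ctc_best_path(frames))  # [1, 1, 2]
```

Beam search differs in that it keeps several candidate label sequences per frame instead of committing to the single best label, which is what lets it combine scores across alignments and, optionally, a language model.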

314 stars. No commits in the last 6 months.

Stale (6 months), no package, no dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 24 / 25


Stars: 314

Forks: 119

Language: Python

License: MIT

Related tools

githubharald/CTCWordBeamSearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.

githubharald/CTCDecoder

Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon...

nl8590687/ASRT_SpeechRecognition

A Deep-Learning-Based Chinese Speech Recognition System

athena-team/athena

An open-source implementation of a sequence-to-sequence-based speech processing engine

rakeshvar/rnn_ctc

Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal...
