hirofumi0810/tensorflow_end2end_speech_recognition

End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

Quality score: 50 / 100 (Established)

Implements multiple encoder architectures (BLSTM, GRU, VGG variants) with diverse decoding strategies including CTC with beam search and attention mechanisms (Bahdanau, Luong, location-based). Supports joint CTC-Attention training, multi-GPU synchronous training, and multi-task learning with auxiliary CTC losses at arbitrary encoder layers. Handles multiple datasets (TIMIT, LibriSpeech, CSJ) across various output units (phonemes, characters, words, kanji) with preprocessing via a separate dedicated repository.
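The joint CTC-Attention training mentioned above is commonly formulated as a weighted interpolation of the two losses. The snippet below is a minimal sketch of that idea only, not the repository's actual code: the function name `joint_ctc_attention_loss` and the parameter `ctc_weight` are hypothetical, and the loss values are made-up scalars used purely for illustration.

```python
def joint_ctc_attention_loss(ctc_loss, attention_loss, ctc_weight=0.5):
    """Interpolate two scalar losses: w * L_ctc + (1 - w) * L_att.

    Hypothetical helper for illustration; the repository may combine
    its losses differently.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


# Made-up loss values, not results from the repository.
combined = joint_ctc_attention_loss(ctc_loss=2.0, attention_loss=4.0, ctc_weight=0.25)
print(combined)
```

Setting `ctc_weight` to 1.0 reduces this to pure CTC training and 0.0 to pure attention training, so a single knob covers both extremes and everything in between.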

314 stars. No commits in the last 6 months.

Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 24 / 25


Stars: 314
Forks: 119
Language: Python
License: MIT
Last pushed: Jan 23, 2018
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hirofumi0810/tensorflow_end2end_speech_recognition"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
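The same request can be made from Python with only the standard library. This is a rough sketch under stated assumptions: the path layout (category, then owner, then repository) is inferred from the single URL shown above, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """GET the endpoint and parse the body as JSON (JSON is an assumption)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = quality_url(
    "voice-ai", "hirofumi0810", "tensorflow_end2end_speech_recognition"
)
print(url)
# Calling fetch_quality(...) with the same arguments performs the network request.
```

Keeping URL construction separate from the network call makes it easy to check the address before spending any of the daily request quota.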