shessam/DSR

Although there has been significant research in the field of speech recognition, some distant speech recognition (DSR) challenges remain unsolved, e.g., reverberation and background noise; hence there is a need for more robust speech recognizers. One approach to these problems is robust acoustic modeling in DSR. Yet there has not been a classical or deep learning method that makes the acoustic model robust against all of these problems at once. In this thesis, to dereverberate the input sound, we employ the weighted-prediction-error (WPE) algorithm and the asymmetric-context-windows (ACW) method. Furthermore, to improve the robustness and accuracy of multi-channel DSR and audio source direction finding, we use an existing hidden Markov model-bidirectional quaternion long short-term memory (HMM-BQLSTM) hybrid acoustic model. With four microphone inputs, the quaternion nature of the BQLSTM neural network allows it to learn inter- and intra-structural dependencies. Additionally, the BQLSTM can learn long-term time-domain dependencies with the help of its recurrent layers.
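
As a rough illustration of the quaternion idea described above, the sketch below implements the Hamilton product, the basic multiplication that quaternion-valued networks use in place of an ordinary real-valued product. It is not taken from this repository: the function name `hamilton_product`, the `frames` values, and the `weight` quaternion are all hypothetical, and mapping one microphone to each quaternion component is an assumption made only for the example.

```python
def hamilton_product(q, p):
    """Multiply two quaternions given as (r, i, j, k) tuples."""
    r1, i1, j1, k1 = q
    r2, i2, j2, k2 = p
    return (
        r1 * r2 - i1 * i2 - j1 * j2 - k1 * k2,  # real part
        r1 * i2 + i1 * r2 + j1 * k2 - k1 * j2,  # i component
        r1 * j2 - i1 * k2 + j1 * r2 + k1 * i2,  # j component
        r1 * k2 + i1 * j2 - j1 * i2 + k1 * r2,  # k component
    )


# Hypothetical data: one feature value per microphone for three time frames.
frames = [
    (0.2, 0.1, -0.3, 0.5),
    (0.0, 0.4, 0.1, -0.2),
    (1.0, 0.0, 0.0, 0.0),
]

# Hypothetical quaternion weight; a trained model would learn such values.
weight = (0.5, -0.5, 0.5, 0.5)

# Every output component depends on all four input channels, which is how
# a quaternion layer can couple the microphones within a single frame.
mixed = [hamilton_product(weight, frame) for frame in frames]
```

Because each output component combines all four inputs through one shared weight, a single quaternion parameter ties the channels together, whereas a real-valued layer would need separate weights for every pair of channels.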

Score: 23 / 100 (Experimental)

No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 9 / 25
Community 10 / 25

Stars: 6
Forks: 1
Language: Python
License: MIT
Last pushed: Oct 19, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/shessam/DSR"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
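
The same request can be sketched in Python using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the URL pattern is inferred from the single curl example above, and the response is returned as raw text because its format is not documented here; decoding it as UTF-8 is an assumption.

```python
import urllib.request

# Base path taken from the curl example; treating the remaining path
# segments as "category/owner/repo" is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL for a repository such as 'shessam/DSR'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10.0):
    """Return the raw response body as text (UTF-8 assumed)."""
    url = quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "shessam/DSR")` issues the same GET request as the curl command and counts against the same daily request limit.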