
Single-Channel Speech Enhancement using Deep Learning

Authors Hülser, G.
Year 2017
Thesis Type Master's thesis
Topic Audio Signal Processing
Abstract The aim of this work is to implement a single-channel speech enhancement algorithm utilizing machine learning techniques, in particular deep neural networks (DNNs). A large set of speech and noise data is collected to train a neural network model, which predicts time-frequency masks from noisy speech signals. The algorithm is tested using various additive noise sources, and its performance is evaluated in terms of speech quality and intelligibility. Furthermore, the results are compared to those of a state-of-the-art noise reduction system provided by HARMAN. By using a bidirectional long short-term memory (BLSTM) network and a frequency-weighted loss function, an average improvement of up to 0.3 PESQ and 0.06 STOI over the baseline algorithm is achieved. Moreover, a speech recognition benchmark showed an improvement of 8% in terms of speech accuracy.
Supervisors Sontacchi, A., Bauer, G.
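
The abstract mentions two ideas that a small example may help make concrete: a time-frequency mask applied to a noisy spectrum, and a loss that weights frequency bins differently. The sketch below is illustrative only and is not taken from the thesis. The mask type (an ideal ratio mask), the function names, and the weight values are all assumptions, since the abstract does not specify them. Spectrogram magnitudes are represented as plain nested lists (frames by frequency bins).

```python
def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-12):
    """Per-bin training target in [0, 1]: speech energy / (speech + noise energy).

    Assumption: the abstract only says "time-frequency masks"; the ideal
    ratio mask is one common choice, used here for illustration.
    """
    return [
        [s * s / (s * s + n * n + eps) for s, n in zip(s_frame, n_frame)]
        for s_frame, n_frame in zip(speech_mag, noise_mag)
    ]


def apply_mask(mask, noisy_mag):
    """Scale each noisy time-frequency bin by the corresponding mask value."""
    return [
        [m * y for m, y in zip(m_frame, y_frame)]
        for m_frame, y_frame in zip(mask, noisy_mag)
    ]


def frequency_weighted_mse(predicted, target, weights):
    """Mean squared error in which each frequency bin has its own weight.

    Assumption: `weights` holds one hypothetical weight per frequency bin;
    the actual weighting used in the thesis is not given in the abstract.
    """
    total = 0.0
    count = 0
    for p_frame, t_frame in zip(predicted, target):
        for p, t, w in zip(p_frame, t_frame, weights):
            total += w * (p - t) ** 2
            count += 1
    return total / count
```

In such a setup, a network would be trained to predict the mask from noisy features, with the weighted loss penalizing errors more strongly in the bins considered perceptually important. At inference time the predicted mask would be applied to the noisy magnitude spectrum before resynthesis.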