Technion - Israel Institute of Technology - Graduate School
M.Sc Thesis
M.Sc Student: Atkins Aviva
Subject: Speech Enhancement Based on Adaptive Line Enhancer
Department: Department of Electrical Engineering
Supervisor: Professor Israel Cohen
Full Thesis text: English Version


Abstract

Noise is unavoidable in any real-life acoustic system. Its sources range from microphone self-noise and additive noise from other sources, to reverberation (reflections of the speaker's own voice from walls and other obstacles that arrive at the microphone with different delays), to echoes caused by coupling between the microphone and the loudspeaker. In every case the noise degrades the quality and intelligibility of the speech signal. The process of suppressing additive noise, to "clean" the noisy speech signal and improve its quality and intelligibility, is known as noise reduction or speech enhancement. The speech enhancement problem has been studied extensively, and the resulting methods are widely used in applications such as telecommunications, teleconferencing, hearing aids, and human-machine interfaces. It nevertheless remains challenging to this day, because many methods reduce the noise only at the expense of some distortion to the speech signal. The most difficult scenario is the single-channel case, in which only one microphone is available. Although multiple microphones and microphone arrays are becoming more popular as miniaturization gets easier, single-microphone systems are still in use.

Nonstationary noise, i.e., noise that changes quickly over time, is among the most difficult types of noise to reduce. If the noise has an underlying structure, such as being composed of harmonics, that structure can be exploited for noise reduction. In this thesis we develop a method to reduce harmonic nonstationary noise in the single-channel case. We propose a frequency-domain adaptive line enhancer (ALE), which consists of a delay element that creates a reference signal from the input signal, followed by an adaptive filter. The adaptive filter combines a forward adaptive filter and a backward non-causal filter with a harmonic noise indicator. We apply the combined method to synthetic and real noise signals and demonstrate that it yields improved performance compared to other methods.

As noise signals typically contain wide-band noise as well, our harmonic noise removal method can be followed by a conventional spectral-domain noise reduction method that removes the remaining wide-band noise. In this thesis we investigate the use of the autoregressive conditional heteroscedasticity (ARCH) model, whose generalized form is used extensively in financial applications, as part of the spectral-domain noise reduction algorithm, and compare it to one of the most commonly used methods.
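For readers unfamiliar with ARCH, the core idea is that the conditional variance of the current sample depends on the squared magnitude of past samples. The sketch below simulates a generic first-order ARCH process; it is not the spectral-domain estimator used in the thesis, and the function name `arch1_simulate` and the coefficients `a0` and `a1` are assumptions for illustration.

```python
import numpy as np

def arch1_simulate(n, a0=0.1, a1=0.5, seed=0):
    """Simulate a generic ARCH(1) process (illustrative parameters).

    Conditional variance: var[t] = a0 + a1 * x[t-1]**2
    Sample:               x[t]   = sqrt(var[t]) * standard normal draw
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    var = np.zeros(n)
    var[0] = a0 / (1.0 - a1)     # unconditional variance, valid for 0 <= a1 < 1
    x[0] = np.sqrt(var[0]) * rng.standard_normal()
    for t in range(1, n):
        var[t] = a0 + a1 * x[t - 1] ** 2   # large past sample -> large variance now
        x[t] = np.sqrt(var[t]) * rng.standard_normal()
    return x, var
```

The recursion produces clusters of high and low variance over time, which is the kind of behavior that makes such models a candidate for tracking time-varying spectral variances.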

In the final part of the thesis, we consider the multi-microphone case. We use a combined noise field approach and develop a robust superdirective beamformer that enables control of the trade-off between white noise amplification and the directivity factor. We propose a one-dimensional search algorithm to find the optimal regularization factor employed in the beamformer, and demonstrate through simulations improved performance compared to a recent method.
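The trade-off between white noise gain and directivity factor can be seen in a textbook regularized superdirective beamformer. The sketch below is not the beamformer or the search algorithm proposed in the thesis: it assumes a uniform linear array steered to endfire, a spherically isotropic (diffuse) noise field, and simple diagonal loading with a hypothetical regularization factor `eps`. All function names are illustrative.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s

def steering_endfire(M, spacing, f):
    """Steering vector of a uniform linear array toward endfire."""
    m = np.arange(M)
    return np.exp(-2j * np.pi * f * m * spacing / C)

def diffuse_coherence(M, spacing, f):
    """Coherence matrix of a spherically isotropic noise field."""
    m = np.arange(M)
    dist = np.abs(m[:, None] - m[None, :]) * spacing
    # np.sinc(t) = sin(pi t) / (pi t), so this equals sin(kd) / (kd)
    return np.sinc(2.0 * f * dist / C)

def regularized_superdirective(M, spacing, f, eps):
    """Return (h, wng, df) for a diagonally loaded superdirective beamformer."""
    d = steering_endfire(M, spacing, f)
    gamma = diffuse_coherence(M, spacing, f)
    a = np.linalg.solve(gamma + eps * np.eye(M), d)
    h = a / np.vdot(d, a)                       # distortionless: h^H d = 1
    wng = 1.0 / np.real(np.vdot(h, h))          # white noise gain
    df = 1.0 / np.real(np.vdot(h, gamma @ h))   # directivity factor
    return h, wng, df
```

Sweeping `eps` from very small to very large values moves the design from high directivity with strong white noise amplification toward a delay-and-sum beamformer with the opposite behavior. Choosing a single point on that curve is the kind of problem a one-dimensional search over the regularization factor addresses.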