M.Sc Student | Anna Oyzerman |
---|---|
Subject | Speech Dereverberation in the Time-Frequency Domain |
Department | Department of Electrical Engineering |
Supervisor | Full Professor Cohen Israel |


In this thesis we address the problem of dereverberation of speech signals acquired in an enclosed space by a single microphone. We spectrally estimate and suppress the reverberation in the time-frequency domain, using two representations: the short-time Fourier transform (STFT) and the single-sideband (SSB) transform.

In the first part of this thesis, the problems of system identification and dereverberation are addressed in the SSB domain. We derive an analytical relation between the input and output signals in the SSB domain, and formulate a system identification routine for a band-to-band approximation of that relation. The dereverberation problem is addressed using a statistical model for the acoustic impulse response (AIR). We present exact and approximate representations of the AIR and the reverberant signal in the SSB domain. The performance of the dereverberation algorithm is evaluated as a function of the representation complexity and compared to the performance of dereverberation in the STFT domain.

In the second part of the thesis, we propose a new algorithm for the estimation of the reverberant component, which is a critical part of dereverberation in the spectral domain. In order to address the frequency variation of the reflection coefficients in the room, we formulate the problem directly in the STFT domain.

First, a system identification stage is performed. A white noise signal is played in the reverberant environment, and a filter is extracted that relates the recorded output signal to its reverberant component. To reduce the computational complexity, we propose approximate representations of the filter that use only some or none of the crossband filters. The latter case is referred to as a "band-to-band filter".
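As a minimal sketch of what a band-to-band identification could look like, the following Python code fits one short filter per frequency band by least squares, so that the reverberant component in band k is approximated from current and past frames of the output signal in the same band. The function names, the array layout (frames by bins), and the use of plain least squares are assumptions made for illustration; the thesis may use a different estimator and filter structure.

```python
import numpy as np

def identify_band_to_band(Y, D, num_taps):
    """Fit one FIR filter per frequency band by least squares (illustrative).

    Y, D: complex STFT arrays of shape (num_frames, num_bins), where Y is the
    recorded output and D is its reverberant component (assumed known during
    the identification stage).
    Returns H of shape (num_taps, num_bins) such that
    D[p, k] ~= sum_m H[m, k] * Y[p - m, k].
    """
    num_frames, num_bins = Y.shape
    H = np.zeros((num_taps, num_bins), dtype=complex)
    for k in range(num_bins):
        # Matrix of delayed frames for band k; frames before the start are zero.
        A = np.zeros((num_frames, num_taps), dtype=complex)
        for m in range(num_taps):
            A[m:, m] = Y[:num_frames - m, k]
        H[:, k] = np.linalg.lstsq(A, D[:, k], rcond=None)[0]
    return H

def apply_band_to_band(Y, H):
    """Estimate the reverberant component using band-to-band filters only."""
    num_frames = Y.shape[0]
    D_hat = np.zeros_like(Y, dtype=complex)
    for m in range(H.shape[0]):
        # Each tap weights the same band, delayed by m frames.
        D_hat[m:, :] += H[m, :] * Y[:num_frames - m, :]
    return D_hat
```

Including crossband filters would add columns built from neighboring bands to the matrix `A`, which increases the number of unknowns per band.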

The performance of the system identification stage is analyzed in terms of the mean-square error (MSE) between the actual reverberant component and its estimate by the approximate representations. We show that the smallest MSE is achieved with the band-to-band filter, because the expression that relates the output signal to its reverberant component is most accurate in that case. We also find that, for a small number of crossbands, it is advantageous to increase the filter lengths in order to improve the estimate of the reverberant component.
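For illustration, an error measure of this kind can be computed as below. Normalizing by the energy of the actual reverberant component and reporting the result in dB are assumptions made for this sketch, not details taken from the thesis, and the function name is hypothetical.

```python
import numpy as np

def normalized_mse_db(d_true, d_est):
    """Normalized MSE in dB between a reference and its estimate (illustrative)."""
    d_true = np.asarray(d_true)
    d_est = np.asarray(d_est)
    error_energy = np.sum(np.abs(d_true - d_est) ** 2)
    reference_energy = np.sum(np.abs(d_true) ** 2)
    return 10.0 * np.log10(error_energy / reference_energy)
```

The same function works on time-domain signals or on complex STFT arrays, since it only uses squared magnitudes.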

Finally, we use the reverberant component estimate in a spectral enhancement algorithm for dereverberation. We show that our method achieves better results than an existing statistical method.
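One simple way to use a reverberant component estimate for enhancement is a spectral-subtraction-style gain with a lower bound, sketched below. This is a generic technique chosen for illustration and is not necessarily the spectral enhancement algorithm used in the thesis; the function name, the gain floor, and the small constant `eps` are assumptions.

```python
import numpy as np

def suppress_reverberation(Y, D_hat, gain_floor=0.1, eps=1e-12):
    """Apply a power-subtraction gain to the reverberant STFT (illustrative).

    Y: complex STFT of the recorded signal.
    D_hat: estimate of its reverberant component, same shape as Y.
    gain_floor: lower bound on the gain, which limits over-suppression.
    """
    power_y = np.abs(Y) ** 2
    power_d = np.abs(D_hat) ** 2
    # Fraction of the power attributed to the non-reverberant part.
    gain_sq = 1.0 - power_d / np.maximum(power_y, eps)
    gain = np.sqrt(np.maximum(gain_sq, gain_floor ** 2))
    # The phase of the recorded signal is kept unchanged.
    return gain * Y
```

Bins where the estimated reverberant power is close to the observed power are attenuated down to the floor, while bins with little estimated reverberation pass nearly unchanged.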