M.Sc Thesis

M.Sc Student: Markovich Shmuel
Subject: Multichannel Eigenspace Beamforming in a Reverberant Noisy Environment with Multiple Interfering Speech
Department: Department of Electrical and Computer Engineering
Supervisors: Prof. Sharon Gannot
Prof. Israel Cohen


In many practical environments we wish to extract several desired speech signals that are contaminated by both non-stationary interfering signals (such as competing talkers) and stationary noise. Furthermore, the received signals are often distorted by the Room Impulse Response (RIR) of the acoustic environment.

Typical examples of this problem include the conference call scenario with multiple participants; a hands-free cellular phone conversation in a car, in which several speaking passengers interfere with the desired speaker; and the Cocktail Party scenario, in which the desired conversation blends with many simultaneous conversations.

In this thesis, multi-microphone measurements are used to extract the desired speakers by designing the array beam-pattern to satisfy a set of multiple linear constraints. One subset of the constraints is dedicated to maintaining the desired signals, and the second subset is chosen to mitigate both the stationary and non-stationary interference signals. Unlike classical beamformers, in which the RIRs are approximated by a delay-only filter, we take into account the entire RIR [or its respective Acoustic Transfer Function (ATF)].
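As an illustration of a beamformer designed under multiple linear constraints, the following minimal sketch computes weights for a single frequency bin using the textbook Linearly Constrained Minimum Variance (LCMV) closed form, w = Phi^{-1} C (C^H Phi^{-1} C)^{-1} g. It is not taken from the thesis: the function name `lcmv_weights`, the number of microphones, the random transfer-function vectors in `C`, and the white-noise PSD matrix `Phi` are all assumptions made for the example.

```python
import numpy as np

def lcmv_weights(Phi, C, g):
    """Minimize w^H Phi w subject to C^H w = g (one frequency bin)."""
    Phi_inv_C = np.linalg.solve(Phi, C)                      # Phi^{-1} C
    return Phi_inv_C @ np.linalg.solve(C.conj().T @ Phi_inv_C, g)

rng = np.random.default_rng(0)
M = 6  # number of microphones (assumed)

# Hypothetical transfer-function vectors: columns 0-1 are desired sources,
# column 2 is an interfering source.
C = rng.standard_normal((M, 3)) + 1j * rng.standard_normal((M, 3))

# Desired response: pass both desired sources undistorted, null the interferer.
g = np.array([1.0, 1.0, 0.0], dtype=complex)

# Stationary-noise PSD matrix (assumed spatially white for simplicity).
Phi = 0.1 * np.eye(M, dtype=complex)

w = lcmv_weights(Phi, C, g)
response = C.conj().T @ w   # beamformer response towards each constrained source
print(np.round(np.abs(response), 6))
```

The first two constraints play the role of the subset that maintains the desired signals, and the third plays the role of the subset that mitigates interference; the remaining degrees of freedom are spent on minimizing the stationary noise power.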

First, we show that the Relative Transfer Functions (RTFs), defined as the ratios between the ATFs relating the speech sources and the microphones, suffice for the construction of the beamformer. Second, the null subspace, comprising all interfering signals, is estimated by using the union of all estimated eigenvectors, relaxing the commonly used requirement that the interference signals' activity periods do not overlap. Finally, the Generalized Eigenvalue Decomposition (GEVD) procedure is applied to the received signals' Power Spectral Density (PSD) matrix and the interference-only PSD matrix (obtained in the second stage) to estimate the RTFs of the desired signals.
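The GEVD step can be sketched for the simplified case of a single desired source, assuming a rank-one model Phi_zz = phi_s h h^H + Phi_vv for one frequency bin. The sketch below is an assumption-laden illustration rather than the thesis's procedure: the function name `rtf_from_gevd`, the synthetic matrices, and the source power are hypothetical, and the generalized eigenproblem is solved by whitening with a Cholesky factor of the interference-only PSD matrix followed by an ordinary Hermitian eigendecomposition.

```python
import numpy as np

def rtf_from_gevd(Phi_zz, Phi_vv, ref=0):
    """Estimate the RTF (relative to microphone `ref`) of a single source from
    the principal generalized eigenvector of the pair (Phi_zz, Phi_vv)."""
    L = np.linalg.cholesky(Phi_vv)            # Phi_vv = L L^H
    L_inv = np.linalg.inv(L)
    A = L_inv @ Phi_zz @ L_inv.conj().T       # whitened PSD matrix
    A = 0.5 * (A + A.conj().T)                # enforce Hermitian symmetry numerically
    _, eigvecs = np.linalg.eigh(A)            # eigenvalues in ascending order
    u = eigvecs[:, -1]                        # principal eigenvector
    h = L @ u                                 # equals Phi_vv f, proportional to the ATF
    return h / h[ref]                         # normalize to the reference microphone

rng = np.random.default_rng(1)
M = 5  # number of microphones (assumed)

# Hypothetical ATF vector of the desired source.
h_true = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Synthetic interference-plus-noise PSD matrix (Hermitian, positive definite).
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_vv = B @ B.conj().T + np.eye(M)

# Received-signal PSD matrix: rank-one desired component plus interference.
Phi_zz = 2.0 * np.outer(h_true, h_true.conj()) + Phi_vv

rtf_est = rtf_from_gevd(Phi_zz, Phi_vv)
rtf_true = h_true / h_true[0]
print(np.max(np.abs(rtf_est - rtf_true)))
```

With exact PSD matrices the estimate matches the true RTF up to numerical precision; with matrices estimated from finite data the result is only approximate, which is the kind of inaccuracy a subsequent residual-noise stage is meant to absorb.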

It is shown that an application of the adaptive Residual Noise Canceller (RNC) to the output of the beamformer enables further reduction of the residual interference signals, caused by inaccuracies in the subspace estimation, and hence increases the robustness of the proposed method.

A comprehensive experimental study, consisting of both simulated and real environments, demonstrates the applicability of the proposed algorithm to the multiple source extraction task. Furthermore, it is shown that the proposed algorithm outperforms the Transfer Function Generalized Sidelobe Canceller (TF-GSC) algorithm in the task of enhancing one desired speech signal contaminated by several interference signals.