Ph.D. Thesis

Ph.D. Student: Talmon Ronen
Subject: Supervised Speech Processing Based on Geometric Analysis
Department: Department of Electrical and Computer Engineering
Supervisors: Prof. Israel Cohen, Prof. Sharon Gannot


This dissertation addresses the theory and applications of geometric and diffusion methods in speech processing. Speech processing is central to many hands-free communication systems. The field has attracted significant research effort for several decades, yet many of its problems remain open. In this thesis, we address transient interference suppression, acoustic source localization, linear system parameterization, multi-channel speech enhancement, and various audio indexing and speech recognition tasks. Using geometric analysis and diffusion geometry methods, we introduce compact representations of speech and audio signals that enable more efficient and more accurate digital speech processing.

Special focus is given to the problem of transient interference suppression. The widespread assumption of stationary noise is a major limitation of traditional speech enhancement algorithms. In particular, it makes them inadequate in environments with transient interference, since a transient is typically a sudden burst of sound and is therefore highly non-stationary. We circumvent this assumption by proposing an algorithm that learns the geometric structure of the transient interference using manifold learning and nonlocal diffusion filtering. We show that capturing the geometric structure of the signals enriches the a priori statistical model and enables good performance.

Another problem addressed in our research is the modeling of natural and artificial systems. This problem plays a key role in signal processing applications and has long attracted considerable research effort. Traditionally, a predefined model is developed for every type of task or system, and the model parameters are then estimated from observations. In this work, we investigate a completely different approach. Without assuming any specific model, we aim to identify the degrees of freedom of the system and its modes of variability. This approach provides a generic, data-driven method for a wide variety of system types. We propose a general algorithm for the parameterization of linear systems using diffusion geometry. The proposed algorithm is based on recent developments in spectral and nonlinear independent component analysis techniques, anisotropic kernels, and classical results from statistical signal processing and Fourier analysis. We claim that a given system can be viewed as a black box controlled by several independent parameters. By recovering these parameters, we reveal the actual degrees of freedom of the system and obtain an intrinsic model of it. These features are extremely useful for system design, control, and calibration.
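
As an illustration of the idea of recovering a hidden controlling parameter from observations alone, the following is a minimal sketch of a standard diffusion-map embedding. It is not the algorithm proposed in the thesis: it uses a plain isotropic Gaussian kernel rather than the anisotropic kernels mentioned above, and the function name diffusion_map, the kernel scale epsilon, and the helix test data are all assumptions made for this example.

```python
import numpy as np


def diffusion_map(X, epsilon, n_components=1):
    """Basic diffusion-map embedding of the rows of X (illustrative sketch).

    Assumption: isotropic Gaussian kernel with scale `epsilon`; the thesis
    itself relies on anisotropic kernels, which are not reproduced here.
    """
    # Pairwise squared Euclidean distances between observations.
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # clip tiny negatives caused by round-off

    # Gaussian affinity kernel and its row sums.
    K = np.exp(-d2 / epsilon)
    d = K.sum(axis=1)

    # S = D^{-1/2} K D^{-1/2} is symmetric and has the same eigenvalues
    # as the row-stochastic Markov matrix P = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]  # sort eigenvalues in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of S.
    psi = eigvecs / np.sqrt(d)[:, None]

    # Drop the trivial constant eigenvector (eigenvalue 1) and scale the
    # remaining eigenvectors by their eigenvalues.
    lam = eigvals[1:n_components + 1]
    return lam, psi[:, 1:n_components + 1] * lam


# Hypothetical "black box": 3-D observations driven by one hidden parameter.
theta = np.linspace(0.0, np.pi, 200)
X = np.column_stack([np.cos(theta), np.sin(theta), theta])

lam, coords = diffusion_map(X, epsilon=0.1, n_components=1)
corr = np.corrcoef(coords[:, 0], theta)[0, 1]
print("leading nontrivial eigenvalue:", lam[0])
print("|correlation| with hidden parameter:", abs(corr))
```

In this toy setting the single diffusion coordinate varies monotonically with the hidden parameter theta up to sign, so the embedding exposes the one degree of freedom that generated the three-dimensional observations.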