Ph.D Thesis


Ph.D Student: Shnitzer Dery Tal
Subject: Operator-Theoretic Approach for Manifold Learning with Application to Multimodal and Temporal Data Analysis
Department: Department of Electrical and Computers Engineering
Supervisor: Associate Prof. Ronen Talmon


Abstract

High-dimensional data analysis typically requires extensive modeling assumptions or a large amount of labeled data. We address two fundamental problems involving high-dimensional data: filtering of temporal data arising from observations of dynamical systems, and analysis of multimodal data. To overcome modeling and data limitations, we approach these problems using geometric modeling tools. We rely on two related concepts: manifold learning and operator-based data analysis. Manifold learning techniques assume that the data lie on some hidden manifold and provide a new low-dimensional representation of the data in a completely data-driven manner. The operator-based data analysis approach linearizes highly complex and nonlinear problems by projecting the data measurements into a linear but infinite-dimensional space of real functions, in which the data are analyzed through the propagation of functions using linear operators.


In this thesis, we present several methods for time-series analysis and for multimodal data analysis based on these two fundamental concepts, in order to overcome the challenges posed by applications involving a limited amount of high-dimensional data. We rely on diffusion maps, a specific manifold learning technique, and propose two time-series filtering methods that recover the intrinsic model of the latent variables in a dynamical system solely from the measurements and make it possible to filter high-dimensional nonlinear time-series without any prior knowledge of the system model. We utilize particular properties of the dynamics of the diffusion maps coordinates, which allow us to construct new representations that linearize even highly nonlinear settings, and propose two nonlinear filtering methods using a linear observer and the Kalman filter. We demonstrate these methods on a music analysis example and on a challenging real-world application of location estimation based on neuronal data. In the latter, we show that, given neuronal recordings, our method recovers new coordinates that are highly related to the true position of the animal in a purely data-driven manner.
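For readers unfamiliar with diffusion maps, the following is a minimal sketch of the standard construction (Gaussian kernel, row normalization, spectral decomposition). It is an illustration under stated assumptions and not the filtering methods proposed in the thesis: the function name `diffusion_map`, its parameters, and the median heuristic for the kernel scale are all choices made here for the example.

```python
import numpy as np

def diffusion_map(X, n_components=2, epsilon=None, t=1):
    """Illustrative diffusion maps embedding of the rows of X.

    All names and defaults are assumptions made for this sketch.
    Returns (embedding, eigenvalues sorted in descending order).
    """
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Kernel scale: median of squared distances (a common heuristic,
    # assumed here; other choices are possible).
    if epsilon is None:
        epsilon = np.median(D2)

    # Gaussian affinity kernel and its row sums (degrees).
    K = np.exp(-D2 / epsilon)
    d = K.sum(axis=1)

    # S = D^{-1/2} K D^{-1/2} is symmetric and shares its eigenvalues
    # with the row-stochastic diffusion operator P = D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    # Right eigenvectors of P are recovered as D^{-1/2} v.
    psi = vecs / np.sqrt(d)[:, None]

    # Skip the trivial constant eigenvector (eigenvalue 1) and scale
    # the remaining coordinates by eigenvalue^t (t diffusion steps).
    idx = slice(1, n_components + 1)
    return psi[:, idx] * vals[idx] ** t, vals

# Example: noisy samples from a circle embedded in 3-D.
rng = np.random.default_rng(0)
theta = np.sort(rng.uniform(0.0, 2.0 * np.pi, 200))
X = np.stack([np.cos(theta), np.sin(theta),
              0.01 * rng.standard_normal(200)], axis=1)
embedding, eigenvalues = diffusion_map(X, n_components=2)
```

The symmetric normalization is used only so that a symmetric eigensolver can be applied; the resulting coordinates are those of the row-stochastic diffusion operator.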


In our work on multimodal data analysis, we propose two new operators based on diffusion operators, which make it possible to isolate, enhance, and attenuate the hidden components of multimodal data in a data-driven manner. We provide theoretical justifications for these operators and analyze their asymptotic behavior. We then extend this setting by taking into account the particular structure of the diffusion operators and define two new operators for revealing similarly and differently expressed common components. These operators provide a novel comprehensive framework for the analysis of common components in multimodal data. We utilize these operators for time-series analysis as well. We extend existing methods and propose a novel manifold-based multiscale temporal analysis scheme, analogous to the wavelet decomposition, which addresses settings in which the manifold underlying the data varies over time. Lastly, we show that these operators can address another challenge and propose a geometry-based feature selection method for two-class data.

We demonstrate the proposed methods on various applications, including remote sensing with hyperspectral and LiDAR images and non-invasive recovery of fetal heart activity from maternal abdominal measurements. Specifically, our approach successfully recovers the fetal heart rate from noisy maternal abdominal measurements, obtaining results that are comparable to the state of the art.