Ph.D Thesis

Ph.D StudentAdler Amir
SubjectApplications of Sparse and Redundant Representations to
Subspace Clustering and Signal Modeling
DepartmentDepartment of Computer Science
Supervisors PROF. Michael Elad
PROF. Yacov Hel-Or


Sparse and Redundant Representations (SRR) provide powerful tools for signal and image processing, often leading to state-of-the-art results in numerous fundamental problems such as signal and image restoration, compression, enhancement, classification and more. The core principle of SRR is to model a given signal as the linear combination of few columns, termed atoms, from a dictionary matrix. The given signal is represented using a sparse vector, where non-zero entries are the coefficients of the respective atoms of the dictionary, and the process of estimating the sparse representation vector is termed sparse coding.

This research thesis presents new applications and extensions of SRR for the solution of three problems: The first is the restoration of missing audio samples in audio streams: speech and music signals are often subject to degraded quality due to consecutive missing samples as a result of errors during recording, transmission or reproduction. In this research we present an SRR-based formulation for several types of degradations as well a sparse coding approach for high-quality recovery of the missing samples.

The second problem dealt with in this thesis is concerned with linear-time subspace clustering for processing very large data sets: subspace clustering is the unsupervised learning problem of clustering data points drawn from a union-of-subspaces. This finds applications in motion and temporal video segmentation, face clustering and more. In our research we present two approaches that obtain linear-time subspace clustering using learning algorithms that model the relations between dictionary atoms to signals in the data set.

The third problem is concerned with simultaneous sparse coding and anomaly detection: this problem arises whenever an entire data set is assumed to conform with an SRR model, whereas the anomaly is caused by an unknown subset of the data vectors - the outliers - which significantly deviate from the SRR model. We propose a convex optimization-based approach for simultaneous recovery of the sparse representations and the outliers' components of the entire data set, and demonstrate this in true applications.