Ph.D Thesis

Ph.D Student: Vainsencher Daniel
Subject: Generalization and Unruly Data
Department: Department of Electrical and Computer Engineering
Supervisor: Prof. Shie Mannor


Feature learning is a subtask of machine learning concerned with exploiting structure inherent in the data to identify features that represent the data well and allow for efficient prediction. Many important theoretical questions about feature learning remain open. This thesis addresses two topics relevant to feature learning. The first is the sample complexity of Dictionary Learning, a direct formulation of feature learning in which the data is assumed to be sparsely representable in an unknown dictionary. The second is the design of general, practical, scalable algorithms that are resistant to a small percentage of outliers in the data and generalize well to unseen data after training on a finite number of samples. We address this second topic in two distinct settings: learning a single model (e.g., dimensionality reduction) and learning multiple models (e.g., vector quantization for bag-of-words representations in vision applications). The novel Regularized Weighting framework and its associated sample complexity analysis techniques apply readily to new robust estimation problems beyond those mentioned above: the framework's algorithms can be wrapped around any minimizer of weighted losses, treated as a black box.