|Ph.D Student||Epstein Baruch|
|Subject||Learning and Prior Knowledge about Structure: Theory and|
|Department||Department of Electrical and Computers Engineering|
|Supervisor||PROF. Ron Meir|
This work explores the interplay between learning and structure-related prior knowledge in two directions.
The first direction considers multi-task learning under the assumption that the tasks share some, but not all, of their features. That is, there exists a representation subspace that is useful for all tasks, alongside subspaces specific to each task. We address both the linear case, in which the representations are linear projections and the method is in essence an extension of PCA, and the non-linear case, for which we propose a deep learning architecture that extracts shared and task-specific representations and can be regarded as an extension of autoencoders.
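To make the linear shared-plus-specific idea concrete, the following is a minimal sketch only, not the algorithm developed in the thesis. It assumes a simple two-stage heuristic: ordinary PCA on the pooled data gives a shared subspace, and PCA on each task's residual gives a task-specific subspace. The function names `top_directions` and `shared_and_specific`, the synthetic data, and the dimensions are all hypothetical.

```python
import numpy as np

def top_directions(X, k):
    """Return the top-k principal directions of X as columns of a (d, k) matrix."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

def shared_and_specific(tasks, k_shared, k_specific):
    """Two-stage heuristic (an assumption for illustration, not the thesis's method)."""
    # Stage 1: shared subspace from the pooled, per-task-centered data.
    pooled = np.vstack([X - X.mean(axis=0) for X in tasks])
    U = top_directions(pooled, k_shared)
    # Stage 2: task-specific subspace from what the shared subspace leaves unexplained.
    specific = []
    for X in tasks:
        Xc = X - X.mean(axis=0)
        residual = Xc - Xc @ U @ U.T
        specific.append(top_directions(residual, k_specific))
    return U, specific

# Synthetic example: axis 0 varies in both tasks, axes 1 and 2 in one task each.
rng = np.random.default_rng(0)
d, n = 10, 500

def make_task(specific_axis):
    X = 0.1 * rng.standard_normal((n, d))                 # small isotropic noise
    X[:, 0] += 3.0 * rng.standard_normal(n)               # shared direction
    X[:, specific_axis] += 2.0 * rng.standard_normal(n)   # task-specific direction
    return X

tasks = [make_task(1), make_task(2)]
U, specific = shared_and_specific(tasks, k_shared=1, k_specific=1)
print("shared direction aligns with axis", int(np.argmax(np.abs(U[:, 0]))))
for t, V in enumerate(specific):
    print("task", t, "specific direction aligns with axis", int(np.argmax(np.abs(V[:, 0]))))
```

Because each task-specific basis is computed from the residual, it is orthogonal to the shared basis by construction. A joint optimization over both subspaces could behave differently from this greedy version.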
The second direction considers prior knowledge about the data distribution and its relationship to the hypothesis space. First, we adapt existing deep learning theory to obtain a generalization bound for autoencoders. We then show that under a suitable clustering assumption, if the data admits a well-performing autoencoder, then this autoencoder can improve any supervised learning algorithm in a semi-supervised fashion.
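The semi-supervised setting described above can be illustrated with a toy pipeline. This is a sketch under strong assumptions and does not reproduce the thesis's construction or its guarantees: PCA stands in for a trained autoencoder (it is the optimal linear one), a nearest-centroid rule stands in for the supervised learner, and the two-cluster synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 20, 2

# Hypothetical data: two clusters lying near a 2-D subspace of R^20.
basis = np.linalg.qr(rng.standard_normal((d, k)))[0]   # (d, k), orthonormal columns
centers = np.array([[5.0, 0.0], [-5.0, 0.0]])

def sample(y):
    """Draw one point per label in y: cluster center plus latent and ambient noise."""
    z = centers[y] + 0.5 * rng.standard_normal((len(y), k))
    return z @ basis.T + 0.1 * rng.standard_normal((len(y), d))

# Many unlabeled points, only four labeled ones.
X_unlab = sample(rng.integers(0, 2, size=1000))
y_lab = np.array([0, 0, 1, 1])
X_lab = sample(y_lab)
y_test = rng.integers(0, 2, size=500)
X_test = sample(y_test)

# "Autoencoder" step: fit a linear encoder on the unlabeled data only.
mu = X_unlab.mean(axis=0)
_, _, Vt = np.linalg.svd(X_unlab - mu, full_matrices=False)
W = Vt[:k].T                                           # (d, k) encoder weights

def encode(X):
    return (X - mu) @ W

# Supervised step: nearest-centroid classifier trained on the labeled codes.
centroids = np.stack([encode(X_lab)[y_lab == c].mean(axis=0) for c in (0, 1)])

def predict(X):
    dists = np.linalg.norm(encode(X)[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

accuracy = float((predict(X_test) == y_test).mean())
print("test accuracy with 4 labels:", accuracy)
```

The unlabeled data determine the representation, and the handful of labels only has to separate clusters inside it, which is the role the clustering assumption plays in the text.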
Finally, we prove a novel PAC-Bayes bound that depends on the Wasserstein distance rather than the usual KL divergence, under the assumption that the generalization is smooth as a function of the hypothesis. Empirical evaluations show that this bound yields non-vacuous generalization guarantees for deep learning on par with or better than existing approaches.
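One way to see why the choice of divergence matters is to compare the two quantities on a simple prior and posterior pair. The sketch below does not implement the thesis's bound. It assumes diagonal Gaussian distributions over the hypothesis parameters and uses the 2-Wasserstein distance, whose order may differ from the one used in the thesis. The function names are hypothetical; the closed forms are the standard ones for diagonal Gaussians.

```python
import numpy as np

def kl_diag_gauss(m_q, s_q, m_p, s_p):
    """KL(Q || P) for Q = N(m_q, diag(s_q^2)) and P = N(m_p, diag(s_p^2))."""
    var_q, var_p = s_q ** 2, s_p ** 2
    return 0.5 * np.sum(var_q / var_p + (m_p - m_q) ** 2 / var_p
                        - 1.0 + np.log(var_p / var_q))

def w2_diag_gauss(m_q, s_q, m_p, s_p):
    """2-Wasserstein distance between the same two diagonal Gaussians."""
    return np.sqrt(np.sum((m_q - m_p) ** 2) + np.sum((s_q - s_p) ** 2))

# Prior: standard normal in 5 dimensions. Posterior: shifted mean, shrinking std.
dim = 5
m_p, s_p = np.zeros(dim), np.ones(dim)
m_q = 0.5 * np.ones(dim)

stds = [1.0, 1e-2, 1e-4]
kls, w2s = [], []
for s in stds:
    s_q = s * np.ones(dim)
    kls.append(kl_diag_gauss(m_q, s_q, m_p, s_p))
    w2s.append(w2_diag_gauss(m_q, s_q, m_p, s_p))
    print(f"posterior std {s:g}: KL = {kls[-1]:.3f}, W2 = {w2s[-1]:.3f}")
```

As the posterior concentrates toward a single hypothesis, the KL term grows without bound while the Wasserstein distance stays finite. This is one commonly cited motivation for transport-based complexity terms when the learned predictor is nearly deterministic.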