|M.Sc Student||Koren Tomer|
|Subject||Learning Linear Support Vector Machines in Sublinear Time|
|Department||Department of Computer Science||Supervisor||Professor Elad Hazan|
|Full Thesis text|
In recent years, stochastic approximation approaches such as Stochastic Gradient Descent (SGD) and Stochastic Dual Averaging have become the optimization method of choice for many learning problems, including linear Support Vector Machines (SVMs). This is not surprising, since such methods yield optimal generalization guarantees with only a single pass over the data. They in a sense have an unbeatable runtime: the runtime to get to a desired generalization goal is the same as the size of the data set required to do so.
In this work we show, for the first time, how to beat this unbeatable runtime, and present a method that in a certain relevant regime, learns linear SVMs in runtime less than the size of the minimal training set size required for learning. The key here, is that unlike traditional methods that consider an entire training instance vector at each iteration, our method accesses single features of training vectors. Our method is based on a stochastic primal-dual approach, where the primal step is akin to an importance-weighted SGD, and the dual step is a stochastic update on the importance weights.