Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Barzilai Aviad
Subject: Convex Multi-Task Learning by Clustering
Department: Department of Electrical Engineering
Supervisor: Professor Yacov Crammer


Abstract

We consider the problem of multi-task learning in which tasks belong to hidden clusters. We start from a set of known tasks, which we believe can be clustered into super-tasks. Such is the case, for example, when classifying spoken emotions or written sentiment.

The problem of classifying emotions can be divided into tasks based on the speaker or the spoken text. While we can learn a separate classifier for each task, we would like to benefit from the information shared across all tasks. One can assume that when learning to classify emotions at a per-speaker level, hidden attributes such as gender or age may form hidden clusters within the data, and that jointly learning these clusters can improve learning. In the case of sentiment analysis, hidden clusters may exist based on the different levels of emotion that writers express in the text.

The problem is formulated as jointly clustering tasks and building individual predictors based on the clusters. The problem can also be viewed as building an individual predictor for each task, yet restricting all such predictors to be based on a shared set of possible atoms. From the initial non-convex formulation we derive a novel convex optimization problem. We derive our approach for support vector machines and generalize it to other loss functions.

We propose a scalable optimization algorithm for finding the optimal solution. Our method uses gradient projection, combining two steps: classifying the data based on the clusters, and improving the clusters based on the classifiers.
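As a rough illustration of what gradient projection means in general, the sketch below minimizes a simple quadratic over the probability simplex: each iteration takes a gradient step and then projects back onto the feasible set. This is not the thesis's algorithm. The simplex constraint, the toy objective, and the names `project_to_simplex` and `gradient_projection` are all assumptions made for the example.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            # Keep the threshold from the largest k that satisfies the condition.
            theta = t
    return [max(x - theta, 0.0) for x in v]


def gradient_projection(grad, w0, step=0.1, iters=200):
    """Generic gradient projection: gradient step, then project onto the simplex."""
    w = project_to_simplex(w0)
    for _ in range(iters):
        g = grad(w)
        w = project_to_simplex([wi - step * gi for wi, gi in zip(w, g)])
    return w


# Toy objective (hypothetical): f(w) = 0.5 * ||w - target||^2, so grad f(w) = w - target.
target = [0.7, 0.5, -0.2]


def toy_grad(w):
    return [wi - ti for wi, ti in zip(w, target)]


weights = gradient_projection(toy_grad, [1 / 3, 1 / 3, 1 / 3])
print(weights)
```

For this toy objective the constrained minimizer is the projection of `target` onto the simplex, so the iterates should approach (0.6, 0.4, 0). In a clustering setting, the projected variable could play the role of task-to-cluster weights, but that mapping is only an assumption here.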

Experiments show that our approach both achieves state-of-the-art performance and successfully identifies latent structures within the data. We analyze and illustrate these hidden structures, showing the power of the proposed approach in uncovering meaningful connections between tasks.