M.Sc Thesis

M.Sc StudentManoorkar Krishna
SubjectMutual Information and Learning
DepartmentDepartment of Mathematics
Supervisor PROF. Amir Yehudayoff
Full Thesis textFull thesis text - English Version


Data privacy is one of the major concerns in machine learning and data analysis. The amount of information revealed by an algorithm can be quantified via the mutual information I(S; A(S)) between the input S and output A(S) of the algorithm. In this work, we show upper and lower bounds on the information used for solving the interior point problem and learning thresholds.

 We describe a recursive algorithm (StabRec) that solves the interior point problem using a small amount of information. The algorithm is also deterministic and computationally efficient. The analysis of the information cost uses a new problem we define called the m-intertwiner problem. The algorithm StabRec also yields a deterministic empirical learner for thresholds.

We use a Ramsey-theoretic approach for obtaining a lower bound on the information com-plexity of the interior point problem and of learning thresholds. The argument identifies non-trivial structures in any learning algorithm and exploits them to force the algorithm to reveal anon-trivial amount of information.

We also address the general connection between information complexity and other basic notions. Relying on prior works, we show that a hypothesis class has a finite Littlestone dimension if and only if its information complexity is finite. These results show an equivalence between the notions of online learning, approximate differential privacy, and learning using finite information.