Ph.D Thesis

Ph.D StudentBegleiter Ron
SubjectTheory and Practice of Active Learning
DepartmentDepartment of Computer Science
Supervisors ASSOCIATE PROF. Nir Ailon
PROF. Ran El-Yaniv
Full Thesis textFull thesis text - English Version


The active-learning model is concerned with ways to expedite the learning phase through interaction with the labeling process.

In a pool-based active-learning setting, the learning algorithm receives a set of unlabeled examples, as well as access to an oracle that contains the full labeling information on that particular set. The learner's goal is to produce a nearly optimal hypothesis, while requiring the minimum possible interactions with the oracle.

In this thesis, we present a novel smoothness condition over empirical risk error estimators and show its usefulness for active pool-based learning.
Instead of estimating the risk directly, we target regrets relative to pivot hypotheses. We show that such smooth relative regret estimators yield an active-learning algorithm that converges to a competitive solution at a fast rate.

We show three specific constructions of such smooth estimators. The first is obtained
when the only assumptions made are bounds on the disagreement coefficient and the VC dimension.

This leads to an active-learning algorithm that almost matches the best-known algorithms that use the same assumptions.

On the other hand, we present two problems of vast interest, for which a direct analysis  of the relative regret function produces state-of-the-art learning strategies.  The two problems we study are concerned  with learning relations over a ground set, where one problem deals with order relations and the other with equivalence relations (with a bounded number of equivalence classes).

Our smoothness condition, we argue, influences sampling methods that should be carefully biased in a way that incorporates exploration of all hypothesis space, along
with exploitation of a current candidate solution. Following this idea, we present a heuristic that enhances any active-learning algorithm with systematic explorations.
We show that this heuristic significantly improves leading active-learning heuristics within a graph-based transductive setting.