Ph.D Thesis

| Ph.D Student | Begleiter Ron |
|---|---|
| Subject | Theory and Practice of Active Learning |
| Department | Department of Computer Science |
| Supervisors | Associate Prof. Nir Ailon, Prof. Ran El-Yaniv |
| Full Thesis text | |

The active-learning model is concerned with ways to expedite the learning phase through interaction with the labeling process.

In the pool-based active-learning setting, the learning algorithm receives a set of unlabeled examples, together with access to an oracle that holds the full labeling information for that set. The learner's goal is to produce a nearly optimal hypothesis while making as few queries to the oracle as possible.
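To make the setting concrete, here is a minimal sketch of pool-based active learning for one-dimensional threshold classifiers. The query rule (bisecting the interval on which hypotheses consistent with the labels seen so far still disagree) is our own illustration of the pool/oracle interaction, not an algorithm from the thesis:

```python
def active_learn(pool, oracle, budget):
    """Pool-based active learning for 1-D threshold classifiers.

    A threshold hypothesis predicts 1 iff x >= t.  All hypotheses that are
    consistent with the labels observed so far disagree only on the open
    interval (lo, hi); each query bisects that disagreement region, so the
    number of oracle calls grows logarithmically in the pool size.
    """
    lo, hi = float("-inf"), float("inf")
    unlabeled = sorted(pool)
    for _ in range(budget):
        region = [x for x in unlabeled if lo < x < hi]  # disagreement region
        if not region:
            break  # all remaining points are classified identically
        x = region[len(region) // 2]  # bisect the region
        unlabeled.remove(x)
        if oracle(x):  # one oracle call per query
            hi = x
        else:
            lo = x
    return hi  # learned hypothesis: predict 1 iff x >= hi
```

On a pool of 100 points with a true threshold at 37, this rule recovers the exact threshold in about seven queries, far fewer than labeling the whole pool.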

In this thesis, we present a novel smoothness condition on empirical risk estimators and show its usefulness for pool-based active learning.

Instead of estimating the risk directly, we target regrets relative to pivot hypotheses. We show that such smooth relative-regret estimators yield an active-learning algorithm that converges to a competitive solution at a fast rate.

We show three specific constructions of such smooth estimators. The first is obtained when the only assumptions made are bounds on the disagreement coefficient and the VC dimension.

This leads to an active-learning algorithm that almost matches the best-known algorithms that use the same assumptions.

In contrast, we present two problems of broad interest for which a direct analysis of the relative regret function produces state-of-the-art learning strategies. Both problems concern learning relations over a ground set: one deals with order relations and the other with equivalence relations (with a bounded number of equivalence classes).

Our smoothness condition, we argue, calls for sampling methods that are carefully biased to combine exploration of the entire hypothesis space with exploitation of the current candidate solution. Following this idea, we present a heuristic that enhances any active-learning algorithm with systematic exploration.
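One simple way to graft systematic exploration onto an existing query-selection rule is an epsilon-greedy wrapper. This is an illustrative sketch of the exploration/exploitation trade-off only, not the specific heuristic developed in the thesis; the function names and the `eps` parameter are our own:

```python
import random

def explore_wrap(select, eps=0.1, seed=0):
    """Wrap a query-selection rule `select(unlabeled, model)` so that, with
    probability `eps`, a uniformly random unlabeled example is queried instead
    (exploration of the hypothesis space); otherwise the base rule is used
    (exploitation of the current candidate solution)."""
    rng = random.Random(seed)

    def wrapped(unlabeled, model):
        if rng.random() < eps:
            return unlabeled[rng.randrange(len(unlabeled))]  # explore
        return select(unlabeled, model)  # exploit

    return wrapped
```

Because the wrapper only touches the query-selection step, it can be layered on top of any pool-based active learner without changing the rest of the training loop.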

We show that this heuristic significantly improves leading active-learning heuristics in a graph-based transductive setting.