Ph.D Thesis | |

Ph.D Student | Glazer Assaf |
---|---|

Subject | Learning Methods for Modeling High-Dimensional Distributions |

Department | Department of Computer Science |

Supervisors | PROF. Shaul Markovitch |

PROF. Michael Lindenbaum | |

Full Thesis text |

A
reliable density estimation is hard to obtain in problems of high-dimensional
data, especially when the sample used for estimation is small. As a result,
various studies have tried to find approximate solutions to this problem by
reducing it to a less general, and hopefully solvable, form. One prominent
approach in this direction is estimating the *minimum-volume set (MV-set)*
of a distribution at level instead
of its density function. (An MV-set at level is
a subset of the input space with probability mass of at least that
has the smallest volume.) However, even a perfectly estimated MV-set reveals
only partial information about the distribution. Can we define a problem whose
solution is more informative than MV-set estimation, yet is easier to solve
than density estimation?

In
this dissertation we introduce novel methods that do just that. Our methods,
which can also be regarded as generalized quantile functions, are based on the
idea of estimating (or approximating) hierarchical MV-sets for distribution
representation in high-dimensional data. In most of our proposed methods, we
use the *one-class SVM (OCSVM)* algorithm to estimate the hierarchical
MV-sets. Note that a straightforward approach of training a set of *OCSVMs*,
one for each MV-set, would not necessarily satisfy the hierarchy requirement.
We thus introduce novel variants of the *OCSVM* algorithm that find all
estimated MV-sets such that the hierarchy constraint is fulfilled.

We provide theoretical and empirical justifications for our methods in the general context of estimating hierarchical MV-sets. In addition, we apply our methods and show their superiority over competitors in various domains including concept drift detection, topic change detection in document streams, background subtraction in image sequences, and hierarchical clustering.