M.Sc Thesis | |

M.Sc Student | Noy Asaf |
---|---|

Subject | Robust Forward Algorithms via PAC-Bayes and Laplace Distributions |

Department | Department of Electrical and Computer Engineering |

Supervisor | ASSOCIATE PROF. Yacov Crammer |

Laplace random variables are commonly used as noise models in many fields such as signal processing, communication and control. Since the Laplace distribution decays exponentially from its mean, it is considered heavy-tailed compared to the Gaussian distribution, and used to model systems with anomalies such as extreme noise levels or outliers contamination. Systems trained to deal with these anomalies tend to be robust to such noise.

Robust statistics is aimed for developing statistical methods that are not unduly affected by outliers. One of its key elements is the Huber loss function, extensively used in various applications including robust filtering of Laplace-noise, as it allows the effect of outliers to be reduced while treating non-outliers in a relatively standard way. PAC-Bayes bounds, introduced by McAllester (1999), are a specific family of theoretical bounds that relates empirical performance of algorithms to their expected one. Yet, there is still a gap between the statements emerging from the PAC-Bayes theory and algorithms that are actually analyzed by it. Specifically, to the best of our knowledge, no robust algorithms were analyzed nor derived via the PAC-Bayes framework.

In our work we use PAC-Bayes bounds based on Laplace-like distributions for developing new learning algorithms that possess appealing qualities, the foremost is outliers robustness. We investigate the key features of the Laplace-like family and show a new connection between Laplace-noise and the Huber loss, paving the way for a better understanding of the relation between noise and robustness. The contribution is twofold: it closes the gap between boosting algorithms and PAC-Bayes theory; and, develops a new Boosting-like algorithm which emerges from theory and naturally relates to Huber loss. This is as opposed to most Boosting algorithms that are highly susceptible to outliers. In particular, the BaLaBoost algorithm, which emerges from the LogLoss, outperforms the known l1-LogBoost (2010) over noisy data.

Additionally, we manage to analytically calculate the bound for the Laplace-Like family of distributions, and prove that for linearly-separable training data, after a certain change of variables, the problem is in fact convex. Experiments with synthetic and real-world data sets show that our algorithms outperform AdaBoost, and other Boosting algorithms known to be more robust: LogitBoost, GentleBoost, l1-LogBoost and Robust- Boost, in a wide range of input noise.