M.Sc Thesis

M.Sc Student: Yazdi Ram
Subject: Perturbation Based Learning for Structured NLP Tasks with Application to Dependency Parsing
Department: Department of Industrial Engineering and Management
Supervisors: Associate Prof. Roi Reichart


In many natural language processing (NLP) tasks we are required to produce a list of solutions instead of a single one. There are several reasons for this. First, it can be a defining property of the task: in extractive summarization, for example, good summaries consist of a diverse list of informative sentences extracted from the text. In other cases, the members of the solution list are exploited to solve an end-goal task. For example, dependency forests have been used to improve machine translation and sentiment analysis.

In yet other cases, the best solution of the model is inaccurate due to limited expressive power or to non-exact parameter estimation. Here, generating a list of solutions can be useful as a first step towards learning a final high-quality solution. One way to generate a list of solutions is to sample candidate solutions from the model's solution space, on the reasoning that effective exploration of this space should yield high-quality and diverse solutions. Unfortunately, sampling is often computationally hard, and many works hence back off to sub-optimal strategies such as extracting the best-scoring solutions of the model, which are not as diverse as sampled solutions. In this thesis, we propose a perturbation-based approach in which sampling from a probabilistic model is computationally efficient. We present a learning algorithm for the variance of the perturbations, and empirically demonstrate its importance.
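To make the general idea of perturbation-based sampling concrete, the following is a minimal sketch and not the algorithm developed in the thesis. It assumes a toy tagging model in which each token has an independent score per label, adds Gaussian noise with a fixed standard deviation `sigma` to every score, and re-solves the argmax for each noisy copy. The function names, the toy scores, and the fixed `sigma` are all illustrative assumptions; the thesis learns the perturbation variance and targets dependency parsing.

```python
import random


def argmax_tags(scores):
    """Return the index of the highest-scoring label at each position."""
    return tuple(max(range(len(row)), key=row.__getitem__) for row in scores)


def perturbed_solutions(scores, sigma, k, seed=0):
    """Draw k solutions by adding Gaussian noise to every score and re-solving.

    sigma is a fixed, hand-chosen standard deviation (an assumption of this
    sketch); with sigma = 0 every draw equals the unperturbed argmax.
    """
    rng = random.Random(seed)
    solutions = []
    for _ in range(k):
        noisy = [[s + rng.gauss(0.0, sigma) for s in row] for row in scores]
        solutions.append(argmax_tags(noisy))
    return solutions


# Hypothetical scores: 3 tokens, 2 candidate labels each.
scores = [[2.0, 1.5], [0.2, 0.3], [1.0, 3.0]]
map_solution = argmax_tags(scores)            # best solution of the unperturbed model
samples = perturbed_solutions(scores, sigma=1.0, k=20)
distinct = sorted(set(samples))               # the resulting solution list
```

Each draw costs one argmax computation, so the sampler is as cheap as the underlying solver. A larger `sigma` spreads the draws further from the unperturbed solution, which is one way to see why the choice of perturbation variance matters.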

Moreover, while finding the argmax in our model is intractable, we propose an efficient and effective approximation. We apply our framework to cross-lingual dependency parsing across 72 corpora from 42 languages, to lightly supervised dependency parsing across 13 corpora from 12 languages, and to several part-of-speech (POS) tagging setups.

In all of these setups we demonstrate strong results in terms of both the quality of the entire solution list and the quality of the final solution distilled from it.