Technion - Israel Institute of Technology - Graduate School
Ph.D. Thesis

Ph.D. Student: Yftah Ziser
Subject: Domain Adaptation for Natural Language Processing - a Neural Network Based Approach
Department: Department of Industrial Engineering and Management
Supervisor: Professor Roi Reichart
Full Thesis Text: English Version


Abstract

The objective of this research is to investigate and develop linguistically informed domain adaptation algorithms for natural language processing (NLP) applications. Domain adaptation is a field associated with machine learning and transfer learning. This scenario arises when we aim to learn, from a source data distribution, a well-performing model for a different (but related) target data distribution. The main line of work in domain adaptation is representation learning, i.e., learning a shared low-dimensional representation for both the source and target domains.

In Ziser and Reichart (2017b) we introduced a neural network model that marries together ideas from two prominent strands of research on domain adaptation through representation learning: structural correspondence learning (SCL) and autoencoder neural networks (NNs). Our model is a three-layer NN that learns to encode the non-pivot features of an input example into a low-dimensional representation, so that the existence of pivot features (features that are prominent in both domains and convey useful information for the NLP task) in the example can be decoded from that representation. The low-dimensional representation is then employed in a learning algorithm for the NLP task.
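To make this encode-then-decode idea concrete, the following is a minimal sketch assuming a PyTorch implementation; the dimensions, names, and toy data are illustrative and not taken from the thesis.

import torch
import torch.nn as nn

# Encode the non-pivot features of an example into a low-dimensional code,
# and decode from that code which pivot features the example contains.
class PivotAutoencoder(nn.Module):
    def __init__(self, n_non_pivots, n_pivots, dim=100):
        super().__init__()
        self.encoder = nn.Linear(n_non_pivots, dim)  # non-pivots -> code
        self.decoder = nn.Linear(dim, n_pivots)      # code -> pivot logits

    def forward(self, non_pivot_feats):
        code = torch.sigmoid(self.encoder(non_pivot_feats))
        return code, self.decoder(code)

# Training objective: multi-label prediction of the pivots in each example.
model = PivotAutoencoder(n_non_pivots=10000, n_pivots=500)
loss_fn = nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(model.parameters())

x = torch.randint(0, 2, (32, 10000)).float()  # toy non-pivot indicators
y = torch.randint(0, 2, (32, 500)).float()    # toy pivot indicators
code, logits = model(x)
loss = loss_fn(logits, y)
loss.backward()
opt.step()
# `code` is the low-dimensional representation that is later fed, together
# with the original features, to the task classifier.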

In Ziser and Reichart (2018b) we introduced a more linguistically aware approach, the Pivot Based Language Model (PBLM), which takes into account word order and context, both crucial for natural language tasks. Our model processes the information in the text with a sequential NN (LSTM), and its output consists of a context-dependent representation vector for every input word. The model operates very similarly to LSTM language models (LSTM-LMs). The fundamental difference is that while for every input word an LSTM-LM outputs a hidden vector and a prediction of the next word, PBLM outputs a hidden vector and a prediction of the next word if that word is a pivot feature, or else a generic NONE tag.
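A minimal sketch of this modified language-modeling objective, again assuming PyTorch; the vocabulary sizes and the toy pivot_index mapping below are hypothetical.

import torch
import torch.nn as nn

# An LSTM reads the text; at every step it emits a context-dependent hidden
# vector and predicts the next word if that word is a pivot, else NONE.
class PBLM(nn.Module):
    def __init__(self, vocab_size, n_pivots, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, n_pivots + 1)  # class 0 is NONE

    def forward(self, tokens):
        hidden, _ = self.lstm(self.embed(tokens))
        return hidden, self.out(hidden)

def to_pblm_targets(next_tokens, pivot_index):
    # Map each next word to its pivot class, or 0 (NONE) if not a pivot.
    return torch.tensor([[pivot_index.get(int(t), 0) for t in seq]
                         for seq in next_tokens])

model = PBLM(vocab_size=30000, n_pivots=500)
loss_fn = nn.CrossEntropyLoss()
tokens = torch.randint(0, 30000, (8, 20))  # toy batch of token ids
pivot_index = {17: 1, 42: 2}               # toy word-id -> pivot-class map
hidden, logits = model(tokens[:, :-1])
targets = to_pblm_targets(tokens[:, 1:], pivot_index)
loss = loss_fn(logits.reshape(-1, 501), targets.reshape(-1))
loss.backward()
# `hidden` provides the context-dependent representation of every input word.
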
In Ziser and Reichart (2018a) we expand our work to cross-lingual domain adaptation, i.e., the task of training a model on labeled data from one (language, domain) pair so that it can be effectively applied to another (language, domain) pair. Here we need to overcome both the language gap and the domain gap. We close both gaps using an alternative definition of the pivot features and bilingual word embeddings. In addition, we introduce a solution for truly resource-poor languages, in which even unlabeled data is hard to obtain, and present this setup as the "lazy" setup.
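One hedged sketch of how bilingual word embeddings can help bridge the language gap in pivot selection: keep only source-language candidates whose nearest target-language neighbour in the shared embedding space is well attested in the target-domain data. This illustrates the idea only; it is not the exact selection procedure of Ziser and Reichart (2018a), and all names (src_emb, tgt_counts, min_count) are hypothetical.

import numpy as np

def cross_lingual_pivots(src_cands, src_emb, tgt_emb, tgt_vocab,
                         tgt_counts, min_count=10):
    # src_emb: dict word -> vector; tgt_emb: (V, d) matrix in the same
    # bilingual space; tgt_vocab: list of V target-language words.
    pivots = []
    tgt_norms = np.linalg.norm(tgt_emb, axis=1)
    for word in src_cands:
        v = src_emb[word] / np.linalg.norm(src_emb[word])
        sims = tgt_emb @ v / tgt_norms          # cosine similarities
        neighbour = tgt_vocab[int(np.argmax(sims))]
        if tgt_counts.get(neighbour, 0) >= min_count:
            pivots.append((word, neighbour))
    return pivots
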
While PBLM achieves state-of-the-art results, this approach is still challenged by the large pivot detection problem that must be solved, and by the inherent instability of LSTMs. In Ziser and Reichart (2019) we propose a Task Refinement Learning (TRL) approach in order to solve these problems. Our algorithms iteratively train the PBLM model, gradually increasing the information exposed about each pivot. TRL-PBLM achieves state-of-the-art accuracy in six domain adaptation setups for sentiment classification. Moreover, it is much more stable than plain PBLM across model configurations, making the model much better suited for practical use.
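A minimal sketch of the TRL idea of gradually exposing pivot information. The staging below (binary pivot detection, then coarse pivot clusters, then exact pivot identity) is one plausible instantiation for illustration, not necessarily the exact schedule of the paper.

# Each stage exposes more information about the pivots than the previous one.
# The model trained at one stage initializes the model at the next.
def trl_schedule(pivot_ids, cluster_of):
    stage1 = {p: 1 for p in pivot_ids}              # is it a pivot at all?
    stage2 = {p: cluster_of[p] for p in pivot_ids}  # cluster ids assumed >= 1
    stage3 = {p: i + 1 for i, p in enumerate(sorted(pivot_ids))}  # exact pivot
    return [stage1, stage2, stage3]

# Each stage's mapping plays the role of `pivot_index` in the PBLM sketch
# above, with the output layer resized to that stage's number of classes.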