Technion - Israel Institute of Technology, Graduate School
M.Sc. Thesis
M.Sc. Student: Feldman Yair
Subject: Multi-Hop Paragraph Retrieval for Open-Domain Question Answering
Department: Department of Computer Science
Supervisor: Professor Ran El-Yaniv


Abstract

Question Answering (QA) is one of the core tasks in natural language understanding. This task requires the ability to process and understand natural language questions and documents, as well as to extract the answers to these questions. There are several settings for the QA task, which provide varying levels of difficulty. In the simplest setting, a question is paired with a context that is guaranteed to contain sufficient evidence to answer the question. A more difficult setting is multi-hop QA, in which multiple hops of reasoning are required to derive the correct answer from the given context. Another interesting setting is open-domain QA, in which a question is given without an accompanying context, and the relevant context must instead be retrieved from a large knowledge source, e.g., Wikipedia.

In this work, we are concerned with the task of multi-hop open-domain QA. This task is particularly challenging since it requires performing textual reasoning and efficient search simultaneously. We present a method for retrieving multiple supporting paragraphs, nested within a large knowledge base, which contain the necessary evidence to answer a given question. Our method iteratively retrieves supporting paragraphs by forming a joint vector representation of both a question and a paragraph. The retrieval is performed by considering contextualized sentence-level representations of the paragraphs in the knowledge source. Our method achieves state-of-the-art performance over two well-known datasets, SQuAD-Open and HotpotQA, which serve as our single- and multi-hop open-domain QA benchmarks, respectively.
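
The iterative retrieval loop described above can be illustrated with a minimal sketch. It is not the thesis implementation: the vectors are hand-written toy values rather than the output of a trained encoder, and the names `paragraph_score`, `reformulate`, and `iterative_retrieve` are hypothetical. In particular, `reformulate` simply adds the mean sentence vector of the retrieved paragraph to the query, as an assumed stand-in for the learned joint question-paragraph representation. Paragraphs are scored by their best-matching sentence vector, reflecting the sentence-level retrieval mentioned in the abstract.

```python
from typing import Dict, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def paragraph_score(query: Vector, sentences: List[Vector]) -> float:
    """Score a paragraph by its best-matching sentence vector."""
    return max(dot(query, s) for s in sentences)


def reformulate(query: Vector, sentences: List[Vector]) -> Vector:
    """Assumed stand-in for a learned joint representation:
    add the mean sentence vector of the paragraph to the query."""
    dim = len(query)
    mean = [sum(s[i] for s in sentences) / len(sentences) for i in range(dim)]
    return [q + m for q, m in zip(query, mean)]


def iterative_retrieve(
    question: Vector,
    index: Dict[str, List[Vector]],
    hops: int = 2,
    k: int = 1,
) -> List[str]:
    """Retrieve up to k new paragraphs per query per hop, then
    reformulate each query with the paragraph it retrieved."""
    queries = [question]
    retrieved: List[str] = []
    for _ in range(hops):
        next_queries = []
        for q in queries:
            # Rank all paragraphs that have not been retrieved yet.
            candidates = [
                (pid, paragraph_score(q, sents))
                for pid, sents in index.items()
                if pid not in retrieved
            ]
            candidates.sort(key=lambda c: c[1], reverse=True)
            for pid, _score in candidates[:k]:
                retrieved.append(pid)
                next_queries.append(reformulate(q, index[pid]))
        queries = next_queries
    return retrieved


# Toy index: paragraph id -> sentence vectors (hand-written, 3 dimensions).
# "A" matches the question directly and mentions a bridge concept (dim 1);
# "B" is reachable only through that bridge; "C" is a distractor.
toy_index = {
    "A": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "B": [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    "C": [[0.0, 0.0, 1.0]],
}
toy_question = [1.0, 0.0, 0.0]

print(iterative_retrieve(toy_question, toy_index))  # ['A', 'B']
```

In this toy setting, the original question has zero similarity to paragraph "B", so a single-hop search cannot rank it above the distractor. After the first hop, the reformulated query carries information from "A", which makes "B" the best remaining match.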