Technion - Israel Institute of Technology
Graduate School
M.Sc. Thesis
M.Sc. Student: Katz Shay
Subject: List Selection for Fusion
Department: Department of Industrial Engineering and Management
Supervisor: Professor Oren Kurland
Full thesis text: English version


Abstract

 
Fusion is the process of merging document lists that were retrieved from the same corpus in response to a query. Fusion-based retrieval is known to be an effective way to improve retrieval performance. In this work, we address the following problem: given a set of lists retrieved in response to a query, select a subset of these lists to be fused so as to improve retrieval performance. We first present a probabilistic approach to the problem that gives rise to a few basic methods. We then integrate these methods into a framework that utilizes information about retrieval performance for previous queries. The main idea behind our methods is group-based list selection: unlike other list selection methods, which consider each list individually, our methods take into account all possible combinations of the given lists and attempt to select the combination that works best when fused.
Empirical evaluation shows that fusing a subset of lists selected by our methods outperforms (i) the fusion of all lists; and, in many cases, (ii) the fusion of the subset of lists whose corresponding retrieval methods post the best average retrieval performance over a training set of queries. Additionally, our work shows a connection between the effectiveness of fusion and the level of dependence between the retrieval methods used to produce the lists.
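
The group-based selection idea can be illustrated with a minimal sketch. It is not the thesis's probabilistic method. It assumes CombSUM with min-max normalization as the fusion function, and it scores each candidate subset by average precision against known relevance judgments, which stands in for the performance estimates the thesis derives. All names (`comb_sum`, `select_subset`, the toy lists `A`, `B`, `C`) are hypothetical.

```python
from itertools import combinations


def min_max_normalize(run):
    """Scale the scores of one retrieved list (doc -> score) into [0, 1]."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {doc: 1.0 for doc in run}
    return {doc: (s - lo) / (hi - lo) for doc, s in run.items()}


def comb_sum(runs):
    """Fuse lists with CombSUM (an assumed fusion method):
    sum each document's normalized scores across the lists."""
    fused = {}
    for run in runs:
        for doc, s in min_max_normalize(run).items():
            fused[doc] = fused.get(doc, 0.0) + s
    # Rank by fused score, descending; break ties by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


def average_precision(ranking, relevant):
    """Average precision of a ranking given the set of relevant documents."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def select_subset(runs, utility):
    """Evaluate the fusion of every non-empty subset of lists and
    return the names and score of the best-scoring subset."""
    best_names, best_score = None, float("-inf")
    names = sorted(runs)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            score = utility(comb_sum([runs[n] for n in subset]))
            if score > best_score:  # strict: ties keep the smaller subset
                best_names, best_score = subset, score
    return best_names, best_score


# Toy data (hypothetical): three retrieved lists and relevance judgments.
runs = {
    "A": {"d1": 3.0, "d3": 2.0, "d2": 1.0},
    "B": {"d2": 3.0, "d5": 2.0, "d1": 1.0},
    "C": {"d5": 3.0, "d6": 2.0, "d3": 1.0},
}
relevant = {"d1", "d2"}

best, score = select_subset(runs, lambda r: average_precision(r, relevant))
all_fused_ap = average_precision(comb_sum(list(runs.values())), relevant)
print(best, score, all_fused_ap)
```

On this toy data the fusion of lists A and B scores higher than either list alone and higher than the fusion of all three lists. Exhaustive enumeration evaluates 2^n - 1 subsets for n lists, so a sketch like this is practical only when the number of lists is small.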