טכניון מכון טכנולוגי לישראל
הטכניון מכון טכנולוגי לישראל - בית הספר ללימודי מוסמכים  
M.Sc Thesis
M.Sc StudentKraus Naama
SubjectContext-Aware Query Suggestion
DepartmentDepartment of Computer Science
Supervisors Mr. Ziv Bar-Yossef
Professor Shaul Markovitch
Full Thesis textFull thesis text - English Version


Abstract

Query auto completion is known to provide poor predictions of the user's query when her input prefix is very short (e.g., one or two characters). In this study we show that context, such as the user's recent queries, can be used to improve the prediction quality considerably even for such short prefixes. We propose a context-sensitive query auto completion algorithm, NearestCompletion, which outputs the completions of the user's input that are most similar to the context queries. To measure similarity, we represent queries and contexts as high-dimensional term-weighted vectors and resort to cosine similarity.  The mapping from queries to vectors is done through a new query expansion technique that we introduce, which expands a query by traversing the query recommendation tree rooted at the query.


In order to evaluate our approach, we performed extensive experimentation over the public AOL query log.  We demonstrate that when the recent user's queries are relevant to the current query she is typing, then after typing a single character, NearestCompletion's Mean Reciprocal Rank (MRR) is 48% higher relative to the MRR of the standard MostPopularCompletion algorithm on average. When the context is irrelevant, however, NearestCompletion's MRR is essentially zero.  To mitigate this problem, we propose HybridCompletion, which is a hybrid of NearestCompletion with MostPopularCompletion. HybridCompletion is shown to dominate both NearestCompletion and MostPopularCompletion, achieving a total improvement of 31.5% in MRR relative to MostPopularCompletion on average.