|M.Sc Student||Orenbach Meni|
|Subject||Evaluating User Similarity Search in OnLine Social|
|Department||Department of Electrical Engineering||Supervisor||Professor Idit Keidar|
User similarity search is a useful feature provided by online social networks for recommending friends and content. Today, social networks have grown to serve hundreds of millions of users. Scalability can be achieved by partitioning an inverted index and performing user similarity search in a decentralized manner. In this work we evaluate different techniques for performing decentralized user similarity search via a partitioned inverted index.
We have implemented an infrastructure that simulates various techniques for partitioning an inverted index. Within this infrastructure we have implemented existing techniques and also introduced new ones, based on observations and on previous studies that analyzed the properties of social networks. For example, we partition the index into clusters of users according to their similarities, or according to their social interests. We build on a previously suggested analysis showing that users in the same social community tend to have similar interests, and create clusters based on social community affiliation. When clustering by social interests, we achieve scalability by selecting a subset of the interests based on their distribution in the social network.
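As a rough illustration of the idea of a cluster-partitioned inverted index, the following minimal sketch builds one index per cluster of users and runs a similarity query against either a single partition or a centralized index. It is not the infrastructure evaluated in the thesis: the function names, the toy profiles, the cluster assignment, and the use of shared-interest counts as the similarity score are all assumptions made for this example.

```python
from collections import defaultdict


def build_index(profiles):
    """Map each interest to the set of users who hold it (an inverted index)."""
    index = defaultdict(set)
    for user, interests in profiles.items():
        for interest in interests:
            index[interest].add(user)
    return index


def build_partitioned_index(profiles, cluster_of):
    """Build one inverted index per cluster; cluster_of is a hypothetical
    user -> cluster id mapping produced by some clustering step."""
    by_cluster = defaultdict(dict)
    for user, interests in profiles.items():
        by_cluster[cluster_of[user]][user] = interests
    return {c: build_index(members) for c, members in by_cluster.items()}


def search(index, profiles, query_user, k):
    """Return up to k users sharing the most interests with query_user.
    Assumption: similarity is the raw count of shared interests."""
    overlap = defaultdict(int)
    for interest in profiles[query_user]:
        for user in index.get(interest, ()):
            if user != query_user:
                overlap[user] += 1
    # Highest overlap first; ties broken by user name for determinism.
    ranked = sorted(overlap.items(), key=lambda item: (-item[1], item[0]))
    return [user for user, _ in ranked[:k]]


# Toy data, invented for illustration only.
profiles = {
    "ann": {"jazz", "chess", "hiking"},
    "bob": {"jazz", "chess"},
    "cat": {"hiking", "jazz"},
    "dan": {"cooking"},
}
cluster_of = {"ann": 0, "bob": 0, "cat": 1, "dan": 1}

central = build_index(profiles)
partitions = build_partitioned_index(profiles, cluster_of)

central_result = search(central, profiles, "ann", 2)
local_result = search(partitions[cluster_of["ann"]], profiles, "ann", 2)
```

In this toy case the query against ann's own partition can only find users in the same cluster, so it misses a similar user that the centralized index returns. This is the kind of gap a partitioning technique tries to keep small.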
We use information retrieval ranking techniques and measure each partitioned index's results against those of an ideal centralized index. We experiment with different limits on the indices, e.g., bounding their size, and show how these limits affect the efficiency of the search. Through evaluation on traces of real social networks, we show that partitioning the index into similarity-based clusters outperforms partitioning it into interest-based clusters.
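The comparison against a centralized index can be sketched as follows. The abstract does not name the ranking measure used, so recall at k is chosen here purely as an assumption for illustration; the function name and the example result lists are likewise hypothetical.

```python
def recall_at_k(ideal, candidate, k):
    """Fraction of the ideal top-k results that also appear in the
    candidate top-k results. Both arguments are ranked lists of user ids."""
    ideal_top = set(ideal[:k])
    if not ideal_top:
        # Nothing to find, so the candidate trivially matches the ideal.
        return 1.0
    return len(ideal_top & set(candidate[:k])) / len(ideal_top)


# Hypothetical ranked results for one query user.
centralized_results = ["bob", "cat", "eve", "fay"]
partitioned_results = ["bob", "eve", "gil"]

score = recall_at_k(centralized_results, partitioned_results, 4)
```

Averaging such a score over many query users gives a single number per partitioning technique, which can then be compared across different bounds on index size.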