M.Sc Thesis

M.Sc StudentEyal Ami
SubjectSelf Organizing Semantic Topologies in Peer Database
DepartmentDepartment of Industrial Engineering and Management
Supervisor PROF. Avigdor Gal
Full Thesis textFull thesis text - English Version


Peer database management systems (PDMS) combine the decentralized setting and autonomy of peer-to-peer systems with the rich semantic context of database systems. In a PDMS, members use schema matching techniques to establish schema mappings as the basis for peer querying. The large-scale and dynamic environments of peer-to-peer networks dictate the use of automatic schema matching, which was shown to carry with it a degree of uncertainty .

In the first part of this thesis, we introduce a model for a PDMS that considers the inherent uncertainty of automatic schema matching and the increase of this uncertainty over transitive matching in a decentralized environment. We examine the query reformulation quality in our model, influenced by the use of variant semantic network topologies. We analyze both local interest and global welfare of semantic topologies .

Next, we consider (offline) problems of finding optimal semantic topologies that maximize query reformulation quality. We present an algorithm to find such optimal topologies that maximize peers local interest. We also show that even in the presence of complete offline network knowledge, the problem of finding a topology that maximizes the global welfare is NP-Complete even for a very simple setting .

In the third part, we consider the (online) setting of a peer to peer system where no single peer obtains complete network knowledge. We present several heuristic (online) algorithms for topology self organization in the absence of full network knowledge .

In the final part, we present a PDMS simulation using our model and our online algorithms for topology self-organization. We compared our algorithms with similar algorithms in the context of peer file-sharing systems. Our results indicate that our algorithms better exploit interest-based locality in PDMS environment. We also show experimental results indicating that local interest interferes with global welfare in the context of optimal topology self establishment. Finally, we show that optimal topologies maximizing global welfare can not be reached by independent peers individually applying self organization algorithms, but rather require collaborative algorithms applied by peers in cooperation.