Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Brodianskiy Tali
Subject: Self-Correcting XML Queries
Department: Department of Industrial Engineering and Management
Supervisor: Professor Sara Cohen


Abstract

The growing need to integrate data from heterogeneous sources, and to access data sources with irregular or incomplete contents, is the main motivation for research into semi-structured data models and their query languages. For irregular data, partial or incomplete query answers are important. Much research has been done in this area during the last decade, in both theoretical and practical directions. Previous research focused on defining more suitable query languages, developing algorithms for returning incomplete answers, and searching XML documents efficiently, with or without schemas. In this work, we consider the problem of querying semi-structured data from a different angle. Instead of returning partial answers to a query that is not satisfiable over a given database, we supply the "closest" satisfiable query (or a number of "close" queries) over the same database. We developed two efficient algorithms for this purpose. The first algorithm returns the "closest" query; if there are several equally "close" queries, it chooses one of them. The second algorithm returns all "closest" queries. The first algorithm is efficient with respect to the size of the input, and the second is efficient with respect to the combined size of the input and output. In addition, we implemented a system that uses these algorithms and allows the user to retrieve a satisfiable query (or a number of queries), given a query and a schema.