Technion - Israel Institute of Technology - Graduate School
M.Sc Thesis
M.Sc Student: Tsitrin Yan
Subject: Master-Slave Dependency Model and its Application to the Hebrew Understanding
Department: Department of Computer Science
Supervisor: Professor Uzi Ornan
Full thesis text: in Hebrew


Abstract

In this thesis we present a new model for sentence analysis, called the MASTER-SLAVE dependency model. In this formalism the meaning of an utterance is represented as a tree-based structure called a Meaning Tree. The tree’s nodes stand for the concepts expressed in the utterance (we call them semantic nuclei), and its edges (or connectors) represent conceptual or functional relations between the nuclei. A Meaning Tree is a formal representation of meaning in the computer, so we say that the computer understands an utterance when it finds such a tree. We propose a structure, the Utterance Graph, that stores multiple trees representing the meanings of a single sentence. All the Meaning Trees of an utterance can be inferred from its Utterance Graph. We also propose a hierarchy of grammatical categories of the semantic nuclei: the higher a nucleus’ category is in the hierarchy, the more important the nucleus is for the meaning of the utterance. We have also investigated the internal structure of the nucleus, which relates to, among other things, its ability to establish connections with the other nuclei of the sentence (the nucleus’ valency). In our model we use the concept of valency for all categories of nuclei, not only for verbs, as is customary in other works. The utterance analysis process is executed in two stages:

1.      Utterance Graph creation: for every token of the analyzed utterance we find all the nuclei the token can represent, and then we create all possible connectors between these nuclei;

2.      Meaning Tree inference: from the created Utterance Graph we infer all the consistent Meaning Trees that the graph contains.

It should be noted that the creation of the connectors and their analysis are performed according to the hierarchy we have built and do not depend on the order of the tokens in the sentence. We have developed and implemented algorithms for both stages. The complexity of the algorithms has been analyzed and the performance of the program has been evaluated. The experiments we conducted show that the first (and correct) Meaning Tree may be obtained quite quickly.
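The two stages can be illustrated with a minimal Python sketch. It is not the thesis implementation: the toy lexicon, the category names, the `Nucleus` fields and both function names are assumptions made for illustration only, and the category hierarchy is omitted. Valency is modeled simply as the set of categories a nucleus may govern as a master, and a Meaning Tree is treated as consistent when it uses one nucleus per token, has a single root, and gives every other nucleus exactly one master.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Nucleus:
    token_index: int      # position of the token this nucleus is a reading of
    concept: str          # hypothetical concept label
    category: str         # hypothetical grammatical category
    valency: frozenset    # categories this nucleus may govern as a master


# Hypothetical toy lexicon: token -> candidate (concept, category, valency).
LEXICON = {
    "dogs": [("DOG", "noun", frozenset())],
    "bark": [("BARK_ACTION", "verb", frozenset({"noun"})),
             ("TREE_BARK", "noun", frozenset())],
}


def build_utterance_graph(tokens):
    """Stage 1: all candidate nuclei and all possible master->slave connectors."""
    nuclei = [Nucleus(i, concept, category, valency)
              for i, token in enumerate(tokens)
              for concept, category, valency in LEXICON[token]]
    # A connector depends only on valency, never on token order.
    connectors = [(m, s) for m in nuclei for s in nuclei
                  if m.token_index != s.token_index and s.category in m.valency]
    return nuclei, connectors


def reaches_root(node, master_of, root):
    """Follow master links from node; False if a cycle is met before the root."""
    seen = set()
    while node != root:
        if node in seen:
            return False
        seen.add(node)
        node = master_of[node]
    return True


def meaning_trees(nuclei, connectors):
    """Stage 2: enumerate consistent trees as (root, {slave: master}) pairs."""
    indices = sorted({n.token_index for n in nuclei})
    by_token = [[n for n in nuclei if n.token_index == i] for i in indices]
    trees = []
    for reading in product(*by_token):          # one nucleus per token
        chosen = set(reading)
        edges = [(m, s) for m, s in connectors if m in chosen and s in chosen]
        for root in reading:
            others = [n for n in reading if n != root]
            options = [[m for m, s in edges if s == n] for n in others]
            for masters in product(*options):   # one master per non-root nucleus
                master_of = dict(zip(others, masters))
                if all(reaches_root(n, master_of, root) for n in others):
                    trees.append((root, master_of))
    return trees


nuclei, connectors = build_utterance_graph(["dogs", "bark"])
trees = meaning_trees(nuclei, connectors)
```

In this toy example only the verb reading of "bark" can govern "dogs", so a single tree survives, and reversing the token order yields the same result. The exhaustive enumeration is chosen for clarity, not efficiency.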

Many examples of sentences in different languages (Hebrew, English, Russian, German, Polish, etc.) are presented, together with the structures the analyzer assigns to them. We also consider the use of our program as a model in more general applications, such as Dialog Systems, Machine Translation Systems, Text Summarizers and Search Engines. In conclusion, we suggest some directions for extending and improving the project.