Technion - Israel Institute of Technology - Graduate School
M.Sc Thesis
M.Sc Student: Gutter Israel
Subject: Sentence Parsing in Hebrew by Semantic Features
Department: Department of Computer Science
Supervisor: Professor Uzi Ornan


Abstract

   This work proposes an algorithm for parsing a Hebrew sentence, mainly by matching the components of the sentence to a verb in the sentence as implementations of the verb’s thematic roles.


   The role implementations attached to the verb are identified mainly through semantic considerations. However, the identification also takes into account other relevant syntactic cues, e.g. agreement in person, gender and number, prepositions, etc.
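
   The matching described above can be illustrated with a minimal sketch. It is not the algorithm of the thesis: the class names, the feature labels, and the exhaustive search over orderings are all assumptions made for illustration. A component may fill a role only if it carries the semantic feature the role requires, bears the expected preposition, and, for roles marked as agreeing, matches the verb in person, gender and number.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional


@dataclass(frozen=True)
class Role:
    name: str                          # e.g. "agent" (hypothetical label)
    required_feature: str              # semantic feature the filler must carry
    preposition: Optional[str] = None  # preposition expected on the filler, if any
    agrees_with_verb: bool = False     # must match the verb's person/gender/number


@dataclass(frozen=True)
class Component:
    text: str
    features: frozenset
    person: int
    gender: str
    number: str
    preposition: Optional[str] = None


@dataclass(frozen=True)
class Verb:
    lemma: str
    person: int
    gender: str
    number: str
    roles: tuple


def fits(verb, role, comp):
    """Check the semantic feature first, then the syntactic cues."""
    if role.required_feature not in comp.features:
        return False
    if role.preposition != comp.preposition:
        return False
    if role.agrees_with_verb:
        verb_agr = (verb.person, verb.gender, verb.number)
        comp_agr = (comp.person, comp.gender, comp.number)
        if verb_agr != comp_agr:
            return False
    return True


def assignments(verb, components):
    """Return every complete role-to-component mapping, ignoring word order."""
    results = []
    for perm in permutations(components, len(verb.roles)):
        if all(fits(verb, r, c) for r, c in zip(verb.roles, perm)):
            results.append({r.name: c.text for r, c in zip(verb.roles, perm)})
    return results


# Hypothetical football-domain example (English glosses instead of Hebrew words).
kick = Verb("kick", 3, "m", "sg", roles=(
    Role("agent", "human", agrees_with_verb=True),
    Role("patient", "physical_object"),
))
player = Component("the player", frozenset({"human", "animate"}), 3, "m", "sg")
ball = Component("the ball", frozenset({"physical_object"}), 3, "m", "sg")

# Both nouns agree with the verb, so only the semantic feature decides.
print(assignments(kick, [ball, player]))
```

   In this sketch the result does not depend on the order in which the components are supplied, and a sentence for which more than one mapping is returned would count as ambiguous.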


   The algorithm is implementable and can handle many phenomena in the language, such as free order of components and ambiguity.


   Implementing the algorithm depends on the existence of semantic dictionaries that supply the full data on the thematic roles defined in principle for the various verbs, as well as semantic data on other categories of words.


   The semantic dictionaries have been defined for groups of words chosen mainly from the domain of football games. These groups include about 600 verb items and 800 noun items, as well as additional numbers, adjectives, adverbs, question words, etc. The dictionaries express connections and similarities between different values (and their inner parts) by means of strong inheritance. In addition, a hierarchical system of semantic features is used, as well as predicates expressing world knowledge. In our dictionary, adjectives can have thematic roles, just like verbs.
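
   The two dictionary mechanisms mentioned above, a feature hierarchy and inheritance between entries, can be sketched as follows. The entries, feature names and data layout are hypothetical and are not taken from the thesis dictionaries. The abstract does not spell out what "strong inheritance" means exactly, so the sketch assumes the simplest reading: an entry receives all roles of its ancestors and adds its own.

```python
# Hypothetical feature hierarchy: each feature points to its parent.
FEATURE_PARENT = {
    "player": "human",
    "human": "animate",
    "animate": "entity",
    "ball": "physical_object",
    "physical_object": "entity",
}


def has_feature(feature, wanted):
    """True if `feature` equals `wanted` or lies below it in the hierarchy."""
    while feature is not None:
        if feature == wanted:
            return True
        feature = FEATURE_PARENT.get(feature)
    return False


# Hypothetical verb entries: each may name a parent entry to inherit from.
LEXICON = {
    "move": {"parent": None, "roles": {"agent": "animate"}},
    "kick": {"parent": "move", "roles": {"patient": "physical_object"}},
}


def roles_of(lemma):
    """Collect the roles of an entry together with those of its ancestors."""
    chain = []
    while lemma is not None:
        entry = LEXICON[lemma]
        chain.append(entry)
        lemma = entry["parent"]
    roles = {}
    # Apply ancestors first so that an entry's own roles are added last.
    for entry in reversed(chain):
        roles.update(entry["roles"])
    return roles


print(roles_of("kick"))
print(has_feature("player", "animate"), has_feature("ball", "animate"))
```

   With such a layout, a role that requires "animate" accepts any noun whose feature lies below "animate" in the hierarchy, so the requirement does not have to be repeated for every noun.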


   Such an algorithm for semantic parsing has also been implemented.

The implementation uses a morphological analyzer that was made available by the “Multitext” company for the purposes of this work.