טכניון מכון טכנולוגי לישראל
הטכניון מכון טכנולוגי לישראל - בית הספר ללימודי מוסמכים  
M.Sc Thesis
M.Sc StudentIftah Peretz
SubjectGPU Based Parallelization of the Chu-Liu-Edmonds Algorithm,
for the Dependency Parsing Problem
DepartmentDepartment of Industrial Engineering and Management
Supervisors Assistant Professor Reichart Roi
Professor Silberstein Mark
Full Thesis textFull thesis text - English Version


Abstract

The solution to the Minimum/Maximum Spanning Tree (MST) problem in its directed form has various applications for cornerstone problems in scientific fields today.

One such manifestation in the Natural Language Processing (NLP) field is in finding the right dependency structure in which each word token in a sentence is dependent on exactly one other word apart from the root node. Such dependencies represent both syntactic and semantic relations between the words and are useful in various NLP tasks such as translation, POS tagging, etc.

When a certain language doesn't have a rigid word order it is referred to as non-projective, such as the Finnish language. The algorithm that was introduced in solving these non-projective dependency structures is the Chu-Liu-Edmonds (CLE) algorithm. This dissertation creates a parallel approximation of the CLE algorithm. Whilst devising the parallel algorithm, a byproduct algorithm was created for detecting cycles in a connected tree. We provide the proofs for its completeness and soundness.

Graphics Processing Units (GPUs) in the recent years have allowed for developing parallel applications, making them General Purpose GPUs. Almost simultaneously, the CPU began to branch into a multi-core architecture in order to overcome power consumption problems of high frequency CPUs.  The newly introduced CLE approximation variant is easily portable to the GPU and to a multi-core CPU.

A comparative analysis of the empirical results is introduced in the given dissertation.