M.Sc Thesis

M.Sc Student: Sheetrit Eitam
Subject: Improving Process Matching Using Positional Passage-Based Language Models
Department: Department of Industrial Engineering and Management
Supervisor: Prof. Avigdor Gal


Business process management (BPM) has become an inseparable part of every major organization today. It is based on the observation that each product or service that a company provides is the outcome of a number of activities performed. Business processes make it possible to organize and better understand those activities and their relationships.

A business process consists of a set of activities executed in coordination within an organization that lead to a specific end. Such a process is often a collection of interrelated processes that function in a logical sequence to achieve some ultimate goal.

As organizations develop and reach maturity, the number of processes grows, and the ability to match process models becomes more crucial.

Comparison of process models typically involves the identification of correspondences between activities, which is supported by matching techniques. Textual heterogeneity and differences in model granularity are major challenges for matching process models. The former relates to inconsistent usage of activity labels, e.g., different labeling styles and linguistic phenomena such as synonymy. The latter relates to refinements of activities, e.g., correspondences that have to be defined between sets of activities instead of single activities.

In this thesis, we present a matching technique that is tailored towards process models that feature textual descriptions of activities. Using ideas from language modeling in Information Retrieval, our approach leverages these descriptions to identify correspondences between activities. To this end, we define a novel language model that combines passage-based and positional language models. We use this model to judge the similarity of activities and derive correspondences.
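As a rough illustration of the general idea, and not the model defined in this thesis, the following Python sketch scores an activity label against another activity's textual description. It assumes fixed-length sliding passages, a Gaussian kernel that weights terms by their distance from the passage centre, Dirichlet smoothing against collection statistics, and the best-scoring passage as the description's score. All function names and parameters (`window`, `mu`, `sigma`) are hypothetical choices made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; no stemming or stop-word removal)."""
    return text.lower().split()


def positional_counts(tokens, centre, sigma):
    """Term counts weighted by a Gaussian kernel centred on `centre`."""
    counts = Counter()
    for j, term in enumerate(tokens):
        counts[term] += math.exp(-((centre - j) ** 2) / (2 * sigma ** 2))
    return counts


def passage_model(passage, collection, mu, sigma):
    """Return a function term -> probability under a smoothed passage model."""
    centre = (len(passage) - 1) / 2
    counts = positional_counts(passage, centre, sigma)
    total = sum(counts.values())
    coll_total = sum(collection.values())
    vocab = len(collection)

    def prob(term):
        # Add-one collection estimate so unseen terms keep a non-zero probability.
        p_coll = (collection[term] + 1) / (coll_total + vocab + 1)
        # Dirichlet smoothing of the position-weighted passage counts.
        return (counts[term] + mu * p_coll) / (total + mu)

    return prob


def score(label, description, collection, window=5, mu=10.0, sigma=2.0):
    """Best passage log-likelihood of the label's terms within the description."""
    query = tokenize(label)
    doc = tokenize(description)
    best = -math.inf
    for start in range(max(1, len(doc) - window + 1)):
        prob = passage_model(doc[start:start + window], collection, mu, sigma)
        best = max(best, sum(math.log(prob(t)) for t in query))
    return best


# Hypothetical activity descriptions, invented for this example.
desc_a = "the clerk checks the invoice and then approves the payment request"
desc_b = "the goods are packed and shipped to the warehouse"
collection = Counter(tokenize(desc_a) + tokenize(desc_b))

score_a = score("invoice payment", desc_a, collection)
score_b = score("invoice payment", desc_b, collection)
```

Under these assumptions, a description that mentions the label's terms should receive a higher score than one that does not, so comparing `score_a` and `score_b` gives a simple signal for whether two activities are candidates for a correspondence.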

Our large-scale evaluation on real-world process models indicates that the presented technique is able to identify correspondences that are not found by existing approaches.