Technion - Israel Institute of Technology
Graduate School
M.Sc Thesis
M.Sc Student: Horosov Eliyahu
Subject: Multiple Imputation Methods - Properties and Applications
Department: Department of Industrial Engineering and Management
Supervisor: Ms. Ayala Cohen (Deceased)
Full thesis text: Hebrew version


Abstract

Missing data are a nuisance rather than a primary focus, but handling them in a principled manner raises conceptual difficulties and computational challenges.

Unfortunately, ad hoc edits may do more harm than good, producing answers that are biased, inefficient and unreliable. The goals of this thesis were to examine methods for dealing with the missing data problem and to present the advantages and disadvantages of each method. We developed and implemented forecasting and Multiple Imputation (MI) models on our database and analyzed the results.

In the first chapter we covered methods for dealing with the missing data problem. We explained that MI is a general term for methods that analyze a database and complete its missing data with reasonable values.

In the second chapter we introduced the database, which consisted of daily rainfall amounts collected by the Israeli Meteorological Service. The main difficulty was the missing data; a further difficulty was the large number of non-rainy days.

In the third chapter we analyzed and implemented several forecasting methods.

We used the results to develop the imputation models that best fitted our data.

In the fourth chapter we developed two complex MI models, built using SAS and based on the conclusions from the forecasting analyses.

The first model, the Lag model, included the lagged data as additional variables for the imputation. The innovation of the model was its use of future data to complete the missing values.
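The idea of adding shifted copies of the series as extra variables can be sketched as follows. This is only an illustration in Python, not the SAS implementation used in the thesis; the function name `add_lag_lead_columns`, the column names and the sample values are all hypothetical.

```python
from typing import Dict, List, Optional


def add_lag_lead_columns(
    series: List[Optional[float]], max_shift: int = 1
) -> List[Dict[str, Optional[float]]]:
    """Build one row per day holding the day's value plus shifted copies.

    lagK is the value K days earlier (past data), leadK is the value
    K days later (future data). None marks a missing or out-of-range value.
    Names and layout are illustrative assumptions, not the thesis design.
    """
    n = len(series)
    rows = []
    for t in range(n):
        row: Dict[str, Optional[float]] = {"rain": series[t]}
        for k in range(1, max_shift + 1):
            row[f"lag{k}"] = series[t - k] if t - k >= 0 else None
            row[f"lead{k}"] = series[t + k] if t + k < n else None
        rows.append(row)
    return rows


# Hypothetical daily rainfall amounts; None marks a missing observation.
daily_rain = [0.0, 3.2, None, 1.5, 0.0]
rows = add_lag_lead_columns(daily_rain, max_shift=1)
```

In this layout the row for the missing day carries both the previous and the following day's values, so an imputation model fitted on these rows can draw on past and future observations alike.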

The second model, the Winters model, used each winter as a variable in the imputation process. The Lag model gave good results; however, the relation between the variables weakened as we used data from further into the future. In the Winters model, we implemented backward and forward forecasting simultaneously using MI. The Winters model was more complex and made it possible to use more information for the imputation.
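One possible reading of "each winter as a variable" is a wide table with one row per day of the season and one column per winter. The sketch below assumes that layout; the abstract does not spell it out, and the function name `winters_as_columns`, the record format and the sample values are hypothetical.

```python
from typing import List, Optional, Tuple

Record = Tuple[int, int, Optional[float]]  # (winter, day_of_season, amount)


def winters_as_columns(
    records: List[Record],
) -> Tuple[List[int], List[List[Optional[float]]]]:
    """Pivot long records into rows of [day, amount_per_winter...].

    Cells with no record, or with a missing amount, are None.
    This layout is an assumption made for illustration only.
    """
    winters = sorted({w for w, _, _ in records})
    days = sorted({d for _, d, _ in records})
    table = {d: {w: None for w in winters} for d in days}
    for w, d, amount in records:
        table[d][w] = amount
    wide = [[d] + [table[d][w] for w in winters] for d in days]
    return winters, wide


# Hypothetical records for two winters; None marks a missing observation.
records = [
    (1990, 1, 0.0), (1990, 2, 4.1),
    (1991, 1, None), (1991, 2, 2.0),
]
winters, wide = winters_as_columns(records)
```

With the winters side by side, a missing value in one winter can be imputed from the same day in both earlier and later winters, which is consistent with the backward and forward use of information described above.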

At the next stage, we used another example to explain the statistics and to investigate the influence of data transformation on the results.

In the last part of the chapter we presented additional analyses of MI. We analyzed the influence of the percentage of missing values on the MI results, the influence of MI on the correlations between different variables and stations, and the influence of MI on non-rainy days.

Finally, we summarized the research, including its significant results, and suggested directions for future research.