Technion - Israel Institute of Technology - Graduate School
M.Sc Thesis
M.Sc Student: Parnas Sigalit
Subject: Analysis of Hierarchical Count Data
Department: Department of Industrial Engineering and Management
Supervisor: Ms. Ayala Cohen (Deceased)
Full thesis text: in Hebrew


Abstract

This work was motivated by an earlier study by Dr. Zehava Rosenblatt of the School of Education at the University of Haifa, which examined the relationships between several factors and teachers' absence. The data, collected by the Ministry of Education in 2001, pertain to more than 60,000 teachers from 2,120 schools. They are hierarchically structured count data that include personal information on the teachers as well as information on their schools. Because of this hierarchical structure, relatively complex models were fitted, and these were difficult for non-statisticians to understand.

In this thesis we fitted two models to the data using the HLM software. Graphical displays were constructed using additional software, such as R, Matlab, SAS and Excel.

The first model was based on the assumption that the square root of the count data is normally distributed. Several graphical methods were applied to present the results of this model, some of which utilized the noncentral chi-square distribution. The second model was a GLM (Generalized Linear Model), which is applicable to the exponential family of distributions; in particular, we fitted a Poisson model allowing for overdispersion. Graphical methods were again used to present the results of this model, and we utilized the negative binomial distribution to summarize part of the results. Results based on both models were also obtained and displayed for various defined subgroups of teachers, which characterized specific profiles.
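
The link between the square root model and the noncentral chi-square distribution can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the parameter values and the simulation check are assumptions made for illustration. If sqrt(Y) is treated as Normal(mu, sigma^2), then Y / sigma^2 follows a noncentral chi-square distribution with 1 degree of freedom and noncentrality (mu / sigma)^2, which gives the mean and variance of the count on its original scale.

```python
import random
import statistics


def back_transformed_moments(mu, sigma):
    """Mean and variance of Y when sqrt(Y) is modeled as Normal(mu, sigma**2).

    Under that assumption Y / sigma**2 is noncentral chi-square with
    1 degree of freedom and noncentrality lam = (mu / sigma)**2.
    """
    lam = (mu / sigma) ** 2
    mean_y = sigma**2 * (1 + lam)         # equals mu**2 + sigma**2
    var_y = sigma**4 * 2 * (1 + 2 * lam)  # equals 2*sigma**4 + 4*mu**2*sigma**2
    return mean_y, var_y


def simulate_moments(mu, sigma, n=200_000, seed=0):
    """Monte Carlo check: square normal draws and summarize them."""
    rng = random.Random(seed)
    ys = [rng.gauss(mu, sigma) ** 2 for _ in range(n)]
    return statistics.fmean(ys), statistics.pvariance(ys)


if __name__ == "__main__":
    # Hypothetical values on the square root scale, chosen for illustration only.
    mu, sigma = 3.0, 1.0
    print("closed form:", back_transformed_moments(mu, sigma))
    print("simulation: ", simulate_moments(mu, sigma))
```

The closed-form mean exceeds mu^2 by sigma^2, so simply squaring a fitted value on the square root scale would understate the expected count. This is one reason a back-transformed display may rely on the noncentral chi-square distribution rather than on the normal fit alone.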

The main purpose of this thesis was to suggest tools, mainly graphical, for presenting the findings clearly, particularly to non-statisticians.