|M.Sc Student||Shiftan Yuval|
|Subject||Car Allocation in Car Deficient Households: Discrete Choice Analysis and Applying Random Forest to Discrete Choice Model Building|
|Department||Department of Civil and Environmental Engineering||Supervisor||Professor Shlomo Bekhor|
This thesis investigates how households allocate cars among their members. It focuses on car-deficient households, defined as households with at least one vehicle and more licensed drivers than cars. Methodologically, the thesis proposes a novel approach to discrete choice model estimation, which we term a methodological-iterative (MI) approach. This approach uses the output of a random forest classifier (i.e., feature importance) to incorporate recursive feature elimination into utility function specification and estimation.
The MI method reduces estimation time and can reveal to the modeler significant explanatory variables she may have overlooked or relationships among variables she did not consider. It is proposed as a hybrid of data-driven machine learning and theory-driven discrete choice modeling, intended to obtain the best possible model, and is implemented in this thesis to investigate car allocation behavior. As such, it harnesses some of the advantages of a powerful data-driven machine learning classifier while retaining the interpretability of discrete choice models.
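The elimination step described above can be illustrated with a minimal sketch. It assumes scikit-learn and synthetic data; the function name `importance_based_elimination`, the feature names, and the stopping rule (keep a fixed number of features) are hypothetical and are not taken from the thesis, whose exact MI procedure is not specified here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def importance_based_elimination(X, y, names, n_keep, seed=0):
    """Refit a random forest repeatedly, dropping the least important feature
    each round, until only n_keep features remain (hypothetical stopping rule)."""
    kept = list(range(X.shape[1]))  # column indices still in play
    while len(kept) > n_keep:
        forest = RandomForestClassifier(n_estimators=100, random_state=seed)
        forest.fit(X[:, kept], y)
        # Position (within the kept columns) of the lowest importance score.
        weakest = int(np.argmin(forest.feature_importances_))
        kept.pop(weakest)
    return [names[i] for i in kept]


# Synthetic stand-in data: 3 alternatives, 8 candidate variables, 3 informative.
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
names = [f"x{i}" for i in range(X.shape[1])]

selected = importance_based_elimination(X, y, names, n_keep=4)
print(selected)
```

In a workflow of this kind, the surviving variables would be candidates for the utility function of the discrete choice model, which the modeler would then estimate and interpret in the usual way.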
The thesis first estimates several discrete choice models independently of the MI method: a multinomial logit, two nested logits, and a cross-nested logit. Then a decision tree is estimated, followed by a random forest classifier (an ensemble of decision trees), and finally the MI discrete choice model (MI-DCM), which incorporates the results of the random forest. All models are then compared by performance and validated using cross-validation, with an emphasis on the comparison between the independent DCM and the MI-DCM. The study is based on a comprehensive, state-of-the-art household travel habit survey conducted in the Tel Aviv metropolitan area.
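A cross-validated comparison of this kind might look roughly like the sketch below. It assumes scikit-learn and synthetic data. Multinomial logistic regression is used only as a stand-in for a multinomial logit; nested and cross-nested logits are not available in scikit-learn and are not shown. The fold count and the accuracy metric are illustrative choices, not those of the thesis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data with three alternatives (not the Tel Aviv survey).
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

models = {
    # Stand-in for a multinomial logit; not a nested or cross-nested model.
    "multinomial logit (stand-in)": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```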
The estimation results identify several significant explanatory variables that explain which household member will drive the vehicle and which considerations enter the decision process. These include household attributes, household member attributes, and characteristics of the household member’s daily travel pattern. The results indicate that the car is allocated according to household members’ needs and family hierarchy, which is affected by the adult’s gender and the status of the family member within the household. The nested and cross-nested structures further underscore the importance of family hierarchy, as well as coordination considerations among family members.
The MI discrete choice model captures more explanatory variables, shows slightly better classification performance on an independent data set, obtains similar likelihood results, and replicates the same cross-nested structure, adding to the robustness of the results. While the performance of the MI model is similar to that of logit models estimated in the traditional way of trial and error, it significantly reduces the model building time and may prove helpful when the modeler lacks behavioral insight into the dataset or the phenomenon being investigated.