Technion - Israel Institute of Technology, Graduate School
M.Sc Thesis
M.Sc Student: Preis Amitzur
Subject: A Machine Learning Model for Quantity-Quality Flow Predictions in Watersheds
Department: Department of Civil and Environmental Engineering
Supervisor: Professor Avi Ostfeld


Abstract

This thesis presents a hybrid Machine Learning model for quantity-quality flow predictions in watersheds. The hybrid algorithm consists of two Machine Learning methods: Model Trees and Genetic Algorithms. The proposed algorithm combines the two methods in an attempt to develop a fast, accurate, and effective predictive model.
Physically-based models are the techniques most commonly used to assess non-point source loads in watersheds. The approach presented herein is instead driven by data availability: where extended records of rainfall, climatic data, land utilization data, and flow exist, a “black (gray) box” (data-driven) technique can be implemented rather than a physically-based model. A data-driven model (DDM) couples the system’s state variables (input, internal, and output) without requiring much knowledge of the system’s “physical” behavior.
A model tree (MT) is the leading algorithm used in this work. The algorithm (Quinlan, 1993) builds rule-based predictive models which, unlike classifiers, output continuous values. A Genetic Algorithm (GA) is a search method based on the principles of natural selection and natural genetics in Darwin’s theory of evolution. The basic idea of the hybrid MT-GA algorithm is to create a set of parameters for the model tree attributes, with the range of each parameter specified according to the physical properties of the problem. The genetic algorithm then optimizes these parameters in order to improve the predictions made by the model tree.
The hybrid algorithm was applied to the Meshushim watershed, one of the sub-basins of the Lake Kinneret watershed. The thesis contains three applications of the MT-GA algorithm to the Meshushim watershed: rainfall-runoff (predicting daily flow at the watershed outlet), Ntotal (predicting daily Ntotal loads from the watershed), and Ptotal (predicting daily Ptotal loads from the watershed). The model showed promising results for predicting both flow and contaminant loads resulting from rainfall events.
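The coupling of a genetic algorithm with a model tree described above can be illustrated with a minimal sketch. The abstract does not state which attribute parameters are tuned, so everything specific here is an assumption made for illustration: the single `decay` parameter of an antecedent-rainfall attribute, its bounds of 0 to 1, the synthetic rainfall and flow series, and the one-split "stump" that stands in for a full model tree. Fitness is the in-sample squared error, whereas a real study would evaluate on held-out data.

```python
import random

def antecedent_index(rain, decay):
    """Hypothetical attribute: exponentially weighted rainfall memory."""
    out, state = [], 0.0
    for r in rain:
        state = decay * state + r
        out.append(state)
    return out

def fit_line(xs, ys):
    """Least-squares line y = a*x + b; uses the mean when x is (nearly) constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx < 1e-12:
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def sse(xs, ys):
    """Sum of squared errors of the best-fit line on (xs, ys)."""
    a, b = fit_line(xs, ys)
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

def stump_model_tree_error(xs, ys, min_leaf=5):
    """Stand-in for a model tree: at most one split, a linear model per leaf."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = sse(sx, sy)  # the unsplit tree is always a candidate
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        best = min(best, sse(sx[:i], sy[:i]) + sse(sx[i:], sy[i:]))
    return best

def genetic_search(fitness, low, high, pop_size=12, generations=15, seed=0):
    """Small real-coded GA: truncation selection, blend crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: keep the best individual unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            w = rng.random()
            child = w * p1 + (1 - w) * p2
            child += rng.gauss(0.0, 0.05 * (high - low))
            children.append(min(high, max(low, child)))  # respect the physical bounds
        pop = children
    return min(pop, key=fitness)

# Synthetic data (assumed): flow responds to rainfall memory with a threshold.
data_rng = random.Random(1)
rain = [data_rng.choice([0.0, 0.0, 0.0, data_rng.uniform(1, 30)]) for _ in range(60)]
flow = [0.1 * x if x < 20 else 0.5 * x - 8 for x in antecedent_index(rain, 0.7)]

def fitness(decay):
    """Error of the stand-in model tree when the attribute is built with `decay`."""
    return stump_model_tree_error(antecedent_index(rain, decay), flow)

best_decay = genetic_search(fitness, low=0.0, high=1.0)
print("best decay:", best_decay, "error:", fitness(best_decay))
```

The design point the sketch tries to capture is the division of labor: the genetic algorithm only proposes attribute parameters within physically meaningful bounds, and the tree-based learner is refitted for each proposal to score it.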