Technion - Israel Institute of Technology
Graduate School
M.Sc Thesis
M.Sc Student: Moshenberg Shai
Subject: Methods for Imputation and Filtering of Air Pollution Data
Department: Department of Civil and Environmental Engineering
Supervisor: Professor Barak Fishbain
Full thesis text: English version


Abstract

Air quality (AQ) is well recognized as a contributing factor in various physical phenomena and as a public health risk factor. Consequently, there is a need for an accurate, cost-effective way to measure the level of exposure to various pollutants. Continuous monitoring, however, is often noisy and incomplete because of measurement errors, hardware problems, low memory or insufficient sampling frequency. This research introduces two mechanisms to overcome these problems: imputation and filtering.

The discrete sampling theorem is detailed for the task of imputing missing data in longitudinal air-quality time series and is compared with other widely used methods. Within the context of the discrete sampling theorem, two spectral schemes for filling missing values are presented: one based on the Discrete Cosine Transform (DCT) and one based on K-SVD, a dictionary-learning method that generalizes K-means clustering using the Singular Value Decomposition.
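As an illustration of the spectral idea, the following is a minimal sketch of DCT-based gap filling by alternating projections. It assumes the series is well described by its first `n_coeffs` DCT coefficients; the function names, the mean-value initialization and the fixed iteration count are illustrative choices, not the scheme defined in the thesis.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of floats (O(n^2), fine for short series)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def impute_dct(series, n_coeffs, n_iter=200):
    """Fill None entries of `series`, assuming only the first `n_coeffs`
    DCT coefficients are non-zero (hypothetical band-limit assumption)."""
    known = [v is not None for v in series]
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    # Start from the observed mean in every gap.
    estimate = [v if v is not None else mean for v in series]
    for _ in range(n_iter):
        # Project onto the band-limited subspace...
        coeffs = [c if k < n_coeffs else 0.0 for k, c in enumerate(dct(estimate))]
        smooth = idct(coeffs)
        # ...then restore the samples that were actually measured.
        estimate = [series[t] if known[t] else smooth[t] for t in range(len(series))]
    return estimate

# Usage: a synthetic band-limited series with a six-sample gap.
n = 32
truth = [10 + 3 * math.cos(math.pi * (t + 0.5) * 2 / n) for t in range(n)]
gappy = [None if 10 <= t < 16 else v for t, v in enumerate(truth)]
filled = impute_dct(gappy, n_coeffs=4)
```

Measured samples are never altered; only the gaps are updated. Real pollutant series are only approximately band-limited, so the choice of `n_coeffs` would need to be tuned on data.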

The suggested imputation methods are evaluated in terms of accuracy and robustness. Spectral methods are shown to be comparable to other methods when data are missing at random, and to outperform them when whole segments of data are missing. Accuracy was evaluated using a complete, very long time series of air pollutant measurements. Robustness was evaluated by examining each method's performance with increasing portions of missing data.
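The evaluation protocol described above can be sketched as follows: values are removed from a complete series either at random or as one contiguous segment, an imputer fills them, and the error is measured on the removed samples only. The helper names, the use of RMSE as the error measure and the linear-interpolation baseline are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random

def mask_random(n, fraction, rng):
    """Mark round(n * fraction) positions, chosen at random, as missing."""
    missing = set(rng.sample(range(n), round(n * fraction)))
    return [t in missing for t in range(n)]

def mask_segment(n, fraction, rng):
    """Mark one contiguous segment of round(n * fraction) positions as missing."""
    k = round(n * fraction)
    start = rng.randrange(0, n - k + 1)
    return [start <= t < start + k for t in range(n)]

def rmse_on_missing(truth, filled, missing):
    """Root-mean-square error over the removed samples (needs at least one)."""
    errs = [(truth[t] - filled[t]) ** 2 for t in range(len(truth)) if missing[t]]
    return math.sqrt(sum(errs) / len(errs))

def fill_linear(series):
    """Baseline imputer: linear interpolation, nearest value at the edges."""
    n = len(series)
    known = [t for t in range(n) if series[t] is not None]
    out = list(series)
    for t in range(n):
        if series[t] is not None:
            continue
        left = max((k for k in known if k < t), default=None)
        right = min((k for k in known if k > t), default=None)
        if left is None:
            out[t] = series[right]
        elif right is None:
            out[t] = series[left]
        else:
            w = (t - left) / (right - left)
            out[t] = (1 - w) * series[left] + w * series[right]
    return out

def sweep(truth, imputer, fractions, seed=0):
    """Return (fraction, pattern, rmse) rows for both missingness patterns."""
    rng = random.Random(seed)
    rows = []
    for f in fractions:
        for name, make_mask in (("random", mask_random), ("segment", mask_segment)):
            missing = make_mask(len(truth), f, rng)
            gappy = [None if m else v for v, m in zip(truth, missing)]
            rows.append((f, name, rmse_on_missing(truth, imputer(gappy), missing)))
    return rows
```

Calling `sweep(truth, fill_linear, [0.1, 0.2, 0.3])` on a complete series gives one error value per fraction and pattern, so that different imputers can be compared as the missing portion grows.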

For filtering, a mathematical model for finding the optimal averaging window size is presented. The method is based on the assumption that, while a real physical phenomenon affects the measurements of all collocated sensors, sensing noise manifests itself independently in each sensor. The results show the potential of the method in air quality measurements in terms of using less memory and energy for recording without losing critical information.