M.Sc Thesis

M.Sc StudentKustanovich Zeev
SubjectPrediction with Limited Training Data and Applications
DepartmentDepartment of Electrical and Computer Engineering


The problem of prediction with limited training data is investigated. Based on recent work of Ziv, which states both non-asymptotic lower bounds and universal prediction algorithm for every stationary source, the question of the optimal prediction schemes and their applications to the data compression is raised and explored.

     Optimal non-asymptotic universal prediction algorithms are extremely important for compression of finite length sequences with rich alphabet (texts, images…).

     In his work, Ziv defined a class of admissible predictors. This class includes sequential predictors that use prediction with respect to a context. The contexts are limited to phrases that appeared in the past (sliding window prediction algorithms) or in the independent database (fixed database prediction algorithms).

     In first part of our work, universal prediction algorithm is studied and simulated in non-asymptotic regime. Some parameters like training data length, number of prefix appearances and order of Markov source are investigated and, the question of their influence on universal prediction is explored.

     The second part of this work deals with applications of Ziv’s universal prediction algorithm to some data compression schemes. This part inspects the scheme for compression of sources that is based on coding the error sequence. In binary case, bits in the error sequence are equal to zero if prediction is made correctly and equal to one if prediction is made incorrectly. Motivation for this scheme is based on the fact that assuming good prediction, the entropy of error sequence is much less than of the original one.