Technion - Israel Institute of Technology
Technion - Israel Institute of Technology - Graduate School
M.Sc Thesis
M.Sc Student: Bernstein Andrey
Subject: Adaptive State Aggregation for Reinforcement Learning
Department: Department of Electrical Engineering
Supervisor: Professor Nahum Shimkin


Abstract

Reinforcement Learning (RL) is a promising computational approach for constructing autonomous systems that improve their performance with experience. Its applications range from robotics, through industrial manufacturing and scheduling, to combinatorial search problems such as board games.


For “small” problems, efficient RL algorithms exist, with formal guarantees and polynomial learning rates. However, these algorithms become infeasible when the state and/or action spaces are very large or infinite, since their time and space complexity is typically polynomial in the size of these spaces. Most existing algorithms for “large” problems, on the other hand, are heuristic in nature and lack formal guarantees.


In this thesis we propose new algorithms that aim to solve the online reinforcement learning problem over continuous state spaces, with provably efficient exploration of the state space. The proposed algorithms use an adaptive state aggregation approach, moving from coarse to fine grids over the state space, which allows finer resolution in the “important” areas of the state space and coarser resolution elsewhere. These important areas are discovered online, during learning, using an exploration technique based on confidence intervals. Polynomial learning rates (in terms of sample complexity) are established for these algorithms.
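
The coarse-to-fine idea can be illustrated with a minimal one-dimensional sketch. This is not the thesis's algorithm: the names `Cell` and `AdaptiveGrid`, the visit-count splitting rule, the Hoeffding-style bound, and the assumption that rewards lie in [0, 1] are all hypothetical choices made for illustration only.

```python
import math


class Cell:
    """One aggregate state: the interval [lo, hi) of a 1-D state space."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.count = 0    # number of samples observed in this cell
        self.total = 0.0  # sum of rewards observed in this cell

    def upper_bound(self, delta=0.05):
        """Optimistic reward estimate (Hoeffding-style, rewards assumed in [0, 1])."""
        if self.count == 0:
            return 1.0  # unvisited cells are treated as maximally promising
        mean = self.total / self.count
        radius = math.sqrt(math.log(1.0 / delta) / (2.0 * self.count))
        return min(1.0, mean + radius)


class AdaptiveGrid:
    """Partition of [0, 1] that is refined where samples accumulate."""

    def __init__(self, split_after=4, min_width=1.0 / 8):
        self.cells = [Cell(0.0, 1.0)]   # start from the coarsest grid
        self.split_after = split_after  # visits needed before a cell is split
        self.min_width = min_width      # cells this narrow are never split

    def find(self, x):
        """Return the cell containing state x."""
        for cell in self.cells:
            if cell.lo <= x < cell.hi:
                return cell
        return self.cells[-1]  # covers the right endpoint x == 1.0

    def update(self, x, reward):
        """Record one sample and refine the grid around it if warranted."""
        cell = self.find(x)
        cell.count += 1
        cell.total += reward
        if cell.count >= self.split_after and cell.hi - cell.lo > self.min_width:
            mid = (cell.lo + cell.hi) / 2.0
            i = self.cells.index(cell)
            # Simplification: the two children restart with empty statistics.
            self.cells[i:i + 1] = [Cell(cell.lo, mid), Cell(mid, cell.hi)]

    def most_promising(self):
        """Cell with the largest optimistic estimate (exploration target)."""
        return max(self.cells, key=lambda c: c.upper_bound())


grid = AdaptiveGrid()
for _ in range(12):
    grid.update(0.1, reward=1.0)  # all samples fall near x = 0.1
widths = [c.hi - c.lo for c in grid.cells]
```

In this sketch the grid ends up fine near the frequently visited state and coarse elsewhere, while unvisited cells keep an optimistic bound that draws exploration toward them.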