Ph.D Thesis

Ph.D Student: Tamar Aviv
Subject: Risk-Sensitive and Efficient Reinforcement Learning Algorithms
Department: Department of Electrical and Computer Engineering
Supervisor: Prof. Shie Mannor


Reinforcement learning (RL) is a computational framework for sequential decision-making that combines control methods with machine-learning techniques, and it is the state of the art for solving large-scale decision problems. Many real-world decision problems involve uncertainty, due either to noise in the system dynamics or to uncertainty about the model parameters. In the standard RL formulation, such uncertainty is handled by taking the expected return as the objective. In this work, we pursue a more versatile approach to uncertainty and extend the RL methodology to account for the risk of the return. For uncertainty due to noisy dynamics, we consider several risk measures of the return, including mean-variance formulations, conditional value-at-risk, and coherent risk measures, and we extend the policy-gradient RL approach to such risk-sensitive objectives. For parameter uncertainty, we extend the robust Markov decision process formulation to the RL setting using function approximation. Our approach thereby allows modeling errors to be handled in large or continuous decision problems.