Optimal Stopping with Unknown Gain Function

Lead Research Organisation: University of Warwick
Department Name: Statistics

Abstract

In optimal stopping problems, the form of the gain function is often unknown. One possible solution is to use imitation learning to infer the gain function from an expert's demonstrations. Imitation learning is considered a branch of Reinforcement Learning, which has recently proved useful in solving optimal stopping and optimal control problems. The aim of the project is to develop a mathematically backed framework for solving optimal stopping problems with an unknown gain function (inverse optimal stopping). The objectives include:
- Establishing a Reinforcement Learning formulation of general and inverse optimal stopping and optimal control problems;
- Identifying the main pitfalls of existing approaches to optimal stopping and optimal control problems;
- Developing a Reinforcement Learning algorithm that solves inverse optimal stopping problems efficiently and effectively, and establishing mathematical guarantees for it;
- Creating an application of the developed algorithms with the potential to be used in autonomous vehicle control.
To the best of our knowledge, there is limited literature on optimal stopping problems with an unknown gain function, and existing research in the area mainly covers applications of existing algorithms without in-depth mathematical proofs or guarantees of the algorithms' convergence and stability.
The project aligns with the EPSRC remit covering the area of Mathematical Sciences and is closely related to the activities conducted by the AI and Robotics team of the Research Council.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/T51794X/1                                   01/10/2020  30/09/2025
2585636            Studentship   EP/T51794X/1  04/10/2021  31/03/2025  Anna Kuchko