Optimal Stopping with Unknown Gain Function
Lead Research Organisation:
University of Warwick
Department Name: Statistics
Abstract
In optimal stopping problems, the form of the gain function is often unknown. One possible solution is to use imitation learning to infer the gain function from an expert's demonstrations. Imitation learning is regarded as a branch of Reinforcement Learning, which has recently proved useful in solving optimal stopping and/or optimal control problems. The aim of the project is to develop a mathematically grounded framework for solving optimal stopping problems with an unknown gain function (inverse optimal stopping). The objectives include:
- Establishing a Reinforcement Learning formulation of both the general and the inverse optimal stopping and optimal control problems;
- Identifying the main pitfalls of existing approaches to optimal stopping/optimal control problems;
- Developing a Reinforcement Learning algorithm that solves inverse optimal stopping problems efficiently and effectively, and establishing mathematical guarantees for it;
- Creating an application of the developed algorithms with the potential to be used in autonomous vehicle control.
To the best of our knowledge, the literature on optimal stopping problems with an unknown gain function is limited, and existing research in the area mainly covers applications of existing algorithms without in-depth mathematical proofs or guarantees of the algorithms' convergence and stability.
The project aligns with the EPSRC remit covering the area of Mathematical Sciences and is closely related to the activities conducted by the AI and Robotics team of the Research Council.
Organisations
People | ORCID iD |
---|---|
Sigurd Assing (Primary Supervisor) | |
Anna Kuchko (Student) | |
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/T51794X/1 | | | 30/09/2020 | 29/09/2025 | |
2585636 | Studentship | EP/T51794X/1 | 03/10/2021 | 30/03/2025 | Anna Kuchko |