Quantitative Termination Approximation Analysis of One-Counter Markov Decision Processes

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Informatics

Abstract

A Markov Decision Process(MDP) is a stochastic, controlled, discrete-time process, which is played over a transition graph. A player controls a specified subset of states, choosing their own strategy on how to pick transitions and where to move to next from those states. The rest of the states fall under nature/uncertainty's possession, with pre-defined probability distributions over outgoing transitions. Adding an unbounded counter, that changes by at most 1 at each step of the game, builds the class of one-counter(OC)-MDPs.

This project looks at the analysis of the so-called termination objective over this game, which is the probability of reaching counter value 0 at any state, provided a beginning configuration of given state and counter>0. Depending on whether it is a maximization or minimization OC-MDP, the goal of the player is to maximize or minimize, respectively, this probability.

It has previously been proven that the qualitative decision part of this objective, i.e. "is optimal probability 1 or 0?", is decidable in polynomial time. As for the quantitative part, i.e. "is termination value at least p?", it has been shown that the question of whether such problems are computable is open and that computing an approximation of the optimal game value and (epsilon-)optimal strategies can be done in time exponential in the size of the game. The focus of this research aims at improving the complexity from exponential to polynomial time for such games and the termination approximation problem.

Achieving this has its benefits, since one-counter games are applicable elsewhere. For example, the classes of discrete-time Quasi-Birth-Death processes and Solvency games are subsumed by OC-MDPs, meaning this type of process is widely used - in queueing theory and investment/gambling.

As a milestone, this problem is the focus for this first year. After that, there are many other game-theory-related and/or MDP-related problems that could be analysed.

Publications

10 25 50

Studentship Projects

Project Reference Relationship Related To Start End Student Name
EP/N509644/1 01/10/2016 30/09/2021
1789473 Studentship EP/N509644/1 01/10/2016 30/08/2020 Emanuel Martinov
 
Description Initially, I studied my starting PhD question, namely searching for a polynomial time algorithm for the Quantitative Termination Approximation problem for One-Counter MDPs. It is a very interesting open question, with an already known exponential time algorithm for it. Some observations were made, further showing the difficulty of finding an answer to the question.

Then I moved to Branching Processes (BPs) and to problems related to this model. In particular, polynomial time algorithms were given for deciding almost-sure and limit-sure reachability in Branching Concurrent Stochastic Games (BCSGs). These are a class of infinite-state imperfect information stochastic games that generalize both finite-state concurrent stochastic reachability games, as well as branching simple stochastic reachability games. The paper, containing the results, is a joint work with Kousha Etessami, Alistair Stewart and Mihalis Yannakakis and appeared in ICALP'19.

Afterwards, I worked on the analysis of Ordered Branching Markov Decision Processes, which is a new model that I and my supervisor, Kousha Etessami, introduced, under the qualitative version of the multi-objective reachability problem, which is a natural extension to the previously studied qualitative reachability problem. This work was published in a paper in RP'20.

In my third year of the PhD, I also immersed myself into another area of game theory, namely coalitional games. I obtained results (not yet published) in the sub-areas of coalition formation games and transferable-utility coalitional games.

I have already submitted and defended my PhD thesis.
Exploitation Route The findings related to the models of Branching Processes and Ordered Branching Processes are all work towards obtaining broader and generalised results for model checking for (O)BMDPs, (O)BSSGs, (O)BCSGs. So the findings can certainly be of use to people working in this area of interest.

The findings related to coalitional games are another stepping stone towards analysing different notions of stability in such multi-player games, which has had increasing interest in the past 15-20 years. Such analysis can contribute to other industries utilizing such games, which is the reason why people are investigating it.
Sectors Aerospace, Defence and Marine,Chemicals,Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Electronics,Energy,Environment,Financial Services, and Management Consultancy,Healthcare,Manufacturing, including Industrial Biotechology,Pharmaceuticals and Medical Biotechnology

 
Description SICSA Research Scholar
Amount £400 (GBP)
Organisation SICSA Scottish Informatics and Computer Science Alliance 
Sector Academic/University
Country United Kingdom
Start 07/2019 
End 07/2019