A probabilistic toolkit to study regularity of free boundaries in stochastic optimal control

Lead Research Organisation: University of Leeds
Department Name: Statistics

Abstract

Imagine a spaceship that travels towards a planet and must reach it by a given date. After the launch, and in the absence of further intervention, the change in relative position of the spaceship and the planet cannot be predicted with 100% accuracy. The trajectory must be constantly monitored and unforeseen variations must be accounted for in order to reach the target. However, several constraints must be considered: for example, the availability of fuel and the effectiveness of intervention (if the spaceship is too far off the target, late interventions will not bring it back to the desired trajectory). The question is therefore how to strike the right balance between costs and benefits in order to control the trajectory of the spaceship in the optimal way.

This problem was formulated as one of "stochastic control" in the 1960s by J. Bather and H. Chernoff. A quick search in the NASA Technical Reports Server shows that "stochastic control theory" is at the core of aerospace engineering (6,843 matching records). Interestingly, this exciting branch of mathematics finds applications in many other fields, including physics, biology, energy and economics.

To give an oversimplified idea of what a solution may look like in the problem above, we could say that it is optimal to make a "small" adjustment to the spaceship's trajectory each time the offset between spaceship and planet exceeds a value that depends on the available fuel and on the time elapsed since launch. In mathematics this critical value is called the "free boundary", and it is the key unknown quantity in most stochastic control problems.
In practice, the shape and smoothness of the free boundary (as a function of time and fuel, in the example) are needed to enable efficient (numerical) evaluation of the spaceship's optimal trajectory (e.g. leading to minimal use of fuel). Associated with each control problem there is a "cost function" which measures the performance of the control strategy. In the example this may be taken as proportional to the distance of the spaceship from the target, plus the cost of using fuel. The aim of the controller is to minimise the expected value of this cost. When the optimal control is implemented, the resulting expected cost is called the "value function". The value function is the other main unknown in this context, and the joint study of the value function and the free boundary goes under the name of "free boundary problem" (FBP).
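
As a rough illustration of the threshold rule and the cost described above, the Python sketch below simulates a toy version of the problem. It is not the model studied in the project. It assumes that the uncontrolled offset is a Gaussian random walk, that the boundary is a constant (whereas in the example it depends on time and fuel), and that fuel is unlimited but costly. The function names and parameters (`simulate_threshold_control`, `expected_cost`, `fuel_price`, and so on) are hypothetical and chosen only for this illustration.

```python
import random


def simulate_threshold_control(boundary, n_steps=1000, dt=0.01, sigma=1.0,
                               fuel_price=1.0, seed=0):
    """Simulate one path of the offset under a constant-threshold policy.

    Toy assumptions (not from the project): the uncontrolled offset is a
    Gaussian random walk, the boundary does not depend on time or fuel,
    and fuel is unlimited.
    Returns (total_cost, fuel_used, max_abs_offset).
    """
    rng = random.Random(seed)
    offset = 0.0
    fuel_used = 0.0
    running_cost = 0.0
    max_abs_offset = 0.0
    for _ in range(n_steps):
        # Random perturbation of the offset over one time step.
        offset += sigma * rng.gauss(0.0, dt ** 0.5)
        # Intervene only when the offset exceeds the boundary, and apply
        # the smallest push that brings it back onto the boundary.
        excess = abs(offset) - boundary
        if excess > 0.0:
            fuel_used += excess
            offset = boundary if offset > 0 else -boundary
        # Cost of being away from the target, accumulated over time.
        running_cost += abs(offset) * dt
        max_abs_offset = max(max_abs_offset, abs(offset))
    total_cost = running_cost + fuel_price * fuel_used
    return total_cost, fuel_used, max_abs_offset


def expected_cost(boundary, n_paths=200, **kwargs):
    """Monte Carlo estimate of the expected cost for a constant boundary."""
    total = 0.0
    for seed in range(n_paths):
        cost, _, _ = simulate_threshold_control(boundary, seed=seed, **kwargs)
        total += cost
    return total / n_paths
```

Calling `expected_cost` for several values of `boundary` gives a feel for the trade-off: a small boundary keeps the offset small but uses a lot of fuel, while a large boundary saves fuel but lets the offset drift. In this toy setting, the boundary that minimises the estimate plays the role of the free boundary, and the minimal expected cost plays the role of the value function.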

FBPs are addressed in Analysis and in Probability theory. However, it is often impossible to find a full analogy between results in the two fields. On the one hand, Probability can only explain very limited smoothness of the free boundary and of the value function, but is flexible in modelling randomness in the system. On the other hand, Analysis obtains fine regularity results but mostly under rather inflexible assumptions on the model. In our example above, the analytic approach, based on the theory of partial differential equations (PDEs), gives very accurate optimal controls if the spaceship's trajectory is described by a simple model. However, in practice engineers must deal with a wide class of random perturbations and need a versatile probabilistic approach. The latter must be supported by a refined probabilistic understanding of FBPs, which is the objective of this proposal.

In this project I will develop a new framework for the study of free boundary problems that will hinge on properties of random noises drawn from the class of diffusion processes. My work will provide advanced tools that will not only unify the probabilistic and analytic approaches to stochastic control but, more importantly, will also remove some of the long-standing assumptions in both areas and allow for tractable solutions to a whole new class of applied problems. Moreover, I will obtain ODEs to accurately compute optimal strategies in multi-dimensional settings (see 2.5 in Case for Support). The non-linear integral equations currently in use cannot be solved efficiently in dimensions higher than two.

Planned Impact

This proposal addresses fundamental theoretical questions whose main beneficiaries are likely to be in academia. However, the project will lead to an overall improvement in the design of rather general, optimally controlled systems that often feature in applied problems.

In the medium term (up to 10 years), I will focus on generating tangible impact outside of academia through collaborations with other experts (see also Pathways to Impact). The primary area of interest is energy systems (ES) but I will keep an open mind and continue to explore opportunities for further applications in other disciplines. Potential directions of work may be identified in, e.g., public health (PH) and big-data (BD), among others.
All three themes have high societal and economic relevance both in the UK and internationally.

(ES) A major challenge in electricity markets is to combine conventional generation, renewables and storage, with the aim of efficiently meeting demand and maintaining the stability of the network. A likely beneficiary of improved control models for storage is National Grid, which has recently closed an auction to subsidise back-up capacity from a large network of storage facilities. "The 500 megawatts of new storage projects procured in the auction [...] is by far the biggest uptake of battery technology so far by National Grid" (A. Ward, Energy Editor of Financial Times, 6 Dec. 2016). Storage is used to maintain the system's operating frequency of 50 Hz and, according to C. O'Hara (Director of UK System Operator), "This is the beginning of an exciting new chapter for the industry" (National Grid, press release 26 Aug. 2016). It is therefore realistic to expect that stochastic control models will flourish in this area in the next decades.
Households in the UK will also benefit from applications of efficient algorithms to energy storage because "Enhanced ability to control variations in frequency [...] will result in reduced costs of approximately 200 million pounds [...] meaning reduced costs for the end consumer" (National Grid, press release 26 Aug. 2016). Moreover, the use of efficient storage will also have a long-term impact on the quality of the environment because of a reduction in generation from diesel.

(PH) Stochastic models may be applied to detect and control the spread of infectious diseases (e.g., by facilitating early detection of a new outbreak and by guiding the distribution of treatments). Studies in this direction would benefit the NHS by reducing pressure on staff while improving cost-effectiveness of patients' treatments. Economic and societal effects of improved health care would be perceived more broadly by the population in the UK.
M.J. Keeling and J.V. Ross, J. R. Soc. Interface (2008), 5: 171-181, show that the theory of Markov processes makes it possible to overcome difficulties arising in "traditional stochastic simulations" and to "provide valuable insights" in modelling disease dynamics. They also note that, so far, such Markovian models have been overlooked by applied researchers in the area. Therefore a systemic approach based on controlled Markov processes seems both promising and timely.

(BD) Stochastic control (combined with "filtering") may help optimise learning algorithms from noisy data-sets, while big-data technology may feed into control models for real-world applications. Data-driven control models are needed for rapid detection of disease outbreaks (D.L. Buckeridge et al., J. Biomedical Informatics, 2005, 38(2):99-113) and they may be used by Public Health England (PHE) in the UK (this is also linked to (PH) above).
Retailers, social networks and governing bodies may use control models integrated with big data to optimise the release of new products and policies, based on the public's reactions to new information. This problem relates to the "goodwill problem" in control theory, dating back to M. Nerlove and K.J. Arrow, Economica, 1962, 29:129-142.
 
Description: I provided probabilistic proofs of some key properties of solutions to free boundary problems stemming from optimal stopping problems. For example, I obtained conditions under which the solution of a free boundary problem is continuously differentiable in the whole space and the free boundary is Lipschitz. Moreover, I established links between free boundary problems with gradient constraints and Dirichlet boundary conditions, on the one hand, and optimal stopping problems on multi-dimensional diffusions with reflection and non-standard boundary behaviour, on the other.
Exploitation Route: The regularity results established so far can be used, in combination with properties of diffusion processes, to find fine properties of optimal stopping rules, which could then lead to better numerical methods for optimal stopping/control problems. Moreover, these methods could facilitate the analysis of strategic games of control and stopping, which find several real-world applications in economics and power systems management.
Sectors: Aerospace, Defence and Marine; Energy; Financial Services, and Management Consultancy

URL: https://sites.google.com/site/tizianodeangelis/home/publications-and-preprints