Overcoming the curse of dimensionality in dynamic programming by tensor decompositions

Lead Research Organisation: University of Bath

Department Name: Mathematical Sciences

Abstract

Almost 60 years ago Richard Bellman coined the expression ``curse of dimensionality'' when referring to the overwhelming computational complexity associated with the solution of multi-stage decision processes through dynamic programming (DP), leading to the well-known Bellman equation.
Nowadays the curse of dimensionality has become an ubiquitous expression shared in different fields such as numerical analysis, compressed sensing and machine learning. However, it is in the computation of optimal feedback laws for the control of dynamical systems where its meaning continues to be most evident.
Consider a simple pendulum, which is characterised by two variables, the position of the mass, and its velocity.
A classical demonstration is the stabilisation of the pendulum in the unstable upwards position by moving the base adaptively.
To synthesize the control signal for the actuator of the base that would be robust to stochastic fluctuations (e.g. from the air),
one needs to compute a feedback map as a two-dimensional function of the position and velocity.
This feedback map satisfies the two-dimensional partial differential equation (PDE), which has no analytic solution in general.
Solving it numerically requires a discretisation of both the position and velocity with some n points.
However, the total number of unknowns in the feedback map is n^2, corresponding to all combinations of the position and velocity points,
and therefore the computations take longer.

For d variables, the complexity of the straightforward numerical solutions grows exponentially as n^d.
Even a simple quadrocopter model is described by 12 variables, and modern applications in particle physics or opinion dynamics require optimal control of dynamical systems, which are in turn PDEs, discretised with hundreds or thousands of variables.
However, such systems are often very special in the sense that each variable is driven effectively by only neighbouring variables in the dynamics.
In this case, the sought feedback map admits a tensor approximation with (almost) separated variables.
This allows us to reduce the computational complexity dramatically, to a number of operations growing only polynomially with the number of variables, albeit at a price of introducing some error.
This error can be further corrected by using methods from reinforcement learning, similar to those that made the winning artificial Go player.

Funded Value:

£202,430

Funded Period:

May 21 - May 23

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/V04771X/1

Principal Investigator:

Sergey Dolgov

Research Subject:

Mathematical sciences (100%)

Research Topic:

Non-linear Systems Mathematics (100%)

Organisations

University of Bath (Lead Research Organisation)

People	ORCID iD
Sergey Dolgov (Principal Investigator)
Dante Kalise Balza (Co-Investigator)	http://orcid.org/0000-0003-2327-1957

Publications

Author Name Title

Publication Date Published

|< < 1 2 > >|

10 25 50

Antil H (2023) State-constrained Optimization Problems under Uncertainty: A Tensor Train Approach

Alla A (2023) State-dependent Riccati equation feedback stabilization for nonlinear PDEs in Advances in Computational Mathematics

Dolgov S (2021) Tensor Decomposition Methods for High-dimensional Hamilton--Jacobi--Bellman Equations in SIAM Journal on Scientific Computing

Wells C (2022) The role of airspeed variability in fixed-time, fuel-optimal aircraft trajectory planning in Optimization and Engineering

Antil H (2022) TTRISK: Tensor train decomposition algorithm for risk averse optimization in Numerical Linear Algebra with Applications

Dutta R (2021) Using mobility data in the design of optimal lockdown strategies for the COVID-19 pandemic. in PLoS computational biology

Key Findings
Software and Technical Products


Description	Computational algorithms are developed for model reduction and optimal feedback control synthesis for large-scale dynamical systems, which can be nonlinear and stochastic. The methods have been tested on consensus stabilization in the interacting particle swarm and vorticity minimisation in fluid dynamics. In the latter in particular, the resulting control signal has provided a more laminar flow compared to state of the art feedback control methods.
Exploitation Route	The control function precomputed by the developed algorithms allows fast evaluation of the control signal at the given state of the system. This can be programmed into a low-performance hardware controlling e.g. fluid flows in real time.
Sectors	Digital/Communication/Information Technologies (including Software) Electronics
URL	https://github.com/saluzzi/TT-Gradient-Cross


Title	TT-Gradient-Cross
Description	Tensor Train (TT) Gradient Cross algorithm for the resolution of Hamilton Jacobi Bellman equation.
Type Of Technology	Software
Year Produced	2022
Open Source License?	Yes
Impact	none so far
URL	https://github.com/saluzzi/TT-Gradient-Cross


Title	TT-HJB
Description	Matlab package to form and solve Hamilton-Jacobi-Bellman (HJB) equations for the optimal feedback control synthesis for stochastic dynamical systems.
Type Of Technology	Software
Year Produced	2020
Open Source License?	Yes
Impact	No notable impacts traced yet, but we have received a few requests from PhD students working on similar problems who would like to build on our code.
URL	https://github.com/dolgov/TT-HJB

Abstract

Organisations

People

ORCID iD

Publications