Exploiting sparsity in large-scale optimization
Lead Research Organisation:
Science and Technology Facilities Council
Department Name: Scientific Computing Department
Abstract
Optimization seeks to find the best combination of parameters that will result in the best possible outcome according to some given criterion. More precisely, a mathematical optimization problem consists of a set of defining parameters that describe the system, together with an objective that depends on the parameters and determines how well the system is performing. In addition, there may be (and often are in practice)
conditions that constrain the allowable parameter values.
Optimization problems arise everywhere in quantitative disciplines, ranging from computer science and engineering to operations research and economics. Common everyday applications include minimizing production costs, maximizing profits, and minimizing required resources. The importance of optimization is highlighted by EPSRC theme areas in engineering, including control, mechanical, process and water engineering, and instrumentation. Because the need to solve optimization problems is so widespread, improvements in the performance (in terms of time and/or reliability) of optimization algorithms have potentially far-reaching benefits for research, industry, finance, healthcare and beyond.
In many practical situations, the number of parameters is large, making the optimization problem challenging to solve. However, it is frequently possible to structure the problem so that the interactions between parameters are local, that is, each parameter interacts directly with only a small number of other parameters. This leads to what is known as sparsity. Many large-scale optimization problems are naturally sparse, and for efficiency it is essential that this sparsity is exploited in the development of solution algorithms. Indeed, despite the availability of powerful modern computers, many problems are computationally intractable unless sparsity is exploited.
Optimization algorithms often require access to first and second derivatives of functions within the mathematical model. Unfortunately, it can be difficult to provide the second derivatives needed to obtain so-called Hessian matrices. The aim of this project is to improve the efficiency and reliability of large-scale optimization by proposing a new approach for constructing approximations to Hessian matrices. Central to this will be formulating the problem as a very large sparse system of linear equations, which may be over-determined or under-determined (that is, there may be more equations than unknowns or fewer equations than unknowns). This formulation will allow the exploitation of ideas from sparse numerical linear algebra.
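As a rough illustration of the kind of formulation involved (a minimal sketch only, not the project's algorithm: the function name, the use of NumPy and the dense storage are all choices made here for brevity), each row of a sparse Hessian can be estimated from gradient differences at past iterates by solving a small linear least-squares problem restricted to that row's known nonzero pattern:

```python
import numpy as np

def approx_sparse_hessian(S, Y, pattern):
    """Component-wise secant approximation to a sparse Hessian.

    S : (n, m) array whose columns are step differences      s_j = x_{j+1} - x_j
    Y : (n, m) array whose columns are gradient differences  y_j = g_{j+1} - g_j
    pattern : list of index arrays; pattern[i] gives the column indices of the
              entries of row i of the Hessian that are allowed to be nonzero.

    Returns a dense (n, n) array B with the prescribed sparsity whose rows
    approximately satisfy the secant conditions B s_j = y_j.  (Symmetry is not
    enforced in this simple sketch.)
    """
    n, m = S.shape
    B = np.zeros((n, n))
    for i in range(n):
        cols = np.asarray(pattern[i])
        # Row i of the secant conditions, restricted to its nonzeros:
        #   sum_{k in cols} B[i, k] * S[k, j]  ~=  Y[i, j],   j = 0, ..., m-1.
        # This is an m-by-|cols| linear system: over-determined if many past
        # iterates are kept, under-determined if only a few are available.
        row, *_ = np.linalg.lstsq(S[cols, :].T, Y[i, :], rcond=None)
        B[i, cols] = row
    return B

# Tiny check on a quadratic f(x) = 0.5 x^T H x with a known tridiagonal Hessian.
rng = np.random.default_rng(0)
n, m = 6, 8
H = np.diag(np.full(n, 4.0)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
X = rng.standard_normal((n, m + 1))          # m + 1 past iterates
G = H @ X                                    # their gradients
S, Y = np.diff(X, axis=1), np.diff(G, axis=1)
pattern = [np.nonzero(H[i])[0] for i in range(n)]
B = approx_sparse_hessian(S, Y, pattern)     # recovers H up to rounding error
print(np.allclose(B, H))
```

For a quadratic objective the secant conditions hold exactly, so the sketch recovers the true Hessian once enough iterates are available; in general the row-wise systems may be over- or under-determined, and it is in forming and solving such very large sparse systems efficiently that ideas from sparse numerical linear algebra enter.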
The outcomes of the project will not only be new algorithms and theory but also implementations in high-quality software that will be fully supported and maintained as part of our internationally-renowned mathematical software libraries. This will enable wide take-up of the new ideas, well beyond the mathematics research community.
Publications
Fowkes J (2023) Approximating sparse Hessian matrices using large-scale linear least squares, in Numerical Algorithms.
| Description | Optimization problems are ubiquitous. Large-scale optimization algorithms frequently require sparse Hessian matrices that are not readily available. Existing methods for approximating large sparse Hessian matrices either do not impose sparsity or are computationally prohibitive. To overcome these limitations, this project proposed novel approaches that seek to satisfy as many componentwise secant equations as are necessary to define each row of the Hessian matrix. Test problems from real applications have been used to demonstrate the effectiveness and robustness of the new algorithms, which have also been incorporated into our freely available optimization software. |
| Exploitation Route | The new algorithms are now part of the GALAHAD software library and so can be used by others. |
| Sectors | Other |
| Title | Modern Fortran package SHA available as part of the GALAHAD library |
| Description | The SHA package finds a component-wise secant approximation to the Hessian matrix using values of the gradient of the objective function. More specifically, given iterate and gradient differences, the package aims to find an approximation to the Hessian for which the component-wise secant conditions hold for a chosen set of past iterates. The methods provided in SHA for this purpose take advantage of the entries in the Hessian that are known to be zero. The SHA package is particularly intended to allow gradient-based optimization methods to build a suitable approximation to the Hessian, giving such methods the opportunity to accelerate the iteration using the generated approximation (an illustrative sketch follows this table). |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | SHA has been incorporated into the GALAHAD optimization library and can thus be used by the large number of optimization solvers present in GALAHAD. |
| URL | https://github.com/ralna/GALAHAD |
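To illustrate how such an approximation might be used by a gradient-based method (the sketch referred to in the description above), the following hypothetical example rebuilds a component-wise secant Hessian approximation from recent iterate and gradient differences and uses it to take Newton-type steps on a small quadratic test problem. It is written against plain NumPy and is not the GALAHAD SHA interface; step-size control and the other safeguards a real method needs are omitted.

```python
import numpy as np

def approx_sparse_hessian(S, Y, pattern):
    """Row-wise least-squares fit of the secant conditions B s_j ~ y_j
    (same idea as the earlier sketch, repeated so this example runs alone)."""
    n, m = S.shape
    B = np.zeros((n, n))
    for i in range(n):
        cols = np.asarray(pattern[i])
        row, *_ = np.linalg.lstsq(S[cols, :].T, Y[i, :], rcond=None)
        B[i, cols] = row
    return B

# Test problem: minimise f(x) = 0.5 x^T H x - b^T x with a tridiagonal (sparse) H.
rng = np.random.default_rng(1)
n = 10
H = np.diag(np.full(n, 4.0)) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
b = rng.standard_normal(n)
grad = lambda x: H @ x - b
pattern = [np.nonzero(H[i])[0] for i in range(n)]   # known sparsity of each row

x = rng.standard_normal(n)
g = grad(x)
steps, ydiffs = [], []
for it in range(50):
    if len(steps) >= 3:   # enough history to determine each row (at most 3 nonzeros)
        B = approx_sparse_hessian(np.column_stack(steps), np.column_stack(ydiffs), pattern)
        d = np.linalg.solve(B, -g)           # Newton-type step with the approximation
    else:
        d = -0.1 * g                         # small gradient step while history builds
    x_new = x + d
    g_new = grad(x_new)
    steps.append(x_new - x)                  # record the iterate difference s
    ydiffs.append(g_new - g)                 # and the gradient difference y
    steps, ydiffs = steps[-5:], ydiffs[-5:]  # keep only a short window of history
    x, g = x_new, g_new
    if np.linalg.norm(g) < 1e-8:
        break
print(f"converged after {it + 1} iterations, ||grad|| = {np.linalg.norm(g):.2e}")
```

Because the test problem is quadratic, the secant conditions are exact and the iteration finishes in a single Newton-type step once enough history has been gathered; on general problems the quality of the acceleration depends on how well the recent secant information reflects the current Hessian.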
