Modelling and Optimisation with Graphs

Lead Research Organisation: University of Glasgow
Department Name: School of Computing Science

Abstract

Optimisation technologies are used to deliver parcels and groceries cheaply and efficiently, to decide who gets a kidney transplant, to identify candidate chemicals for new drugs, to explain the building blocks of life, and to understand disease transmission in livestock. Graphs are the natural way of describing relationships, compatibilities, transport networks, chemical molecules, instructions and interactions in computer programs, and structured patterns; in particular, many important questions ask, in effect, whether certain graphs contain another subgraph, or can be covered by a collection of subgraphs. This project addresses optimisation problems which involve subgraphs, either with other constraints, or with a complex objective. Problem of this nature are common---we will be working with four real-world application areas involving disease transmission in livestock, matching kidney donors to recipients, metabolomics, and computational algebra.

Currently, dealing with these kinds of problems is complex, time-consuming, and requires a specialist. One option would be to select a particular constraint programming (CP), mixed integer programming (MIP), or boolean satisfiability (SAT) solver. These solvers require a problem to be modelled in terms of variables which have domains of values, and constraints between variables; the solver then finds a way of giving each variable a value from its domain, whilst satisfying all the constraints (or the best such way, as defined by some objective function). Different solver technologies have different restrictions upon the types of variables and constraints allowed---for example, SAT solvers support only boolean (true / false) variables and simple logical constraints. None of these technologies support graphs directly, and so the modeller would also have to come up with a way of encoding the graph part of the problem in a way that is compatible with the other constraints and the objective. Unfortunately, even the most experienced modellers are unlikely to make the best choice of solver and encoding on the first attempt, and even after a long development process, current optimisation technologies give performance several orders of magnitude worse than dedicated algorithms when dealing with subgraph problems.

Another alternative would be to implement a simple algorithm manually---but this is time-consuming and error-prone, and without a huge development effort the performance is again unlikely to be sufficient except for very small inputs. To address this, one might try to adapt a state-of-the-art algorithm from the literature to handle side constraints, but this has an even larger development cost: the theories underlying recently published algorithms are specific to certain variants of subgraph problems (e.g. induced or non-induced, sparse or dense, with particular injectivity and labelling rules, ...) and are not immediately compatible with arbitrary side constraints. Also, publicly available implementations of these algorithms are tailored to support scientific investigation, rather than being engineered for use in a production environment.

The aim of this project is to address the difficulties in solving graph-based optimisation problems from two directions. At the high level, we will provide modelling support for working with graphs directly: being able to express graph problems in a high level optimisation modelling language will make models easier to produce, understand, teach, and verify, and will make optimisation and subgraph technologies accessible to a wider academic and industrial audience. At the low level, we will improve solver support for graph-based and hybrid models: co-operation between general purpose optimisation solvers and dedicated subgraph algorithms will deliver better performance and scalability than any one technique can on its own, and will introduce new opportunities for automatic reformulation and constraint inference.

Planned Impact

By providing an open source, flexible subgraphs library, we will allow software engineers and researchers in other disciplines to make use of cutting edge subgraph algorithmics research, which is currently too inflexible for most real-world applications and too costly for most potential users to implement themselves.

By extending an optimisation modelling language and toolkit, we will make powerful techniques for solving optimisation problems involving graphs accessible to non-specialists such as junior developers, scientists, operations researchers, and engineers.

Our scientific and mathematical impact will include discovering the appropriate graph invariants to allow filtering and inference for generalised subgraph problems, extending our understanding of how learning constraint solvers work, advancing the state of the art in automated translation from high to low level models (with a particular emphasis on high level types), and developing new techniques for hybrid multi-solver optimisation.

Finally, we will be working directly to support and enhance research in four existing application areas, with the aim of advancing the state of the art in each area. These applications are:

Disease Transmission in Livestock: Mathematical modelling involving graphs has been key to understanding the spread of many economically important diseases, including the major foot-and-mouth disease outbreak in GB in 2001. Our technology will support and enhance ongoing research in this area, by allowing faster answers to more complex and larger scale policy questions.

Computational Algebra: The GAP system is widely used in mathematics research. Additionally, systems such as SageMath use GAP internally for many calculations. GAP provides the only supported and open-source implementations of many fundamental mathematical algorithms. Thus deploying enhancements to GAP directly benefits academic and industrial users.

Kidney Exchange: David Manlove's work on matching donors to patients who require a kidney transplant has been used by NHS Blood and Transplant since 2008, and has informed UK medical policy. This has enabled at least 134 kidney transplants to date, leading to enormous quality of life improvements for the recipients, as well as substantial economic benefits to the NHS - taking into account the cost of the transplant and the cost saving of a transplant compared to dialysis over a period of ten years. Our research will allow for simpler models for the kidney exchange problem. As well as increasing transparency and decreasing the reliance upon domain experts, this will allow policy questions such as alternative optimality criteria and additional rich constraints to be investigated with much less effort.

Metabolomics: Metabolomics plays an important role in the life sciences, from drug discovery to synthetic biology. Improving metabolite identification would result in metabolomics being more widely used across the life sciences, and would accelerate discoveries from mass spectrometry experiments. This would ultimately have benefits in health, crop efficiency, and food security (through the ability to better engineer new strains). Our pathway to improving the computational side of metabolomics experiments is via Dr Simon Rogers at the University of Glasgow and his collaborators at Glasgow Polynomics. Any improvements to the computational side of this problem will be communicated directly to practitioners.

To ensure that the benefits we provide are not limited to the application areas above, we will engage with technology users. First, we will engage with the CP, optimisation, AI, and combinatorics research communities via conferences and workshops. We will also run tutorials, describing the tools and algorithms developed, and how to use them. Providing open source implementations of our developed technologies is critical to ensuring adoption by researchers both within and beyond computing science, and in industry.
 
Description We have incorporated proof-logging into some constraints and our subgraph solvers. This allows us to prove correctness of implementations and is a significant step towards explainable and dependable AI.

We have designed, implemented and made available a suite of solvers for subgraph problems.
Exploitation Route We have new techniques for robust algorithm design and implementation which are broadly applicable.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://github.com/ciaranm/glasgow-subgraph-solver
 
Title ciaranm/certified-constraint-solver: AAAI2020 
Description This version of the code is associated with the AAAI 2020 paper "Justifying All Differences Using Pseudo-Boolean Reasoning". 
Type Of Technology Software 
Year Produced 2019 
URL https://zenodo.org/record/3549711