Algorithms for Computing with Uncertainty: Theory and Experiments

Lead Research Organisation: University of Leicester
Department Name: Computer Science

Abstract

How much information should we collect before making a decision? This question underlies the research area of computing with explorable uncertainty. For example, assume that we want to build a network connecting a set of branch offices. For any two locations A and B of branch offices, we have an estimate of the cost for building a link between A and B based on the distance between them. The exact cost of building a link between A and B can be determined by further investigations, but these investigations take time and cost money. If we knew the exact link cost for every pair of locations, we could determine the cheapest way of building the network using a known algorithm for the "minimum spanning tree" problem. The approach of first determining for all pairs of locations the exact cost of building a link between them is not efficient, however: It will take a long time to determine all the exact link costs, and the costs for obtaining that information will be significant. It is therefore desirable to find efficient methods for selecting in a clever way the pairs of locations for which the link costs need to be determined, while still achieving the goal of being able to build a cheap network with the information gained. Algorithms for computing with explorable uncertainty solve such problems: They specify a strategy for selecting the pairs of locations for which exact information should be determined until sufficient information has been gained to determine the best possible network to be built. More generally, computing with explorable uncertainty deals with problems where part of the input is uncertain (known only approximately) but can be obtained at a cost using a query operation.
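
To make the query model concrete, the following sketch shows the simplest instance of explorable uncertainty: determining the minimum of a set of uncertain values. The code is illustrative only (the class and the greedy rule are our own choices, not an algorithm from the project); it queries intervals until no unqueried interval can still undercut the best value revealed so far.

    # Minimal sketch of the explorable-uncertainty model (illustrative,
    # not the project's implementation). Each uncertain input is an
    # interval guaranteed to contain the true value; query() reveals the
    # true value at a cost.

    class UncertainValue:
        def __init__(self, low, high, true_value):
            assert low <= true_value <= high
            self.low, self.high = low, high
            self._true = true_value
            self.revealed = None  # exact value, once queried

        def query(self):
            self.revealed = self._true
            return self.revealed

    def find_minimum(values):
        """Query until the minimum value is certain; return (min, #queries).

        Greedy baseline: while some unqueried interval could still
        undercut the best revealed value, query the unqueried interval
        with the smallest lower bound.
        """
        queries = 0
        while True:
            best = min((v.revealed for v in values if v.revealed is not None),
                       default=float("inf"))
            open_vals = [v for v in values if v.revealed is None]
            if all(v.low >= best for v in open_vals):
                return best, queries  # nothing can beat 'best' any more
            min(open_vals, key=lambda v: v.low).query()
            queries += 1

With intervals such as [1,9], [4,6] and [7,8], the rule stops after at most two queries: the interval [7,8] is never queried, because its lower bound can never lie below a value revealed in [1,9] or [4,6].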

Previous work on computing with uncertainty has focused on the setting where queries are made one by one sequentially (which may take a long time) and where the goal is to make as few queries as possible while ensuring that sufficient information is obtained to solve the problem optimally. This leaves open the question of how the queries should be selected if a number of queries can be made at the same time in parallel, which is realistic in many applications (for example, in the application outlined above, the exact costs of building links between several pairs of branch office locations could be determined in parallel). Another direction that has not yet been sufficiently considered is the setting where the goal is to optimize a combination of the query cost and the cost of the solution determined in the end. The project aims to take research in computing with explorable uncertainty to the next level by addressing these open questions and developing new algorithms that work provably well in the described scenarios. Methods developed in the project can potentially be useful in any decision-making scenario where additional information about the input data of a problem is available in principle and can be obtained at a cost.

Planned Impact

The project is concerned with foundational research that aims to find new algorithmic solutions to the problem of deciding for which parts of an uncertain input additional information should be requested (via queries) before a solution can be computed. Therefore, we expect the direct impact of the project to be mostly academic, but there is potential for wider impact on industry and society in the longer term.

One impact of the project will be the training of a PDRA as well as a number of Master's and PhD students. These early-career researchers will be equipped with a unique combination of expertise in modelling and designing algorithms for computing with explorable uncertainty that they can bring to the organisations where they continue their careers.

To enable impact of the algorithms designed in the project, all algorithm implementations created for the project's experimental evaluations, as well as the data sets used in those evaluations, will be made freely available as open source via a code repository site such as GitHub. This removes barriers for other researchers from academia or industry who want to use or build on the algorithms designed in the project.

We will also maintain a project website that describes the research conducted in the project, together with links to the project's publications and to the source code of algorithm implementations created in the project. The website will also contain explanations of the project topics and results written so as to be accessible to a general audience, as a means towards public engagement. We will offer lectures related to the research done in the project at Open Days or Schools' Taster Conferences on campus or during school visits. Furthermore, we will offer an (unpaid) summer placement to a high-school student on a topic related to the project each year.

Finally, we briefly sketch some application areas where research done in the project could potentially lead to long-term impact. The problem of having to deal with situations where there is uncertainty about some parts of the input to a computational problem or a decision-making problem arises in numerous real-world situations. In many such scenarios, it is possible in principle to acquire additional information about the uncertain parts of the input, but it is infeasible or too costly to obtain precise information about the whole input. For example, the Internet of Things will make it possible in principle to collect data from millions or billions of devices, but it will be necessary to restrict the data collection to a limited number of data sources due to limitations in network capacity and processing capability. Deciding which data to collect in order to solve a given task can be seen as the problem of making queries to replace uncertain information with exact values. Another example is given by drone-based search and rescue operations, where uncertainty about the location of a missing climber or hill walker can be reduced using drones equipped with cameras and other sensors to search an area. The algorithms that will be developed in the project can provide the basis for such tasks in real-world application contexts, with the benefit of reducing the resources required for obtaining additional information before making decisions. Throughout the project duration and afterwards, we will seek contact with industry representatives who attend conferences or AlgoUK workshops to discuss applications of our research as well as industrial problems that can inform our research directions. We will also use direct contacts in industry (via our departmental contacts as well as the University of Leicester's Innovation Hub) for this purpose. When suitable opportunities are identified, we will determine which form of knowledge transfer activity is most appropriate; for example, we may apply for knowledge transfer partnerships (KTPs).
 
Description The project investigates settings where some input values are only given as uncertainty intervals that are guaranteed to contain the precise value, and queries can be made to obtain the precise values. The goal is to minimise the cost for queries while ensuring that sufficient precise information is obtained to solve the given problem. The key findings of the first half of the project (before the transfer to Durham University) are:

One of the main questions was how to deal with settings where a number of queries can be made in parallel. The project obtained the first results for this model, presenting algorithms that guarantee a close-to-optimal number of query rounds for the sorting problem, for the problem of finding the i-th smallest value, and for the problem of finding minima of overlapping sets. The findings provide new techniques for selecting effective sets of queries that can be asked in parallel.
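
The flavour of the round model can be conveyed by a batched variant of the minimum-finding sketch above (again our own illustration; the project's round-competitive algorithms are considerably more refined). A parameter k caps how many queries are issued in parallel per round, so k = 1 recovers the sequential greedy rule, while larger k trades extra queries for fewer rounds.

    # Illustrative round-based minimum finding (not the project's
    # algorithm): up to k queries per round, counting rounds rather
    # than individual queries. 'values' are UncertainValue objects
    # as in the earlier sketch.

    def find_minimum_in_rounds(values, k):
        rounds = 0
        while True:
            best = min((v.revealed for v in values if v.revealed is not None),
                       default=float("inf"))
            # Only intervals that could still undercut 'best' matter.
            candidates = [v for v in values
                          if v.revealed is None and v.low < best]
            if not candidates:
                return best, rounds
            # One parallel round: query the k most promising candidates.
            for v in sorted(candidates, key=lambda v: v.low)[:k]:
                v.query()
            rounds += 1

Setting k = len(values) finishes in a single round but may spend queries that a sequential strategy would have avoided; the algorithms developed in the project select their batches far more carefully to keep the number of rounds close to optimal.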

For a setting where queries are made one by one but each precise value is drawn from a known probability distribution over its uncertainty interval, the project has developed algorithms whose expected query cost is provably close to the expected cost of the optimal query set, again for the problem of finding minima of overlapping sets. The findings also show that the classical vertex cover problem plays a crucial role in this setting.
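
The vertex cover connection can be made tangible as follows (a sketch under our own simplifying assumptions, not the project's algorithm). When two intervals that compete for being a set's minimum overlap, at least one of the two must typically be queried, so the conflicting pairs form the edges of a graph and every feasible query set must cover those edges. The classical maximal-matching heuristic then gives a vertex cover, and hence a query set, of at most twice the minimum size.

    # Sketch (our illustration): pick queries by covering conflict edges
    # with the maximal-matching 2-approximation for vertex cover.
    # 'conflicts' is a list of (i, j) pairs of interval indices whose
    # intervals overlap within some set.

    def query_set_via_vertex_cover(conflicts):
        cover = set()
        for i, j in conflicts:
            if i not in cover and j not in cover:
                # Edge not covered yet: take both endpoints.
                cover.update((i, j))
        return cover

    # Example: conflicts [(0, 1), (1, 2), (3, 4)] yield the query set
    # {0, 1, 3, 4}, at most twice the size of the optimal cover {1, 3}.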

Furthermore, the project studied a scheduling problem where the processing time of a job can be reduced by an uncertain amount if an optional test operation (that takes unit time) is applied to the job. This is motivated by real-world scenarios such as code optimization or file compression. The findings are deterministic and randomised algorithms that produce schedules in which the average completion time of the jobs is provably close to that of the best possible schedule.
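
The trade-off can be illustrated with a simple threshold rule (the concrete threshold and sequencing below are our own illustrative choices, not the algorithm analysed in the project): a job is tested only if its untested processing time is large enough that the unit of testing time is likely to pay off, and jobs are then run shortest-first.

    # Sketch of scheduling with testing (illustrative). A job has an
    # upper bound u on its processing time; testing takes 1 unit and
    # reveals the true time p <= u. An untested job must run for u.
    # Simplification (our assumption): each test runs immediately
    # before its job, so a tested job occupies 1 + p time units.

    def schedule_with_testing(jobs, threshold=2.0):
        """jobs: list of (u, p) pairs with p <= u; threshold is a
        tunable parameter. Returns the total completion time."""
        durations = []
        for u, p in jobs:
            if u > threshold:                # test only 'long' jobs
                durations.append(1 + p)      # 1 unit test, then run for p
            else:
                durations.append(u)          # run untested at upper bound
        total, now = 0, 0
        for d in sorted(durations):          # shortest-first ordering
            now += d
            total += now
        return total

    # Example: schedule_with_testing([(5, 1), (1.5, 1.5), (4, 4)])
    # tests the first and third jobs; testing pays off for the first
    # (duration 2 instead of 5) but not for the third (5 instead of 4).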

The key findings for the whole project (including the findings obtained after the transfer to Durham University) will be reported in the report for EP/S033483/2.
Exploitation Route The insights on good query selection strategies for problems with input uncertainty could potentially be used in software systems that support decision making under uncertainty in application settings where it is possible to obtain precise information about each uncertain input value at a cost. This might help reduce the number of costly query operations needed in such settings.
Sectors Digital/Communication/Information Technologies (including Software)

URL https://sites.google.com/view/thomas-erlebach/home/acute
 
Description Bremen and Paris (Project collaborators) 
Organisation Sorbonne Universités
Country France 
Sector Academic/University 
PI Contribution Joint research on computing and scheduling with uncertainty.
Collaborator Contribution Joint research on computing and scheduling with uncertainty.
Impact Christoph Dürr, Thomas Erlebach, Nicole Megow, Julie Meißner: An Adversarial Model for Scheduling with Testing. Algorithmica 82(12): 3630-3675 (2020)
Evripidis Bampis, Christoph Dürr, Thomas Erlebach, Murilo Santos de Lima, Nicole Megow, Jens Schlöter: Orienting (Hyper)graphs Under Explorable Stochastic Uncertainty. ESA 2021: 10:1-10:18
Thomas Erlebach, Murilo Santos de Lima, Nicole Megow, Jens Schlöter: Learning-Augmented Query Policies for Minimum Spanning Tree with Uncertainty. ESA 2022: 49:1-49:18
Start Year 2016
 
Description Bremen and Paris (Project collaborators) 
Organisation University of Bremen
Country Germany 
Sector Academic/University 
PI Contribution Joint research on computing and scheduling with uncertainty.
Collaborator Contribution Joint research on computing and scheduling with uncertainty.
Impact Christoph Dürr, Thomas Erlebach, Nicole Megow, Julie Meißner: An Adversarial Model for Scheduling with Testing. Algorithmica 82(12): 3630-3675 (2020)
Evripidis Bampis, Christoph Dürr, Thomas Erlebach, Murilo Santos de Lima, Nicole Megow, Jens Schlöter: Orienting (Hyper)graphs Under Explorable Stochastic Uncertainty. ESA 2021: 10:1-10:18
Thomas Erlebach, Murilo Santos de Lima, Nicole Megow, Jens Schlöter: Learning-Augmented Query Policies for Minimum Spanning Tree with Uncertainty. ESA 2022: 49:1-49:18
Start Year 2016