Bandits and beyond: Index heuristics for dynamic resource allocation

Lead Research Organisation: Lancaster University
Department Name: Mathematics and Statistics

Abstract

The research programme is concerned with the development of simple, yet effective methods for determining how some key resource should be distributed over time among a collection of entities which require it. To fix some ideas, consider the example described in the following imagined scenario, referred to in the literature as the 'stochastic multiproduct batch dispatch problem' :Different types of products arrive at a holding station according to some random process. They are kept there until dispatched onward (to a retailer, say) by one of a fleet of vehicles. Costs are incurred by the items kept in storage at the station. The nature of these holding costs may differ markedly between product types. Products may also differ with regard to their space requirements for transportation. At the beginning of each time period (the start of each day, say), a decision has to be made regarding how many vehicles should be dispatched and what the composition of their loads should be. Such decisions may be based in part, say, on the number of units of each product type at the station at the time. The goal is to take such decisions in a way that minimises the overall costs incurred in holding the products and in dispatching the vehicles. To use some mathematical jargon, the decision-maker is looking for a dynamic policy for the allocation of her key resource (ie, a rule for the deployment of the fleet of vehicles) among a set of entities (here the product types) in a situation which is both stochastic (evolves randomly) and complex. How to develop such a policy is extremely difficult, not least because decisions must be assessed not only in terms of their immediate impact on costs but also with regard to their influence on the costs which will be incurred in the future. The conventional approach to such problems, called stochastic dynamic programming, is unlikely to be able to cope well with problems of realistic size . What would be invaluable to the decision-maker would be some effective, yet reasonably simple and computationally tractable, way of calibrating the value to the respective product types of (differing amounts of) space on the vehicles. Such calibrations could then be used to inform decision-making. Simple so-called index-based solutions do indeed exist (and are very effective), but mainly for (bandit-type) problems which impose serious limitations on how the key resource may be distributed among the entities. The goal of the research project is to extend the scope of such simple index-based solution approaches to more general allocation problems which are freed of such restrictions . There is a rich variety of practical situations which share the broad features of the above example. The methodologies developed during the research programme will yield simple and effective index-based approaches to resource allocation in many such settings.
 
Description We have discovered that the notion of indexing, namely the evaluation of decision options by giving each one a simple score, is powerful in a much wider range of scenarios that was thought.
Exploitation Route Development of the theory for application in, say, queueing control and asset management.
Sectors Other