ADEPT: Adaptive Dynamic Ensemble Prediction Techniques

Lead Research Organisation: University of Manchester
Department Name: Computer Science

Abstract

Predicting unknown quantities is a fundamental part of science andengineering. For example, in medicine one might wish to predict whether aperson has a cancerous tumour or not based on a scan; or in manufacturing, whether an industrial machine is producing faulty devices or not. The field of Artificial Intelligence has studied many techniques to produce good predictors. The last decade of research has seen the development of population-based techniques. Instead of using a single predictor, these build teams of predictors and combine the decisionsof the individuals through a voting or averaging process. Both theory andexperiments show this reliably improves upon using a single predictor --as they say two heads are better than one. A nice feature is that thesemethods are predictor-independent, meaning they can combine any kind ofpredictors (e.g. neural networks, decisions trees) into a team.This project aims to unify two sub-fields of Artificial Intelligence thatdeal with these population-based predictor-independent techniques:Ensemble Methods and Learning Classifier Systems.Ensemble Methods have produced some of the most powerful predictors ofthe last decade; the most well-known is called AdaBoost , and has beendubbed the best off-the-shelf predictor in the world (Professor LeoBreiman, University of California at Berkeley). These methods have beenwidely applied in many areas; however, one important area not yetinvestigated is multi-step problems. These are problems where decisionsin thepast and present can affect what the best decisions in the future willbe---for example choosing to play a certain opening strategy in chessmeans certain moves are less favourable later on in the game. Our mostdifficult multi-step problem will be optimising elevator scheduling tominimise the amount of time between pressing an elevator call button andthe arrival of the elevator. It is surprisingly difficult to optimise themovement of elevators in a large building. For one thing, a building with5 elevators and 30 floors has more possible configurations than thereare grains of sand on all the beaches in the world. Most ensemble methodscannot be directly applied to this kind of problem.Learning Classifier Systems are a class of nature-inspired algorithms, thatcan dynamically generate and adjust sets of predictors, and are capableof tackling these multi-step problems. Traditional ensemble methodshave not considered the multi-step domain, but have strong theoreticalfoundations to build upon. Learning Classifier Systems do not have sucha strong theory base, but have been intensely studied on multi-step problems.This project will create hybrid methods using theory and practice fromthese two quite disparate fields. We will advance the state-of-the-artin both fields and increase research capacity for tackling several problemclasses, focusing in particular on multi-step problems.

Publications

10 25 50
 
Description ADEPT (Adaptive Dynamic Ensemble Prediction Techniques) aimed to capitalize on the synergy between three seemingly diverse fields: evolutionary computation, ensemble learning, and probabilistic modelling. We developed a number of results which could never have been possible without all three influences. The result was a number of entirely new formalised models of learning, enabling new directions for the research community.

In more technical terms, the project's primary goal was to take a heuristic technique from the Evolutionary Computation literature - Learning Classifier Systems - and translate it into an ensemble-based probabilistic model. The probabilistic model we developed can precisely reproduce the capabilities of the LCS - an online supervised learning system, continuously adaptive, maintaining a parsimonious set of human-interpretable rules. However, the new model stands apart from the parameter-laden heuristic nature of LCS, having the advantages of a statistical underpinning: flexibility and a solid probabilistic foundation.
Exploitation Route The primary impact was on the research community - the results have been taken forward in other projects housed at Manchester and Bristol.
Sectors Digital/Communication/Information Technologies (including Software),Healthcare,Pharmaceuticals and Medical Biotechnology

URL http://www.cs.man.ac.uk/~gbrown/adept/
 
Description The main results have been implemented in a publicly available data mining framework called SPARK, and are being used by Oracle Research Labs, USA in a product.
First Year Of Impact 2012
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Title FEAST 
Description FEAST provides efficient implementations of "feature selection" algorithms, as developed during the project. These algorithms are widely seen as the first step in any "big data" analysis, hence are very widely used. 
Type Of Technology Software 
Year Produced 2012 
Open Source License? Yes  
Impact Oracle Research Labs have adopted many of the techniques internally for their big data applications. 
URL http://mloss.org/software/view/386/