Machine Learning for Thread Level Speculation on Multicore Architectures

Lead Research Organisation: University of Manchester
Department Name: Computer Science

Abstract

Computer hardware has arrived in the era of multi-core systems. Processors with 2 and 4 cores are already on the high street, and chip manufacturers promise to deliver many more cores per chip in the coming years. The big research challenge is: how can we make best use of all these resources? Existing programs and programming styles are unable to take real advantage of this hardware concurrency.

Thread-Level Speculation (TLS) is one viable solution. TLS works by making predictions about future computations and proceeding to execute programs 'speculatively', as if these predictions were true. As a backup, it checks the predictions in parallel with the speculative computation. If the predictions turn out to be correct, then the computer has done useful work earlier than it could have done otherwise - ultimately meaning your programs run faster. On the other hand, if the predictions are false, then the system has to throw away results, and the speculative work is wasted.

There are many different factors to consider in this new paradigm. TLS influences different parts of the system, including the processor, memory, operating system, programming language and compiler. At each of these levels there are various policies and heuristics to set. These affect things like how to make predictions about the future, how to stop different computational tasks from interfering with each other, how to decide which threads are more important, and how existing optimization techniques interact with speculation. This research project will explore these factors using Machine Learning. We will use state-of-the-art feature selection and online machine learning techniques, developing the field where necessary, with the ultimate goal of creating a computer system that can automatically tune itself to run its programs as fast as the physical resources will allow.
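The speculate-then-validate cycle described above can be sketched in a few lines. The following is a minimal, hypothetical illustration in Python (not the project's system): a dependent stage is started early on a predicted input while the real input is computed in parallel, and the early work is kept only if the prediction was correct. The function and variable names are invented for the example.

    from concurrent.futures import ThreadPoolExecutor

    _last_result = 0                      # a simple "last value" predictor

    def producer(x):
        # the computation whose result the next stage depends on
        return sum(i * i for i in range(x))

    def consumer(y):
        # the dependent computation we would like to start early
        return y % 97

    def run_speculatively(x):
        global _last_result
        guess = _last_result                            # predict: "same as last time"
        with ThreadPoolExecutor(max_workers=2) as pool:
            speculative = pool.submit(consumer, guess)  # start the consumer early, on the guess
            actual = producer(x)                        # compute the real value meanwhile
        _last_result = actual                           # update the predictor
        if actual == guess:
            return speculative.result()                 # prediction correct: reuse the early work
        return consumer(actual)                         # misprediction: discard and redo

    print(run_speculatively(10))   # first call almost certainly mispredicts
    print(run_speculatively(10))   # repeated call: the prediction holds, speculation pays off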

Publications

 
Description Computer hardware has arrived in the era of "multicore" systems. Processors with 2 and 4 cores are already on the high street, and chip manufacturers promise to deliver many more cores per chip in the coming years. The big research challenge is: how can we make best use of all these resources? This project exploited Machine Learning technologies and perspectives to address the challenge.

During the project we improved the state of the art for TLS at both the hardware and software level, with novel techniques for Java- and C-based programming languages. The research probed the limits of what could be achieved using TLS with the incorporation of 'idealised' machine learning techniques. This enabled us to direct effort accordingly and to guide our resource allocations. In concert with this contribution to the systems community, we also developed fundamentally novel research for the machine learning community: state-of-the-art "feature selection" techniques, tuned for the parallel computing domain. As a result we have contributed a unifying framework that explains two decades of research under a single paradigm.
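As a concrete illustration of the kind of criteria such a framework unifies, the toy Python sketch below (not the project's FEAST implementation) performs greedy forward selection with a generic information-theoretic score J(Xk) = I(Xk;Y) - beta*sum_j I(Xj;Xk) + gamma*sum_j I(Xj;Xk|Y); particular choices of beta and gamma recover well-known criteria such as MIM, mRMR and JMI. Feature values are assumed discrete, and the dataset is synthetic.

    import numpy as np
    from sklearn.metrics import mutual_info_score

    def cond_mutual_info(a, b, y):
        """I(A;B|Y) for discrete arrays: class-weighted average of within-class I(A;B)."""
        return sum((y == cls).mean() * mutual_info_score(a[y == cls], b[y == cls])
                   for cls in np.unique(y))

    def greedy_select(X, y, k, beta=1.0, gamma=1.0):
        """Pick k columns of the discrete-valued matrix X using the generic criterion."""
        selected, remaining = [], list(range(X.shape[1]))
        while remaining and len(selected) < k:
            def score(f):
                relevancy = mutual_info_score(X[:, f], y)
                redundancy = sum(mutual_info_score(X[:, j], X[:, f]) for j in selected)
                conditional = sum(cond_mutual_info(X[:, j], X[:, f], y) for j in selected)
                return relevancy - beta * redundancy + gamma * conditional
            best = max(remaining, key=score)
            selected.append(best)
            remaining.remove(best)
        return selected

    # Synthetic check: the label is determined jointly by features 2 and 5.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(200, 8))
    y = X[:, 2] * 3 + X[:, 5]
    print(greedy_select(X, y, k=2))   # typically returns [2, 5] or [5, 2]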
Exploitation Route The findings can be taken forward in several ways, most notably in industrial compiler and microprocessor design.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://www.cs.man.ac.uk/~gbrown/itls/
 
Description We developed a unifying mathematical framework for information theoretic feature selection, bringing almost two decades of research on heuristic filter criteria under a single theoretical interpretation. This answers the question: "what are the implicit statistical assumptions of feature selection criteria based on mutual information?". The journal paper (JMLR) associated with this work has been cited 110 times in 2 years, and the software FEAST, published as part of this project, has been downloaded over 3500 times. The project also produced a novel compact version management data structure, optimized for low space overhead when a small number of TLS threads is used, together with two novel software runtime parallelization systems that utilize this compact data structure. The latest journal paper (TACO) has been cited 9 times in less than 1 year.
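To make the TLS terminology above concrete, the sketch below shows in generic form what a version management structure does: each speculative thread buffers its writes and records its reads, then validates and either commits or aborts. This is an illustrative simplification only; it is not the compact data structure from the published systems, and the class and method names are invented for the example.

    class SpeculativeVersionBuffer:
        """Generic per-thread version management for software TLS (illustration only)."""

        def __init__(self, shared):
            self.shared = shared   # committed program state, e.g. a dict of location -> value
            self.writes = {}       # speculative writes, invisible to other threads
            self.reads = {}        # values observed at read time, used for validation

        def read(self, loc):
            if loc in self.writes:              # read-own-write forwarding
                return self.writes[loc]
            value = self.shared[loc]
            self.reads.setdefault(loc, value)   # remember what this thread depended on
            return value

        def write(self, loc, value):
            self.writes[loc] = value            # buffered until commit

        def validate(self):
            # Speculation holds if nothing this thread read has since been overwritten.
            return all(self.shared[loc] == v for loc, v in self.reads.items())

        def commit(self):
            self.shared.update(self.writes)     # publish the speculative writes

        def abort(self):
            self.writes.clear()                 # discard the speculative work
            self.reads.clear()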
First Year Of Impact 2012
Sector Digital/Communication/Information Technologies (including Software), Healthcare
Impact Types Societal, Economic

 
Title FEAST 
Description FEAST provides efficient implementations of the "feature selection" algorithms developed during the project. Such algorithms are widely regarded as the first step in any "big data" analysis, and hence are very widely used.
Type Of Technology Software 
Year Produced 2012 
Open Source License? Yes  
Impact Oracle Research Labs have adopted many of the techniques internally for their big data applications. 
URL http://mloss.org/software/view/386/