TeraVM: Harnessing TBs of memory through managed runtime environments

Lead Research Organisation: University of Manchester
Department Name: Computer Science

Abstract

The ever-expanding volume of information and data has driven the recent growth of Big Data and Cloud applications. Scale-up and scale-out hardware solutions have been developed to satisfy the ever-increasing demands of those applications by providing resources at new orders of magnitude (e.g. TBs of memory). Novel solutions such as NumaScale's NumaConnect chip enable the flexible and scalable aggregation of hundreds of processors and TBs of memory into a single system image by providing cache coherency and shared memory.

Managed Runtime Environments (MREs) host languages such as Java and Scala, which are popular for Big Data and Cloud applications not only for their performance, due to JIT compilation, but also for their high-level programmability and robustness, offered by automatic memory management (Garbage Collection, GC). However, Java-based Cloud and Big Data applications that run on MREs cannot directly benefit from these new architectures due to inefficiencies in the GC. The Garbage Collector's performance and scalability, which are decisive for a Big Data application's overall performance, can often become bottlenecks.

To efficiently handle such a large number of processors and memory resources, significant research and engineering effort must be invested at the runtime layer of these languages, in our case the Java Virtual Machine (JVM). Although both JVMs and their accompanying GC algorithms have been heavily optimized over the years, they have never been studied in the context of shared-memory aggregated architectures that use TBs of memory. Optimizations such as efficient use of data locality, NUMA-aware object allocation and GC, and optimized thread scheduling and placement have been researched and developed on smaller or different hardware architectures. Hence, architectures such as NumaScale create new research opportunities in the field of JVMs and, therefore, in GC scalability.

The key objectives of this PhD project are to:
1) Evaluate state-of-the-art GCs on aggregated servers with TBs of memory (such as NumaScale systems).
2) Investigate the corresponding GC scalability degradation factors.
3) Prototype novel techniques and GC optimizations towards harnessing TBs of memory through MREs.
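
As a rough illustration of what evaluating a collector can involve at the smallest scale, the sketch below reads the accumulated GC time and collection count that a JVM exposes through the standard java.lang.management API and compares them with the wall-clock time of an allocation-heavy loop. It is not the project's evaluation methodology: the class name GcOverheadProbe, the synthetic workload, and its parameters are assumptions made for this example only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayDeque;

// Hypothetical probe: the class name, workload and sizes are illustrative assumptions.
final class GcOverheadProbe {

    /** Sums the accumulated collection time (ms) over all collectors the JVM reports. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Sums the number of collections over all collectors the JVM reports. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 when the count is undefined
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    /**
     * Synthetic workload: allocates byte arrays and keeps only the most recent
     * `window` of them reachable, so older arrays become garbage.
     * Returns the total number of bytes requested.
     */
    static long allocate(int iterations, int chunkBytes, int window) {
        ArrayDeque<byte[]> live = new ArrayDeque<>();
        long allocated = 0;
        for (int i = 0; i < iterations; i++) {
            live.addLast(new byte[chunkBytes]);
            allocated += chunkBytes;
            if (live.size() > window) {
                live.removeFirst();
            }
        }
        return allocated;
    }

    public static void main(String[] args) {
        long gcTimeBefore = totalGcTimeMillis();
        long gcCountBefore = totalGcCount();
        long start = System.nanoTime();

        long bytes = allocate(50_000, 16 * 1024, 256);

        long wallMs = (System.nanoTime() - start) / 1_000_000;
        long gcMs = totalGcTimeMillis() - gcTimeBefore;
        long collections = totalGcCount() - gcCountBefore;

        System.out.printf("allocated=%d bytes, wall=%d ms, gc=%d ms, collections=%d%n",
                bytes, wallMs, gcMs, collections);
    }
}
```

Running the same probe under different collectors (for example by switching between -XX:+UseParallelGC and -XX:+UseG1GC on HotSpot) gives a first, coarse comparison. The reported time is whatever each collector accounts for, so it is only an approximation of pause time for concurrent collectors.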

Our approach involves:
1) The implementation of the platform on which we will conduct our research. This platform consists of:
a) The Maxine VM, a metacircular research Java Virtual Machine.
b) The Memory Management Toolkit (MMTk), a framework that provides the necessary building blocks for a wide range of GC algorithms. Since MMTk is currently part of Jikes RVM, we aim to port it to the Maxine VM.
c) The Allocation Profiler, a NUMA-aware memory profiler that provides insights into memory allocation and management. This profiler will be implemented from scratch and tailored to our needs, and it will introduce profiling with NUMA-aware metrics.
2) A study and evaluation of memory and GC behaviour on large-scale NUMA architectures (hundreds of cores and TBs of memory).
3) Novel NUMA-aware GC optimizations.
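
To make the idea of a NUMA-aware metric more concrete, the sketch below shows one metric such a profiler could report: the share of allocated bytes that land on a NUMA node other than the one running the allocating thread. This is an assumption about what such a metric might look like, not a description of the Allocation Profiler itself. The class NumaAllocationStats and its methods are hypothetical, and the node identifiers are passed in by the caller because the standard Java API does not expose NUMA placement.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch: names and the metric are illustrative assumptions,
// not part of the Maxine VM, MMTk, or the project's profiler.
final class NumaAllocationStats {
    private final int nodes;
    // Flattened nodes x nodes matrix: entry [threadNode * nodes + memoryNode]
    // holds the bytes allocated by threads on threadNode into memory on memoryNode.
    private final AtomicLongArray bytes;

    NumaAllocationStats(int nodes) {
        if (nodes <= 0) {
            throw new IllegalArgumentException("nodes must be positive");
        }
        this.nodes = nodes;
        this.bytes = new AtomicLongArray(nodes * nodes);
    }

    /** Records one allocation; the node ids are assumed to come from the VM or OS. */
    void record(int threadNode, int memoryNode, long size) {
        if (threadNode < 0 || threadNode >= nodes || memoryNode < 0 || memoryNode >= nodes) {
            throw new IllegalArgumentException("node id out of range");
        }
        bytes.addAndGet(threadNode * nodes + memoryNode, size);
    }

    /** Bytes allocated on the same node as the allocating thread (matrix diagonal). */
    long localBytes() {
        long sum = 0;
        for (int n = 0; n < nodes; n++) {
            sum += bytes.get(n * nodes + n);
        }
        return sum;
    }

    /** All bytes recorded so far. */
    long totalBytes() {
        long sum = 0;
        for (int i = 0; i < bytes.length(); i++) {
            sum += bytes.get(i);
        }
        return sum;
    }

    /** Fraction of recorded bytes placed on a remote node; 0.0 when nothing is recorded. */
    double remoteRatio() {
        long total = totalBytes();
        return total == 0 ? 0.0 : (total - localBytes()) / (double) total;
    }

    public static void main(String[] args) {
        NumaAllocationStats stats = new NumaAllocationStats(4);
        stats.record(0, 0, 3072); // local allocation on node 0
        stats.record(0, 2, 1024); // thread on node 0, memory on node 2
        System.out.println(stats.remoteRatio()); // prints 0.25
    }
}
```

A high remote ratio would suggest that objects are often placed away from the threads that allocate them, which is the kind of locality problem NUMA-aware allocation and GC aim to reduce.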


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/N509565/1 | | | 01/10/2016 | 30/09/2021 |
2287629 | Studentship | EP/N509565/1 | 01/07/2017 | 30/06/2020 | Orion Papadakis