Data Visualisation: Learning From Big Data

Lead Research Organisation: University of Cambridge
Department Name: Engineering

Abstract

This research project, "Data Visualisation: Learning From Big Data", aims to answer the question: "How can engineers contend with the current challenges of Big Data, particularly those concerning data volume, velocity and high dimensionality, so that they may more effectively extract meaningful insights from their simulations and experiments?" At present, one of the many difficulties faced by research engineers is storing, sending, recovering and viewing extreme volumes of simulation data within productive or cost-effective time frames. This issue, often referred to in the literature as the "I/O bottleneck", will only be exacerbated by continued advances in compute capability rooted in the maturation of hardware-accelerated high-performance computing. For example, in fields such as turbomachinery, cutting-edge Direct Numerical Simulations (DNS) now yield single-snapshot outputs that each demand tens of gigabytes of storage. With average compute speeds now exceeding the best current data transfer rates by nearly an order of magnitude, it is becoming almost impossible to inspect and manipulate such large volumes of data with current techniques and infrastructure. New approaches must therefore be considered.

The initial aims of this project are to investigate, develop and implement software solutions to the I/O bottleneck problem, with equal focus on two kinds of application: massive ensemble datasets and large single data volumes. The principal approach is to use compression to represent large multidimensional volumes of simulation or experimental data in more compact forms. Neural implicit representations will be explored as an avenue for achieving extreme data compression without significant loss of information. A practical target is to visualise, and ultimately interact with, DNS data in real time within a web-based viewer: a task that is not easily accomplished with current capabilities. Work will also explore training schemes for neural representations, as well as aspects of the decompression process. Additional research aims are to investigate and apply statistical tools, data science methods and machine learning (ML) methods to assist in automatic insight generation. The goal of this work is to lower the demand for both domain-specific knowledge and proficiency in statistics when performing exploratory data analysis (EDA). Finally, stretch goals are to explore new and disruptive technologies for application in interactive data visualisation.
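
To illustrate the idea of a neural implicit representation, the following is a minimal sketch and not the project's own method. It assumes a small coordinate network with sine activations that maps a point (x, y, z) to a scalar field value, so that the network weights stand in for the stored volume. The layer sizes, the 256 x 256 x 256 grid and all function names are hypothetical choices made for illustration. The network is untrained, so the ratio below describes storage only and says nothing about reconstruction error.

```python
import numpy as np


def init_mlp(layer_sizes, seed=0):
    """Create random float32 weights and zero biases for a fully connected network."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)).astype(np.float32)
        b = np.zeros(n_out, dtype=np.float32)
        params.append((w, b))
    return params


def decode(params, coords):
    """Evaluate the network at arbitrary coordinates: the 'decompression' step."""
    h = coords.astype(np.float32)
    for w, b in params[:-1]:
        h = np.sin(h @ w + b)  # sine activation on hidden layers (an assumed choice)
    w, b = params[-1]
    return h @ w + b  # linear output layer, one scalar per input point


def parameter_count(params):
    """Number of stored values in the network (weights plus biases)."""
    return sum(w.size + b.size for w, b in params)


def compression_ratio(grid_shape, params):
    """Grid values per network parameter, assuming both are stored as float32."""
    return int(np.prod(grid_shape)) / parameter_count(params)


# Hypothetical example: one scalar field on a 256^3 grid versus a 3-64-64-64-1 network.
grid_shape = (256, 256, 256)
params = init_mlp([3, 64, 64, 64, 1])

# Query five random points in [-1, 1]^3; no full grid is ever materialised.
coords = np.random.default_rng(1).uniform(-1.0, 1.0, size=(5, 3))
values = decode(params, coords)

ratio = compression_ratio(grid_shape, params)
print(f"parameters: {parameter_count(params)}, storage ratio: {ratio:.0f}x")
```

Because the field is evaluated point by point, a viewer could in principle query only the coordinates it needs to draw. Whether a network of a given size reproduces a DNS field accurately enough depends on training, which this sketch omits.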

Planned Impact

1. Impact on the UK Aero-Propulsion and Power Generation Industry
The UK Propulsion and Power sector is undergoing disruptive change. Electrification is allowing a new generation of Urban Air Vehicles to be developed, with over 70 active programmes planning a first flight by 2024. In the middle of the aircraft market, companies like Airbus and Rolls-Royce are developing boundary layer ingestion propulsion systems. At high speed, Reaction Engines Ltd is developing complex new air-breathing engines. In the aero gas turbine sector, Rolls-Royce is developing UltraFan, its first new architecture since the 1970s. In the turbocharger markets, UK companies such as Cummins and Napier are developing advanced turbochargers for use in compounded engines with electrical drive trains. In the power generation sector, Mitsubishi Heavy Industries and Siemens are developing new gas turbines capable of rapid start-up to enable increased supply from renewables. In the domestic turbomachinery market, Dyson is developing a whole new range of miniature high-speed compressors. All of these challenges require a new generation of engineers to be trained. These engineers will need a combination of traditional Aero-thermal skills and new Data Science and Systems Integration skills. The Centre has been specifically designed to meet this challenge.

Over the next 20 years, Rolls-Royce estimates that the global market opportunities in gas turbine-related aftercare services will be worth over US$700 billion. Gas turbines will have 'Digital Twins' which are continually updated using engine health data. To ensure that the UK leads this field, it is important that a new generation of engineers is trained in both the underpinning Aero-thermal knowledge and new Data Science techniques. The Centre will provide this training by linking the University and Industry Partners with the Alan Turing Institute, and with industrial data labs such as R2 Data Labs at Rolls-Royce and the 'MindSphere' centres at Siemens.

2. Impact on UK Propulsion and Power Research Landscape
The three partner institutions (Cambridge, Oxford and Loughborough) are closely linked to the broader UK Propulsion and Power community through collaborations with universities such as Imperial, Cranfield, Southampton, Bath, Surrey and Sussex. These links will allow the research knowledge developed in the Centre to benefit the whole of the UK Propulsion and Power research community.

The Centre will also have an impact on the Data Science research community through links with the CDT in Data Centric Engineering (DCE) at Imperial College and with the Alan Turing Institute. This will allow cross-fertilisation of ideas related to data science and the use of advanced data analytics in the Propulsion and Power sectors.

3. Impact of training a new generation of engineering students
The cohort-based training programme of the current CDT in Gas Turbine Aerodynamics has proved highly successful. The Centre's independent Advisory Group has noted that the multi-institution, multi-disciplinary nature of the Centre is unique within the global gas turbine training community, and the feedback from cohorts of current students has been extremely positive (92% satisfaction rating in the 2015 PRES survey). The new CDT in Future Propulsion and Power will combine the core underlying Aero-thermal knowledge of the previous CDT with the Data Science and Systems Integration skills required to meet the challenges of the next generation. This will provide the UK with a unique cohort of at least 90 students trained both to understand the real aero-thermal problems and to have the Data Science and Systems Integration skills necessary to solve the challenges of the future.

Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/S023003/1 | | | 01/10/2019 | 31/03/2028 |
2640731 | Studentship | EP/S023003/1 | 01/10/2021 | 30/09/2025 | Robert Sales