Big-Data Compressive Sensing: Fast, Parallelised and Distributed Algorithms

Lead Research Organisation: University College London
Department Name: Mullard Space Science Laboratory

Abstract

The emerging era of big-data will provide both challenges and opportunities. If the challenge of handling big-data and extracting meaningful information from it can be met, then the wealth of information teased out of such data-sets will be highly informative, revolutionising numerous academic fields and industries. An effective means to tease meaningful information from data is by posing and solving inverse problems, which are a large and important class of mathematical problems encountered in a broad range of academic and industrial domains. Compressive sensing is a recent breakthrough in information theory that has the potential to revolutionise the acquisition and analysis of data in many fields, providing a promising route to addressing the big-data challenge by solving inverse problems associated with high data under-sampling via sparse regularisation. Although such an approach provides a rigorous theoretical framework to solve inverse problems, this must be complemented by fast algorithms with efficient implementations. Many research codes written in Matlab and Python exist to solve these problems; however, a professional software package that is parallelised is lacking. We will fill this void by developing SOPT++, a public open-source software package for solving inverse problems using sparse regularisation techniques, exploiting theoretical developments from compressive sensing. SOPT++ will implement novel highly parallelised and distributed convex optimisation algorithms for big-data. The structure of our convex optimisation algorithms will allow not only computations to be distributed across multi-node architectures, but also memory and storage requirements. Moreover, common measurement and sparsifying operators that appear in descriptions of inverse problems, which are applied repeatedly when finding a solution, will be highly parallelised on many-core architectures, such as GPGPU and Xeon Phi co-processors, through vectorisation or light-weight threads.
This tiered parallelisation will allow SOPT++ to be deployed across the full range of modern high performance computing systems. SOPT++ will be designed carefully from both algorithmic and implementation perspectives. The former will ensure a variety of sparse regularisation problem formulations can be considered, while the latter will ensure that SOPT++ can be applied seamlessly to different domains of application. It is anticipated that SOPT++ will be applied to solve inverse problems in a wide range of fields, including magnetic resonance imaging, computed tomography, seismic imaging, computer vision, machine learning, radio interferometry, and cosmology, to name just a few, allowing researchers to scale their analyses up to big data-sets. Wide uptake of SOPT++ will be facilitated by providing a well-designed professional software platform that is thoroughly documented and contains numerous tutorials and examples. In addition, we will apply SOPT++ to diffusion magnetic resonance imaging, a central modality for neuroscience. Clinical application of high angular resolution diffusion MRI (HARDI) requires fast acquisition sequences. We will leverage the regularisation power of SOPT++ algorithms to enable HARDI from highly under-sampled data, where under-sampling is key to sequence acceleration.
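
To make the idea of sparse regularisation concrete, the following is a minimal sketch, not the SOPT++ implementation or its API. It assumes the common l1-regularised least-squares formulation, minimise 0.5 * ||A x - y||^2 + lam * ||x||_1, and solves it with the textbook iterative soft-thresholding algorithm (ISTA), which is swapped in here purely for illustration. All function names (soft_threshold, block_gradient, ista), the problem sizes, and the value of lam are hypothetical. The data-fidelity gradient is accumulated over blocks of measurement rows, which hints at how such a computation could be split across nodes; here the blocks are simply processed one after another.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v towards zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def block_gradient(A_block, y_block, x):
    """Gradient of 0.5 * ||A_k x - y_k||^2 for one block of measurement rows."""
    grad = [0.0] * len(x)
    for row, y_i in zip(A_block, y_block):
        residual = sum(a * x_j for a, x_j in zip(row, x)) - y_i
        for j, a in enumerate(row):
            grad[j] += a * residual
    return grad


def objective(A, y, x, lam):
    """Value of 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    fit = sum((sum(a * x_j for a, x_j in zip(row, x)) - y_i) ** 2
              for row, y_i in zip(A, y))
    return 0.5 * fit + lam * sum(abs(x_j) for x_j in x)


def ista(A, y, lam, n_iter=500, n_blocks=2):
    """ISTA with the gradient summed over row blocks (processed sequentially)."""
    n = len(A[0])
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so 1 / ||A||_F^2 is a conservative but safe step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    size = -(-len(A) // n_blocks)  # ceiling division: rows per block
    blocks = [(A[i:i + size], y[i:i + size]) for i in range(0, len(A), size)]
    x = [0.0] * n
    for _ in range(n_iter):
        grad = [0.0] * n
        for A_k, y_k in blocks:  # each block could live on a separate node
            g_k = block_gradient(A_k, y_k, x)
            grad = [g + gk for g, gk in zip(grad, g_k)]
        x = [soft_threshold(x_j - step * g_j, step * lam)
             for x_j, g_j in zip(x, grad)]
    return x


# Toy under-sampled problem: 10 measurements of a 20-dimensional sparse signal.
random.seed(0)
m, n = 10, 20
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
x_true[3], x_true[12] = 1.5, -2.0
y = [sum(a * x_j for a, x_j in zip(row, x_true)) for row in A]
lam = 0.1
x_hat = ista(A, y, lam)
print("objective at zero:", objective(A, y, [0.0] * n, lam))
print("objective at ISTA estimate:", objective(A, y, x_hat, lam))
```

Because the data-fidelity term is a sum over measurements, its gradient is the sum of per-block gradients, so only a vector of length n needs to be combined at each iteration. The conservative step size makes convergence slow; it is chosen here only because it needs no eigenvalue computation.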

Planned Impact

Our research programme targets areas designated by EPSRC as priorities for growth due to their strategic economic and societal benefits to the UK. Specifically, we focus on digital signal processing, which has implications for the digital economy. In the era of big-data, teasing useful information out of big data-sets is of paramount importance in many industries. We address precisely this challenge.

An immediate societal impact of our research will be in the area of diffusion magnetic resonance imaging (MRI). MRI is a non-invasive and non-ionising biomedical imaging technique whose strength lies in both the flexibility of the data acquisition process and the multiplicity of its contrast mechanisms, which originate in the underlying physical tissue parameters. It is used in multiple modalities to image the human body, in particular the brain and heart. Brain MRI is a powerful technique in neuroscience, in particular for understanding brain changes during development, ageing, and disease. It also provides a unique tool at the clinical level for diagnosis and management of brain diseases, from neurodegenerative diseases such as Alzheimer's disease, to cerebrovascular accidents (strokes), inflammatory diseases such as multiple sclerosis, and tumours. Diffusion MRI probes the intensity and direction of molecular diffusion in each voxel of the brain to map the structural neuronal connectivity. However, high angular resolution diffusion MRI (HARDI) is currently too time-consuming for clinical application. SOPT++ will enable HARDI from highly under-sampled data, where under-sampling is key to sequence acceleration. In the long term, HARDI can contribute to neuroscience by enhancing our understanding of structural brain connectivity. Opening HARDI up to clinical use will provide new diagnostic methodologies with better specificity and sensitivity, enabling early diagnosis of neurological disorders such as Alzheimer's disease and affording better healthcare at lower financial cost.

Our software will also be directly applicable in radio interferometry, where imaging also involves an inverse problem that can be solved by SOPT++. The Square Kilometre Array (SKA) is a new radio interferometric telescope currently under design, with broad-ranging science goals from uncovering the mysteries of dark energy and dark matter, to the study of extra-terrestrial life. However, the thousands of individual telescopes making up the SKA will produce a tremendous big-data challenge: the data-rate of the SKA is expected to be many times greater than current world-wide internet traffic. Radio interferometric image reconstruction techniques need urgent attention if they are to achieve the science goals of the SKA, which are of considerable societal interest. The convex optimisation approach of SOPT++ may well represent a keystone in that context.

There are also likely to be commercial and industrial applications of SOPT++. In particular, we will explore the use of SOPT++ in computer graphics and vision, industries which contribute well over a billion pounds to the UK economy. Acting as a consultant to Geomerics Ltd (http://www.geomerics.com), McEwen developed novel fast rendering algorithms based on wavelet methods. We will investigate the use of SOPT++ for inverse rendering, patenting any technology that could be exploited commercially.

In developing techniques to tease meaningful information out of big data-sets, we will learn many lessons and new skills related to the analysis of big-data. The UK hosts a burgeoning information economy, where big-data challenges are playing a major role in shaping new industries and revolutionising current ones. We will share with industry our experience in tackling the big-data challenge. Our research will therefore help to develop expertise in big-data, which will have positive spin-off implications for the wider UK economy.

Publications

Description See publications

Public software codes for solving sparse regularisation problems have been released (see Software products)
Exploitation Route The software codes are useful for solving problems in a variety of fields. They have already been applied to image data from radio interferometric telescopes, showing a considerable improvement over the existing state-of-the-art. In addition, improvements and extensions to optimal sampling methods on the sphere have also been developed.
Sectors Digital/Communication/Information Technologies (including Software), Other

 
Title PURIFY 
Description Next-generation radio interferometric imaging 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Provide software to perform next-generation radio interferometric imaging 
URL http://basp-group.github.io/purify/
 
Title S2LET 
Description Fast wavelets on the sphere 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Functionality to perform wavelet analysis on the sphere, useful for numerous data analysis problems. Applied to remove foreground contamination from Planck observations of the cosmic microwave background (CMB).
URL http://astro-informatics.github.io/s2let/
 
Title SO3: Fast Wigner transforms on the rotation group 
Description The SO3 code provides functionality to perform fast and exact Wigner (Fourier) transforms based on the sampling theorem on the rotation group SO(3) derived in our related article. 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact The SO3 code is generally applicable whenever Fourier transforms must be computed on SO(3) and will be an integral component of new fast wavelet transforms on the sphere.
URL http://astro-informatics.github.io/so3/
 
Title SOPT 
Description Sparse optimisation 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Toolbox for solving sparse optimisation problems for a variety of applications
URL http://basp-group.github.io/sopt/
 
Title SSHT 
Description Spin spherical harmonic transforms 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact General tool to perform spin spherical harmonic transforms, useful for many applications
URL http://astro-informatics.github.io/ssht/