Deep Learning for Astronomically Big Data

Lead Research Organisation: The University of Manchester

Department Name: Physics and Astronomy

Abstract

The soft real-time constraints on Square Kilometre Array (SKA) image formation mean that the science data processor (SDP), which is responsible for creating image data products, will require a processing power of ~0.5 ExaFlops and will produce approximately 1 PB of data products per day. These data products will then be shipped around the world over a fibre network to the SKA regional data centres, where astronomers will access them.

Imaging for radio interferometers relies on a processing model that inverts datasets from their native Fourier measurement basis to form an image. For the SKA these individual image cubes are very large (0.25 PB on average) and will each contain tens to hundreds of thousands of different astronomical sources. For scientific exploitation, it will be necessary to identify and classify the objects in these data automatically. Machine learning approaches to this task have started to be considered across a range of fields in astrophysics. For classifying previously identified objects that have a range of measured and catalogued features, random forest classification is very popular. In radio astronomy, however, convolutional neural networks have started to emerge as a potential mechanism for classifying objects directly within the image data, in parallel with identification. Applications of this sort are still in their infancy in astrophysics, and it is unclear how these methods will be applied to datasets with volumes equivalent to those from the SKA. Scaling machine learning approaches to deal with SKA-size image cubes will be a key big data challenge for SKA regional centres around the world.
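The Fourier inversion described above can be illustrated with a minimal sketch. It assumes a toy sky containing a single point source, a small regular set of (u, v) samples, and a direct inverse Fourier sum; the function names, grid sizes, and source parameters are all hypothetical and are not taken from the project. Practical imaging software typically grids the measurements and uses a fast Fourier transform rather than the direct sum shown here.

```python
import cmath
import math


def simulate_visibilities(sources, uv_points):
    """Forward model: V(u, v) = sum_k S_k * exp(-2*pi*i*(u*l_k + v*m_k))."""
    return [
        sum(flux * cmath.exp(-2j * math.pi * (u * l + v * m))
            for flux, l, m in sources)
        for u, v in uv_points
    ]


def dirty_image(vis, uv_points, l_axis, m_axis):
    """Direct inverse Fourier sum over the sampled (u, v) points.

    Taking the real part is equivalent to including the Hermitian
    conjugate of every measurement.
    """
    n = len(vis)
    image = []
    for m in m_axis:
        row = []
        for l in l_axis:
            total = sum(
                (v_j * cmath.exp(2j * math.pi * (u * l + v * m))).real
                for v_j, (u, v) in zip(vis, uv_points)
            )
            row.append(total / n)
        image.append(row)
    return image


# Hypothetical toy setup: an 11 x 11 block of (u, v) samples and a
# 16 x 16 image grid in direction cosines (l, m).
uv_points = [(u, v) for u in range(-5, 6) for v in range(-5, 6)]
axis = [k / 16 for k in range(-8, 8)]
sources = [(2.0, 0.25, -0.125)]  # (flux, l, m) of one point source

vis = simulate_visibilities(sources, uv_points)
image = dirty_image(vis, uv_points, axis, axis)

# Locate the brightest pixel in the reconstructed image.
peak_value, peak_m, peak_l = max(
    (value, m, l)
    for m, row in zip(axis, image)
    for l, value in zip(axis, row)
)
print(peak_value, peak_l, peak_m)
```

In this toy case the brightest pixel falls at the position of the simulated source and recovers its flux, while the surrounding pixels show the sidelobe pattern produced by the finite (u, v) sampling.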

In addition, the regional centres have a further potential advantage for advanced image analysis. Because they are not bound by the real-time processing constraints of the SKA SDP, they can use machine learning approaches that incorporate image formation within their processing model, which could provide significantly enhanced outputs. Fourier image formation by its nature imposes characteristics on the output image, for example through the weighting applied to Fourier components, and these will bias any classification based on output image products alone. Incorporating the Fourier data directly into a deep learning approach would provide additional information that could improve classification.
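The effect of Fourier component weighting on the output image can be sketched in one dimension. The example below assumes a hypothetical set of 20 evenly spaced baselines and compares equal weights with a Gaussian taper that down-weights the longer baselines; the baseline values, taper width, and function name are illustrative assumptions rather than anything specified by the project.

```python
import math


def dirty_beam(baselines, weights, l):
    """Normalised point-source response at offset l for 1-D baselines."""
    total = sum(
        w * math.cos(2 * math.pi * u * l)
        for u, w in zip(baselines, weights)
    )
    return total / sum(weights)


# Hypothetical baselines, in units of wavelengths.
baselines = list(range(1, 21))

# Scheme 1: every Fourier component gets the same weight.
equal = [1.0] * len(baselines)

# Scheme 2: Gaussian taper (assumed width sigma) that suppresses long baselines.
sigma = 5.0
tapered = [math.exp(-0.5 * (u / sigma) ** 2) for u in baselines]

offset = 0.02  # small angular offset from the source position
beam_equal = dirty_beam(baselines, equal, offset)
beam_tapered = dirty_beam(baselines, tapered, offset)

print(beam_equal, beam_tapered)
```

Both schemes give a response of 1 at the source position, but the tapered response falls off more slowly with offset, so the same source appears broader in the output image. A classifier that sees only the image therefore sees a morphology that depends partly on the chosen weighting.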

For this project, data from the LOFAR telescope will be used as the closest available analogue to data from the SKA. The project will use deep observations of the GOODS-N field obtained as part of the LOFAR Magnetism Key Science Project (MKSP). These observations form a deep survey field and will provide a large volume of data on the same region of sky.

Student:

Fiona Porter

Period of Study:

Sep 18 - Dec 22

Funder:

COVID

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2112505

Research Topic:

Unclassified

Organisations

People	ORCID iD
Anna Scaife (Primary Supervisor)
Fiona Porter (Student)

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/S513842/1			30/09/2018	29/09/2024
2112505	Studentship	EP/S513842/1	30/09/2018	30/12/2022	Fiona Porter
NE/W503186/1			31/03/2021	30/03/2022
2112505	Studentship	NE/W503186/1	30/09/2018	30/12/2022	Fiona Porter