Mixed Precision FFTs

Lead Research Organisation: University of Oxford
Department Name: Engineering Science

Abstract

This research is a continuation of my undergraduate master's research project of the same title. The Fast Fourier Transform (FFT) is an extremely widely used algorithm in many areas of scientific computing. As new generations of computing hardware implement and enable the use of novel datatypes (for example bfloat16), it is important to understand the impact that using these new modes has on algorithms, of which the FFT is a very important example. So far, the project has involved implementing bfloat16 FFTs in a new version of an existing scientific code, Astro-Accelerate, which is a GPU (graphics processing unit) accelerated code for studying radio astronomy datasets. In Astro-Accelerate, I managed to reduce the time spent doing FFTs in the search of a given file by 50%. Scaled up to the size of supercomputer that would be required in the data processor of next-generation telescopes, such as the SKA (Square Kilometre Array), this would translate to electricity savings worth millions of pounds. It could also be used to reduce the upfront hardware requirement by tens of millions of pounds.
The further aims and objectives of the project are to continue investigating the impact of mixed precision FFTs in wider, more general applications. It will also be important to document the process of determining whether mixed precision FFTs are useful in a given context. This will potentially involve developing novel implementations of the FFT that use features (including but not limited to mixed precision) of newly available hardware. The intended impact of the project is to allow researchers using the FFT, and others considering using mixed precision, to understand the impact that changing their code might have, without having to make the change themselves. Additionally, the findings of this project may help to guide the next generations of computing hardware, as we may find particularly effective or ineffective approaches that could justify hardware changes.
Another direction the project could take is using techniques developed in the field of numerical analysis to study the effect of mixed precision on generic FFTs when used to perform convolutions. Convolutions are widely used in the rapidly changing field of machine learning and, interestingly, mixed precision implementations are advertised as drop-in replacements for existing single precision codes. In some cases they not only run faster on a computer, but also produce more robust neural networks (due to regularisation). This demonstrates that it is extremely important to understand how to quantify an appropriate amount of numerical precision for a given application, and not just assume that more precision is better.
So far, the research has involved a mixture of low-level coding, using CUDA C/C++ to program and optimise the GPU code, as well as high-level Python and MATLAB scripts to visualise and process the output of the software written in C. This project falls within the EPSRC Digital Signal Processing research area.
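As a rough illustration of what quantifying the precision needed for an FFT-based convolution can look like, the Python sketch below compares a double precision circular convolution with one whose stored values are rounded to bfloat16. It is not the Astro-Accelerate implementation: the helper names (to_bfloat16, quantize_complex, circular_convolve) are hypothetical, bfloat16 is emulated by rounding float32 bit patterns because NumPy has no native bfloat16 type, and only storage is reduced in precision while the FFT arithmetic itself stays at NumPy's working precision.

```python
import numpy as np

def to_bfloat16(x):
    """Round finite values to the nearest bfloat16 (ties to even), returned as float32.

    Assumption: bfloat16 is emulated by keeping the top 16 bits of a float32.
    """
    x = np.ascontiguousarray(x, dtype=np.float32)
    bits = x.view(np.uint32)
    # Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits.
    bias = ((bits >> 16) & 1) + np.uint32(0x7FFF)
    rounded = (bits + bias) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32)

def quantize_complex(z):
    """Round the real and imaginary parts of z to bfloat16 separately."""
    return to_bfloat16(z.real) + 1j * to_bfloat16(z.imag)

def circular_convolve(a, b, storage_bf16=False):
    """Circular convolution via the FFT, optionally with bfloat16 storage."""
    if storage_bf16:
        a, b = to_bfloat16(a), to_bfloat16(b)
    fa, fb = np.fft.rfft(a), np.fft.rfft(b)
    if storage_bf16:
        fa, fb = quantize_complex(fa), quantize_complex(fb)
    out = np.fft.irfft(fa * fb, n=len(a))
    return to_bfloat16(out) if storage_bf16 else out

rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)
kernel = rng.standard_normal(1024)

reference = circular_convolve(signal, kernel)
approx = circular_convolve(signal, kernel, storage_bf16=True)

# Relative L2 error introduced by bfloat16 storage in this model.
rel_err = np.linalg.norm(approx - reference) / np.linalg.norm(reference)
print(f"relative L2 error with bfloat16 storage: {rel_err:.3e}")
```

Whether an error of this size is acceptable depends on the application, which is the question the project sets out to answer. A hardware bfloat16 FFT would also round intermediate butterfly results, so this sketch is likely to understate the error of a fully reduced-precision implementation.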

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/T517811/1                                   01/10/2020  30/09/2025
2595728            Studentship   EP/T517811/1  01/10/2021  31/03/2025