Healthcare AI for Infectious Disease

Lead Research Organisation: University of Oxford

Abstract

Healthcare is one of the fastest-growing sectors in AI and has seen extraordinary growth within the past 12 months, among start-ups and multinationals and across the major AI conferences. This reflects the fact that healthcare systems worldwide are struggling to cope with the demands of ever-increasing populations in the 21st century, where the effects of increased life expectancy and the demands of modern lifestyles have created an unsustainable social and financial burden. These urgent needs have been made even clearer to society by the recent COVID-19 pandemic.

AI-based approaches promise the change required to meet these challenges: ever-increasing quantities of complex, massively multivariate data concerning all aspects of patient care are being routinely acquired and stored throughout the life of a patient.

The great majority of this research worldwide involves AI for medical imaging. However, much of the healthcare data acquired in practice is non-imaging, often highly complex, time-series data, including electronic health records (EHRs), which are now in standard use in many hospitals.

Large volumes of time-series data also arise from patient-worn sensors (often including "wearables"), in addition to increasing amounts of massively multivariate genomic data. Such datasets, while extremely large and rich, are extremely difficult to obtain for research purposes, which creates a very high "barrier to entry" for AI researchers focused on non-imaging methodology.

Consequently, research in this field to date takes place at a small number of sites globally, including multinationals such as Google and Apple (which typically have a narrow methodological focus). Only a fraction of this research is published, and most studies in the literature consequently use the few datasets that are publicly available (e.g. PhysioNet / MIMIC from MIT), which represent a single sub-discipline of healthcare (typically intensive care units) at a single US site.

Healthcare datasets are rarely available to researchers outside a select few sites worldwide. In keeping with the "open research" ethos of the lab, a theme of the doctoral research will focus on making available synthetic datasets that are as close to "real world" clinical datasets as can realistically be achieved. Generative adversarial networks have proven effective in imaging for creating believable synthetic datasets, but such approaches typically do not scale to the complex, massively multivariate setting of non-imaging healthcare datasets. Within the constraints of ethics and governance processes, with which we have substantial experience, we will seek to make available to the nascent Health AI community the results of new classes of complex generative methods, together with their synthetic benchmark datasets, to stimulate the field and democratise access. Proof-of-principle studies on small healthcare AI datasets have recently shown that predictive performance can be increased significantly through the use of such synthetic datasets (derived from massive, otherwise-unavailable clinical datasets).

This theme will exploit the models constructed in Health AI to find generative mechanisms that permit realistic healthcare data to be made available.

This work is particularly motivated by Public Health England / the National Institute for Health Protection, which recently asked us to undertake it so that they can make available for research synthetic datasets based on their massive UK surveillance data (which otherwise cannot be shared).

This project falls within the EPSRC Digital Economies, Healthcare Technologies, and ICT research areas.

Planned Impact

In the same way that bioinformatics has transformed genomic research and clinical practice, health data science will have a dramatic and lasting impact upon the broader fields of medical research, population health, and healthcare delivery. The beneficiaries of the proposed training programme, and of the research that it delivers and enables, will include academia, industry, healthcare, and the broader UK economy.

Academia: Graduates of the training programme will be well placed to start their post-doctoral careers in leading academic institutions, engaging in high-impact multi-disciplinary research, helping to build training and research capacity, sharing their experience within the wider academic community.

Industry: Partner organisations will benefit from close collaboration with leading researchers, from the joint exploration of research priorities, and from the commercialisation of arising intellectual property. Other organisations will benefit from the availability of highly-qualified graduates with skills in big health data analytics.

Healthcare: Healthcare organisations and patients will benefit from the results of enabled and accelerated health research, leading to new treatments and technologies, and an improved ability to identify and evaluate potential improvements in practice through the analysis of real-world health data.

Economy: The life sciences sector is a key component of the UK economy. The programme will provide partner companies with direct access to leading-edge research. Graduates of the programme will be well-qualified to contribute to economic growth - supporting health research and the development of new products and services - and will be able to inform policy and decision making at organisational, regional, and national levels.

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/S02428X/1      |              |              | 01/04/2019 | 30/09/2027 |
2279748           | Studentship  | EP/S02428X/1 | 01/10/2019 | 31/12/2023 | Odhran O'Donoghue