Automatic Creation of lung cancer pathology reports

Lead Research Organisation: University of Oxford

Abstract

Lung cancer accounts for more deaths than any other type of cancer. It is typically first detected on a lung CT scan, but diagnosis usually requires further visual examination of extracted tissue under a microscope by a pathologist. The general type and the underlying morphological characteristics of lung cancer strongly affect clinical prognosis, so it is vital that they are determined accurately. The difficulty in making an accurate diagnosis lies in inter- and intra-tumour heterogeneity. Furthermore, trust in the diagnosis is undermined by inter- and intra-observer variability among pathologists. This is what makes the creation of a computer-aided diagnostics (CAD) tool so essential. An excellent example of such a tool for lung CT images is Optellum's "Virtual Nodule Clinic", which was FDA-approved recently (March 2021).
My thesis project aims to achieve three primary goals:
1) Automatic report generation. I plan to use artificial intelligence and machine learning to automate the creation of standardised pathology reports from lung-cancer microscopy images.
2) Validation. I will validate the report-generation tool developed under the first goal on data from the lung cancer screening programme, using a combination of expert reviews and NL.
3) Identification of new radiology features. I aim to improve the value of radiology imaging features of lung tumours extracted from CT and histology images.
To achieve these aims, I will develop novel methods in machine learning and computer vision. The main categories of methods to be explored are as follows:
1) Multi-level Attention. I plan to explore four image entities to pay attention to: features extracted from image patches, spatial positions within the patches, the patches themselves, and scale within the images. Using multiple magnifications has been considered previously, but no attention mechanism was used to choose the magnification to focus on.
2) Connecting Pathology and Radiology. Another primary goal is to improve the non-invasive path of lung cancer diagnostics. Mapping a histology slide directly onto a CT scan is challenging because of the large difference in scale between the two modalities. Hence, methods other than direct mapping need to be researched. After improving my histology models, I plan to investigate the connection between the features extracted from histology images and those extracted from CT scans. Ideally, this will allow the invasive stage of making a diagnosis to be bypassed.
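The attention over patches and over magnifications described above can be illustrated with a small sketch. This is not the project's method: it assumes a common additive attention-pooling form (a tanh projection followed by a softmax), and every name, shape, and random weight below (`attention_pool`, `w_patch`, `v_scale`, three magnifications, five patches) is hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attention_pool(features, w, v):
    """Pool a set of feature vectors into one vector with learned attention.

    features: (n_items, d) array, one row per item (patch or magnification).
    w: (d, h) projection matrix; v: (h,) scoring vector (both hypothetical
    parameters that would normally be learned).
    Returns the pooled (d,) vector and the (n_items,) attention weights.
    """
    scores = np.tanh(features @ w) @ v   # one scalar score per item
    weights = softmax(scores)            # non-negative, sums to 1
    pooled = weights @ features          # attention-weighted average
    return pooled, weights


rng = np.random.default_rng(0)
n_scales, n_patches, d, h = 3, 5, 8, 4

# Stand-in for patch features extracted at each magnification.
patch_features = rng.normal(size=(n_scales, n_patches, d))

# Random stand-ins for learned parameters at the two attention levels.
w_patch, v_patch = rng.normal(size=(d, h)), rng.normal(size=h)
w_scale, v_scale = rng.normal(size=(d, h)), rng.normal(size=h)

# Level 1: attend over patches within each magnification.
per_scale = np.stack([
    attention_pool(patch_features[s], w_patch, v_patch)[0]
    for s in range(n_scales)
])

# Level 2: attend over magnifications to obtain one slide-level vector.
slide_vector, scale_weights = attention_pool(per_scale, w_scale, v_scale)
```

In this sketch, `scale_weights` indicates how much each magnification contributes to the slide-level representation, which is one way a model could choose the magnification to focus on. Attention over features and over spatial positions within a patch would need further levels that are not shown here.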
This DPhil project will contribute to improving the diagnosis and treatment of patients with lung cancer. The "Automatic Annotator" will be used to increase the accuracy and consistency of the diagnoses that pathologists make from histology images, while the CT-Histology connection will hopefully reduce the number of unnecessary invasive procedures that are currently required. The overall intended impact is to reduce patient mortality through the achievement of the above goals.
This project falls within the EPSRC Healthcare technologies research area.

Planned Impact

In the same way that bioinformatics has transformed genomic research and clinical practice, health data science will have a dramatic and lasting impact upon the broader fields of medical research, population health, and healthcare delivery. The beneficiaries of the proposed training programme, and of the research that it delivers and enables, will include academia, industry, healthcare, and the broader UK economy.

Academia: Graduates of the training programme will be well placed to start their post-doctoral careers in leading academic institutions, engaging in high-impact multi-disciplinary research, helping to build training and research capacity, and sharing their experience within the wider academic community.

Industry: Partner organisations will benefit from close collaboration with leading researchers, from the joint exploration of research priorities, and from the commercialisation of arising intellectual property. Other organisations will benefit from the availability of highly-qualified graduates with skills in big health data analytics.

Healthcare: Healthcare organisations and patients will benefit from the results of enabled and accelerated health research, leading to new treatments and technologies, and an improved ability to identify and evaluate potential improvements in practice through the analysis of real-world health data.

Economy: The life sciences sector is a key component of the UK economy. The programme will provide partner companies with direct access to leading-edge research. Graduates of the programme will be well-qualified to contribute to economic growth - supporting health research and the development of new products and services - and will be able to inform policy and decision making at organisational, regional, and national levels.

Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/S02428X/1                                      01/04/2019   30/09/2027
2432652             Studentship    EP/S02428X/1   01/10/2020   30/09/2024   George Batchkala