Enhancing the diagnostic performance of a bowel cancer blood test using advanced machine learning algorithms and the incorporation of information from the patient's medical record

Lead Research Organisation: Swansea University
Department Name: College of Science

This project concerns the extended development of a blood-based diagnostic for bowel cancer by factoring in patient record information. In Europe, colorectal cancer (CRC) is the second most common cancer, with approximately 450,000 new cases per year. CRC is the third most common cancer in the UK, with 60% of cases presenting at a late stage (III/IV). Early diagnosis makes a significant difference to survival rates. The Swansea Biospectroscopy group, led by Professors Harris and Dunstan, has developed an effective blood test based upon laser spectroscopy. The test uses machine learning algorithms trained on spectral pattern recognition to optimise the diagnostic for early detection. The algorithms can be developed further to include patient record information, so that additional patient factors can be used to reduce false positives and eliminate false negatives. In particular, the project will aim to incorporate three key factors: the effect of co-morbidities, clinical features including age and family history of cancer, and the patient's current medication. To further advance the early diagnostic potential of the test, the project can also develop its accuracy in detecting polyps and in identifying those patients most likely to develop malignancy. The findings of this study will advance the translation of a blood-based diagnostic for bowel cancer into the healthcare system. It is anticipated that the doctoral researcher will become highly skilled in HPC and in the development of appropriate machine learning codes using the infrastructure offered by the CDT.
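
As a rough illustration of what incorporating patient record information alongside spectral data could look like, the sketch below joins a normalised spectrum with a few encoded record fields into one feature vector that a classifier could consume. It is a minimal sketch under stated assumptions, not the group's method: the `PatientRecord` fields, their encodings and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PatientRecord:
    # All fields are hypothetical examples of patient record information.
    age_years: float
    family_history_of_cancer: bool
    comorbidity_count: int
    on_medication: bool


def normalise_spectrum(intensities: Sequence[float]) -> List[float]:
    """Scale a spectrum to unit total intensity so samples are comparable."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum must have positive total intensity")
    return [value / total for value in intensities]


def build_feature_vector(
    intensities: Sequence[float], record: PatientRecord
) -> List[float]:
    """Concatenate spectral features with encoded patient-record features."""
    record_features = [
        record.age_years / 100.0,  # crude rescaling, an assumption
        1.0 if record.family_history_of_cancer else 0.0,
        float(record.comorbidity_count),
        1.0 if record.on_medication else 0.0,
    ]
    return normalise_spectrum(intensities) + record_features


# Toy values only; no real patient or spectral data are used here.
example = build_feature_vector(
    [2.0, 6.0, 2.0],
    PatientRecord(
        age_years=50,
        family_history_of_cancer=True,
        comorbidity_count=2,
        on_medication=False,
    ),
)
print(example)
```

Keeping the record features as separate, named entries at the end of the vector makes it straightforward to train a model with and without them and compare false positive and false negative rates.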

The ALPHA antihydrogen experiment makes use of several particle detector technologies, including a Silicon Vertex Detector, a Time Projection Chamber, and a barrel of scintillating bars. One of the key challenges for these detector systems is to distinguish between antihydrogen annihilations and cosmic rays, a classification problem well suited to machine learning. At present, online analysis performs this task with cuts on two high-level variables from the detectors, while offline analysis uses boosted decision trees with high-level variables. High-level variables are a powerful tool for discrimination; however, they are slow to pre-process. The challenge of this PhD project is to build both online and offline analyses, which have different processing budgets. Initially, the plan is to investigate whether modern machine learning techniques, such as deep learning, can beat the current state-of-the-art decision tree analysis used by the collaboration. Subsequently, the project will expand to look at replacing the high-level variables with lower-level variables to reduce pre-processing time. Ultimately, a model small enough to interpret raw detector output could enable real-time online analysis, with the final goal of programming an FPGA or micro-controller to perform accurate, real-time classification of detector events. The combination of these projects would build a robust and comprehensive thesis that investigates machine learning applied to particle detectors. Demonstrating classification at the micro-controller and FPGA level would have a large impact on the particle detector community, contributing to detector trigger systems and live diagnostics beyond the scope of the ALPHA experiment.
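
To make the idea of a cut-based selection on two high-level variables concrete, the sketch below applies a pair of rectangular cuts to synthetic events and reports how much toy signal and toy background passes. It is a minimal sketch under stated assumptions: `var_a` and `var_b` are hypothetical placeholders, not the variables used by the collaboration, and the Gaussian toy data and cut values are invented purely for illustration.

```python
import random
from typing import List, Tuple

# (var_a, var_b): two hypothetical high-level variables per event.
Event = Tuple[float, float]


def passes_cuts(event: Event, cut_a: float, cut_b: float) -> bool:
    """Accept an event as signal-like if var_a > cut_a and var_b < cut_b."""
    var_a, var_b = event
    return var_a > cut_a and var_b < cut_b


def make_events(
    n: int, mean_a: float, mean_b: float, rng: random.Random
) -> List[Event]:
    """Draw synthetic events from independent unit-width Gaussians (toy data)."""
    return [(rng.gauss(mean_a, 1.0), rng.gauss(mean_b, 1.0)) for _ in range(n)]


def acceptance(events: List[Event], cut_a: float, cut_b: float) -> float:
    """Fraction of events that pass both cuts."""
    return sum(passes_cuts(e, cut_a, cut_b) for e in events) / len(events)


rng = random.Random(0)  # fixed seed so the toy study is repeatable
signal = make_events(5000, mean_a=2.0, mean_b=-2.0, rng=rng)
background = make_events(5000, mean_a=-2.0, mean_b=2.0, rng=rng)

signal_efficiency = acceptance(signal, cut_a=0.0, cut_b=0.0)
background_acceptance = acceptance(background, cut_a=0.0, cut_b=0.0)
print(f"signal efficiency: {signal_efficiency:.3f}")
print(f"background acceptance: {background_acceptance:.3f}")
```

A selection of this kind is cheap to evaluate, which is why it suits a tight online budget; a trained classifier would replace `passes_cuts` with a learned decision function evaluated against the same efficiency and acceptance measures.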



Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023992/1       -             -             01/04/2019  30/09/2027  -
2430831            Studentship   EP/S023992/1  01/10/2020  30/09/2024  Natalia Sikora