Learning representations of facial phenotypes

Lead Research Organisation: University of Oxford

Abstract

The early diagnosis of genetic diseases is critical for preventing potential long-term health effects. However, genetic diseases can be exceedingly rare, making quick diagnosis a challenge. Clinical geneticists rely on subtle facial features to determine the presence of genetic disease, and there is interest in partially automating this approach using computer vision methods. However, the effectiveness of computer vision algorithms relies on a suitably large training dataset, and on effective separation of diseases within the internal representation of the model used. This project aims first to synthetically augment the existing data, and then to improve the representation learning of faces in order to improve the overall accuracy of diagnosis.

Aims and Objectives:
1. Making the most of the small datasets available to create accurate phenotype embeddings: Using the existing patient data, computer graphics techniques will be used to create new artificial data examples. This includes building on existing inverse rendering work to create 3D reconstructions of face structure from 2D images. Using normal map prediction and relighting, convincing novel views of existing patients will be generated. In addition, the interpolation of textures and mesh structure will be explored. If successful, this will help train models which have more robust and meaningful representations.
2. Using phenotype embeddings for extracting clinically relevant metrics: A key motivation is to make the embeddings interpretable and to create classifiers which meet the needs of clinicians. The aim is to investigate the state of the art in the machine learning fields of model regularisation and metric learning, in order to obtain insights into how these approaches can be adapted for this problem. Of particular concern is training models which do not show bias with respect to the gender, ethnicity or age of the patient.
3. Build models to understand how to properly disentangle kinship in phenotype embeddings: Facial phenotype owes a lot to inherited traits. This project will train models which are able to disentangle kinship from phenotype information and isolate features which correspond to genetic disease. To do this, a dataset of kinship similarity will be used. This links with the other aims of reducing model bias, with the goal of improving the matching of new patients to very rare disease classifications without being influenced by kinship similarity.

Methodology: New representation learning methods will be developed which specifically improve the performance of models trained on face datasets, and on datasets with long-tailed distributions of data.

This project falls within the EPSRC research areas of Artificial Intelligence, Image and Vision Computing, and Medical Imaging.

This project will be in collaboration with the Minerva Initiative, which enables ethical sharing of patient phenotype data.
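To illustrate the relighting idea mentioned under the first aim, the following is a minimal sketch only. It assumes a simple Lambertian (diffuse-only) shading model, which the project description does not specify; the function names `normalise` and `relight`, the `ambient` parameter, and the list-of-rows image layout are all hypothetical choices made for the example, not the project's method.

```python
import math


def normalise(v):
    """Return v scaled to unit length (v is a 3-tuple of floats)."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("cannot normalise a zero-length vector")
    return tuple(c / length for c in v)


def relight(normals, albedo, light_dir, ambient=0.1):
    """Shade an image under a new directional light (assumed Lambertian model).

    normals:   rows of per-pixel surface normals, each an (x, y, z) tuple
    albedo:    rows of per-pixel reflectance values in [0, 1]
    light_dir: direction pointing from the surface towards the light
    ambient:   hypothetical constant fill term so unlit pixels are not black
    """
    light = normalise(light_dir)
    shaded = []
    for normal_row, albedo_row in zip(normals, albedo):
        out_row = []
        for n, a in zip(normal_row, albedo_row):
            n_unit = normalise(n)
            # Diffuse term: cosine of the angle between normal and light,
            # clamped at zero for surfaces facing away from the light.
            diffuse = max(0.0, sum(nc * lc for nc, lc in zip(n_unit, light)))
            out_row.append(min(1.0, a * (ambient + (1.0 - ambient) * diffuse)))
        shaded.append(out_row)
    return shaded


# A 1x2 "image": one pixel facing the camera, one facing sideways.
example_normals = [[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]]
example_albedo = [[0.8, 0.8]]
frontal = relight(example_normals, example_albedo, light_dir=(0.0, 0.0, 1.0))
side = relight(example_normals, example_albedo, light_dir=(1.0, 0.0, 0.0))
```

Under these assumptions, changing `light_dir` while keeping the predicted normals and albedo fixed yields differently lit versions of the same face, which is one way additional training examples could be derived from a single photograph.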

Planned Impact

In the same way that bioinformatics has transformed genomic research and clinical practice, health data science will have a dramatic and lasting impact upon the broader fields of medical research, population health, and healthcare delivery. The beneficiaries of the proposed training programme, and of the research that it delivers and enables, will include academia, industry, healthcare, and the broader UK economy.

Academia: Graduates of the training programme will be well placed to start their post-doctoral careers in leading academic institutions, engaging in high-impact multi-disciplinary research, helping to build training and research capacity, and sharing their experience within the wider academic community.

Industry: Partner organisations will benefit from close collaboration with leading researchers, from the joint exploration of research priorities, and from the commercialisation of arising intellectual property. Other organisations will benefit from the availability of highly-qualified graduates with skills in big health data analytics.

Healthcare: Healthcare organisations and patients will benefit from the results of enabled and accelerated health research, leading to new treatments and technologies, and an improved ability to identify and evaluate potential improvements in practice through the analysis of real-world health data.

Economy: The life sciences sector is a key component of the UK economy. The programme will provide partner companies with direct access to leading-edge research. Graduates of the programme will be well-qualified to contribute to economic growth - supporting health research and the development of new products and services - and will be able to inform policy and decision making at organisational, regional, and national levels.

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/S02428X/1      |              |              | 01/04/2019 | 30/09/2027 |
2279638           | Studentship  | EP/S02428X/1 | 01/10/2019 | 19/04/2024 | Jonathan Campbell