Deep Learning of Fungal Phenotypes from High-Throughput Imaging

Lead Research Organisation: Imperial College London
Department Name: Life Sciences

Abstract

The aim of this research project is to develop software architectures that infer the health states of microorganisms from their phenotypes, and then use this information to inform drug design. Only in the last five years have advances in deep learning (adaptive algorithms that learn via mechanisms loosely inspired by the human brain) made this a plausible goal. The proposed technology could benefit the medical and agricultural sectors by reducing costs and improving the efficiency and accuracy of analysis.

Initially, the subject will be the fungus Phakopsora pachyrhizi. This is an aggressive pathogen that can cause devastating yield losses of up to 80% in infected crops, and the disease it causes is thus "one of the most economically important soybean diseases in Asia". Four sets of images of the fungus have been provided by the agrochemicals company Syngenta (with more data on the way): three contain images of fungi treated with different experimental fungicides, and the fourth is a control. The current aim is to identify the fungal phenotypes under each treatment. The hope is that there will be fewer distinct phenotypes than fungicide treatments, so that the actions of the different drugs can be linked and the drugs classified into groups with similar modes of action.
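The grouping idea in the paragraph above can be illustrated with a minimal sketch. It assumes that each image has already been assigned a phenotype label by some upstream model. The treatment names, the phenotype labels (P1 to P3), and the rule of taking each treatment's most frequent phenotype are all hypothetical choices made for illustration, not the project's method.

```python
from collections import Counter, defaultdict


def group_treatments_by_phenotype(observations):
    """Group treatments that share the same dominant phenotype.

    observations: iterable of (treatment, phenotype) pairs, one per image.
    Returns a dict mapping each dominant phenotype to a sorted list of
    the treatments in which that phenotype is the most frequent one.
    """
    # Count how often each phenotype is seen under each treatment.
    counts = defaultdict(Counter)
    for treatment, phenotype in observations:
        counts[treatment][phenotype] += 1

    # Assign every treatment to its most frequent phenotype.
    groups = defaultdict(list)
    for treatment, phenotype_counts in counts.items():
        dominant, _ = phenotype_counts.most_common(1)[0]
        groups[dominant].append(treatment)
    return {phenotype: sorted(names) for phenotype, names in groups.items()}


# Hypothetical per-image labels: treatment names and phenotypes are made up.
observations = [
    ("fungicide_A", "P1"), ("fungicide_A", "P1"), ("fungicide_A", "P3"),
    ("fungicide_B", "P1"), ("fungicide_B", "P1"),
    ("fungicide_C", "P2"), ("fungicide_C", "P2"),
    ("control", "P3"), ("control", "P3"),
]
groups = group_treatments_by_phenotype(observations)
print(groups)
```

In this toy data, two of the three fungicides share a dominant phenotype, which is the situation in which treatments could tentatively be placed in the same group.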

The specific technology used will be multi-stage image processing and deep learning, specifically convolutional neural networks (CNNs). These networks abstract the information in an image into numerous intermediate representations, and then map the values across all representations to a class (one of the four detailed above, with the current datasets). So far, transfer learning has been used. This reuses the feature-extraction capabilities of InceptionV3, a large CNN built by Google and already trained on millions of labeled images. The fungal images are propagated through the network, and the final layers are then retrained for the specific classification task at hand.
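The transfer-learning pattern described above (frozen feature extractor, newly trained final layer) can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the project's pipeline: `frozen_features` is a hypothetical stand-in for the pretrained network (simple patch averaging, not InceptionV3), the "images" are synthetic 16-pixel vectors, and the only trained part is a softmax layer over the four classes.

```python
import math
import random

NUM_CLASSES = 4   # three fungicide treatments plus the control
PATCH = 4         # pixels per pooled patch in the stand-in extractor


def frozen_features(image):
    """Hypothetical stand-in for a pretrained, frozen network.

    Average-pools fixed patches and applies a ReLU. Nothing here is trained.
    """
    return [max(0.0, sum(image[i:i + PATCH]) / PATCH)
            for i in range(0, len(image), PATCH)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores(head, x):
    """Linear class scores of the final layer for feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in head]


def train_head(features, labels, epochs=50, lr=0.5):
    """Train only the new final layer by per-sample gradient descent."""
    head = [[0.0] * len(features[0]) for _ in range(NUM_CLASSES)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(scores(head, x))
            for k in range(NUM_CLASSES):
                # Gradient of the cross-entropy loss w.r.t. the class-k score.
                err = probs[k] - (1.0 if k == y else 0.0)
                for j, v in enumerate(x):
                    head[k][j] -= lr * err * v
    return head


def predict(head, x):
    s = scores(head, x)
    return max(range(NUM_CLASSES), key=s.__getitem__)


# Synthetic data: class c has a bright block of pixels at position c.
rng = random.Random(0)
images, labels = [], []
for label in range(NUM_CLASSES):
    for _ in range(20):
        image = [rng.gauss(0.0, 0.05) for _ in range(NUM_CLASSES * PATCH)]
        for i in range(label * PATCH, (label + 1) * PATCH):
            image[i] += 1.0
        images.append(image)
        labels.append(label)

# Forward pass through the frozen layers, then fit the final layer only.
features = [frozen_features(image) for image in images]
head = train_head(features, labels)
accuracy = sum(predict(head, x) == y
               for x, y in zip(features, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

The design point is that the expensive representation is computed once and reused, so only the small final layer has to be fitted to the fungal classes.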

Many other promising architectures from the literature will be applied on top of this base analysis, in order to compare the phenotypes and inform the drug design process, for example deconvolutional neural networks (DNNs) and unsupervised deep learning. DNNs reveal the key image features responsible for a classification decision by inverting the learned CNN structure. Unsupervised models learn without labeled training data, and so infer classes by recognizing similarities (quantified by densities in feature space). These models could be used both to infer clusters of similar phenotypic states (inputs could be synthetically created archetypal morphologies from each treatment class) and to create 3D virtual morphologies from the 2D images provided, a technique that has been pioneered with good results on human faces.
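As a rough illustration of inferring classes from densities in feature space, the sketch below implements a small DBSCAN-style clustering routine. DBSCAN is a standard density-based algorithm used here as a simple stand-in; it is not necessarily the unsupervised deep model the project has in mind. The 2D points are hypothetical placeholders for features extracted from images, and `eps` and `min_pts` are illustrative values.

```python
import math
import random

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or NOISE, by local density."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # core point: keep growing the cluster
        cluster += 1
    return labels


# Hypothetical 2D "feature vectors": three dense blobs plus one isolated point.
rng = random.Random(0)
centres = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [(cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5))
          for cx, cy in centres for _ in range(30)]
points.append((20.0, 20.0))

labels = dbscan(points, eps=1.5, min_pts=4)
num_clusters = len(set(labels) - {NOISE})
print(num_clusters, labels.count(NOISE))
```

Unlike a classifier, this routine is never told how many groups exist. The number of clusters emerges from the data, which is what would allow fewer phenotypes than treatments to be discovered.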

We also hope to incorporate ideas from theoretical physics, such as those explored in information theory and statistical mechanics, where opportunities arise.

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
BB/M011178/1                                   01/10/2015  25/02/2025
2133450            Studentship   BB/M011178/1  29/09/2018  23/12/2021