Learn From The Best: training AI using biological expert attention

Lead Research Organisation: University of Nottingham
Department Name: School of Computer Science


Artificial intelligence (AI) is having a massive impact on many disciplines, including biological science. Its power is impressive and its adoption will change the nature of research, but the way it is currently developed has severe practical limitations. Despite recent developments in machine learning and AI, humans still possess an unrivalled ability to look at a picture and understand exactly what is going on. A human expert is able to look at a picture of a plant with disease symptoms, for example, and immediately quantify the severity of the infection. AI promises to revolutionise bioimage analysis, but as of today an expert human will outperform an AI that has only a small set of images to learn from.

One important difference between humans and modern AI is the way each is taught to perform a task. A human will learn which parts of an image are important, then scan the images to find these areas before reaching a scoring decision. AI is typically trained using labelled data, where only the output label matters. An AI does not know which parts of the image are important, or where it should look. This often leads to poor performance when the task is challenging, or when only small datasets are available. To achieve the impressive results documented in the news, current AI must use very laborious and inefficient training processes, which are often impractical in a real-world scientific setting.

This project will develop a new, smarter way to train artificial intelligence methods, using similar mechanisms to how human experts make decisions. To do this, our AI will study how human experts approach the same problems by using gaze tracking to see where an expert looked, and when. The result will be AI methods that learn to look in the right places, and so are able to take more difficult scoring decisions with less training data than they would previously need. Put simply, we believe that an AI that is able to look in the correct places before making a decision will be more effective than one that attempts to simply make a decision without knowing where to look.

In this project we will first develop the hardware and software approaches necessary to capture expert human gaze during image scoring. This raw gaze information will be processed using novel algorithms, and fed into a new deep learning AI system along with the labelled scores, guiding it towards more informed decision making. The AI will examine where in the images human experts looked when providing an image label, and will learn to look in those same places when it replicates the same task. This is a new approach to training AI. Finally, we will build a new type of deep neural network AI system that can be guided by this additional information, knowing where to look, and what to do.
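As an illustration of the gaze-processing step, the sketch below turns a list of recorded fixations into a normalised attention map that could be supplied to a network alongside the image label. The project's own algorithms are not described in this summary, so this is only an assumed, commonly used approach (a Gaussian-smoothed fixation density map). The function name `gaze_heatmap`, the `(x, y, duration)` fixation format and the `sigma` parameter are all hypothetical.

```python
import numpy as np


def gaze_heatmap(fixations, height, width, sigma=25.0):
    """Turn (x, y, duration) fixations into a normalised attention map.

    Assumption: x and y are pixel coordinates and duration is dwell time in
    seconds. The result has shape (height, width) and sums to 1.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width), dtype=np.float64)
    for x, y, duration in fixations:
        # Each fixation contributes a Gaussian blob weighted by dwell time.
        sq_dist = (xs - x) ** 2 + (ys - y) ** 2
        heat += duration * np.exp(-sq_dist / (2.0 * sigma ** 2))
    total = heat.sum()
    if total == 0:
        # No fixations recorded: fall back to a uniform map.
        return np.full((height, width), 1.0 / (height * width))
    return heat / total


# Hypothetical fixations on a 64 x 64 image: two long looks at a lesion,
# one brief glance elsewhere.
fixations = [(40, 30, 0.8), (42, 33, 0.5), (10, 50, 0.1)]
heatmap = gaze_heatmap(fixations, height=64, width=64, sigma=4.0)
```

Weighting by dwell time is one design choice among several; fixation order or pupil data could also be used, but nothing in this summary indicates which signals the project relies on.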

We will demonstrate this work on important datasets of plant disease, but we also believe this approach will massively reduce the time required to annotate datasets across all fields of life and biomedical science, and at the same time produce even more impressive and accurate AI results. This could represent a step-change in the adoption and ease of use of AI tools in the world of bioscience, allowing for more efficient training on smaller image datasets.

Technical Summary

Human experts can become very good at making scoring decisions on images. High-throughput phenotyping experiments demand that trained experts make decisions quantifying properties in images, for example how diseased a specimen is, or whether a sample is showing a particular response. Deep learning allows computers to make similar decisions, but requires a lengthy and precise image annotation process in order to train.

One key difference between a human expert and a trained deep network is the human's ability to identify the important locations within an image first, and only then combine these into a score or prediction. AI is typically trained in a supervised manner: a series of example images and output labels are provided in the hope that the AI learns a reliable mapping from image to score. This is often not the case. When the training process gives no indication of which parts of an image are important (as the pixel-level annotation used for semantic segmentation would), the accuracy of the trained AI is limited.

What if we could guide a deep network to first identify the important parts of an image, as a human expert does? This project will do this by combining traditional annotated labelling of images with expert eye tracking information captured cost-free at the same time. Human gaze can now be captured accurately and in real time using low cost eye tracking hardware. This gaze information will be used to train the network to identify where the expert was looking when scoring. These regions will be used to make more informed, more accurate decisions. We will demonstrate the improvement attentional training can provide on three varied datasets featuring plant disease, but the approach will remain general, applicable to any scoring task within the biosciences, and further afield in areas such as medical diagnosis.
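One way to picture training on both the score and the gaze is a combined loss: a standard classification term for the label, plus a term that penalises the network when its predicted attention map differs from the expert's gaze map. The sketch below is a minimal NumPy illustration of that idea under stated assumptions. The network architecture and loss actually used in the project are not given here, and the names `gaze_guided_loss`, `attention_logits` and `weight` are hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def gaze_guided_loss(class_logits, true_class, attention_logits, gaze_map,
                     weight=1.0, eps=1e-12):
    """Cross-entropy on the score plus KL(gaze || predicted attention).

    Assumption: gaze_map is a non-negative array with a positive sum, and
    attention_logits has the same shape.
    """
    # Supervised term: how well the predicted score matches the expert label.
    probs = softmax(class_logits)
    score_loss = -np.log(probs[true_class] + eps)

    # Attention term: how far the model's attention is from the expert gaze.
    attention = softmax(attention_logits.ravel())
    gaze = gaze_map.ravel() / gaze_map.sum()
    attention_loss = np.sum(gaze * (np.log(gaze + eps) - np.log(attention + eps)))

    return score_loss + weight * attention_loss


# Toy example: the expert looked at a single cell of a 4 x 4 grid.
gaze = np.zeros((4, 4))
gaze[1, 2] = 1.0
looks_right = np.zeros((4, 4))
looks_right[1, 2] = 5.0   # attention peaks where the expert looked
looks_wrong = np.zeros((4, 4))
looks_wrong[3, 0] = 5.0   # attention peaks somewhere else
logits = np.array([2.0, 0.5, -1.0])

loss_right = gaze_guided_loss(logits, 0, looks_right, gaze)
loss_wrong = gaze_guided_loss(logits, 0, looks_wrong, gaze)
```

With identical score predictions, the model that attends to the gazed region receives the lower loss, which is the behaviour the attentional training described above aims for. The `weight` parameter trades off the two terms and would need tuning in practice.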

Planned Impact

The project aims to deliver three main outputs:
1. Develop a new method to collect and combine eye tracking data with deep learning to produce more efficient deep networks
2. Demonstrate the approach on three plant phenotyping datasets examining disease severity
3. Publish and disseminate this method via the use of our novel software and cheap, commercially available eye tracking hardware, including to a wider audience outside plant science

Who will benefit from these outputs, and how?
Research scientists
As detailed in the Academic Beneficiaries, bioscientists, computer scientists and scientists from further afield will benefit from the resource in a wide variety of ways. We anticipate equal benefit to industry and academic research. Specifically, the new approach will enable faster, cheaper and potentially more accurate AI development for a wide variety of phenotyping and diagnostic experiments.

Any industrial application that involves deep learning across image datasets with text labels will potentially benefit from this approach, both in and outside of bioscience. For example, commercial, high throughput phenotyping systems will directly benefit from the reduced training requirements of the approach. Phenotyping as part of breeding programs now has the potential to become fully high throughput and easier to retrain across varying research questions and domains.

Biomedical Imaging
As a team we are excited about expanding the approach to cover biomedical image datasets, and specifically the task of clinical diagnosis via machine learning. Again, providing an easier mechanism to train deep learning will be of great assistance in a research area where AI is already making a sizable impact. Combined with the explainable aspect of the networks, it will allow for software that can show where an AI looked in an image when making a decision.

The public will benefit through numerous outreach activities throughout the project. Eye tracking will be demonstrated via an interactive demo where we can show first hand the power of seeing where someone is looking when examining an image. This will allow us to discuss the technology involved (eye tracking), the AI concepts in development (deep learning) and the underlying science (plant phenotyping as it relates to global food security).

Training of PDRAs
The project promises to deliver excellent impact through training of the project team and the public. The biology PDRA will gain experience working in a multi-disciplinary environment, and working knowledge of modern approaches to machine learning and AI. These are extremely valuable skills. The CS PDRA will gain experience developing software tools for a wide audience whose specific needs differ from those of the computer science research community. They will also likely gain new skills working with eye tracking hardware.

The project will provide code and software allowing people to gain experience in the collection of attentional data, and training of models. A workshop alongside the project will explore its use beyond the plant science exemplar datasets used here. Additional online training will be provided in the form of a tutorial video, in a similar format to those prepared previously by the investigators in the project team.


Description This research has explored the use of human gaze to better inform the training of AI models. These models have been used to detect and quantify the amount of plant disease in crops. Locating areas of interest in images can be time consuming, particularly because it requires careful annotation from an expert in order to guide the training of a network. In this research we captured the gaze of the experts while they were performing normal expert tasks, essentially obtaining the gaze "for free" with minimal overhead. We then trained deep networks to predict the same disease as the experts, but using the gaze information to help. We found that the accuracy of the prediction from the AI was much improved when gaze was included, showing that on small datasets we can use gaze to better inform the training process.

The publication of this work has been delayed by a few months due to Covid-19, but the manuscript is in preparation and will be submitted soon.
Exploitation Route This work could potentially be used in many fields where expert scoring - going from an image to some score or label - is required. We will publish this work and engage with other research communities on this.
Sectors Agriculture, Food and Drink, Digital/Communication/Information Technologies (including Software), Healthcare, Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology

Description Delivering tutorial on multi-task deep learning for PhenomUK Workshop, Edinburgh 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Invited to deliver a formal workshop tutorial on multi-task deep learning in Edinburgh, funded by PhenomUK. Multi-task learning is one of the approaches utilised during this project, and the work will inform my teaching of this short course.
Year(s) Of Engagement Activity 2022
URL https://www.phenomuk.net/event/ai-for-plant-image-analysis-where-we-are-and-whats-next-full-day-work...