Reverse engineering cell competition using automated microscopy and recurrent neural networks

Lead Research Organisation: University College London
Department Name: Structural Molecular Biology

Abstract

The aim of this project is to use state-of-the-art machine learning (ML), automated time-lapse microscopy, and proteomics to understand cell competition. Cell competition is a phenomenon that results in the elimination of less fit cells from a tissue - a critical process in development, homeostasis and disease. The viability of loser cells depends strongly on context: when they are cultured alone, they thrive, but when in a mixed population, they are eliminated by cells with greater fitness. In development, competition acts as a quality control mechanism and also participates in pattern formation. In ageing, competition may eliminate senescent cells from tissues to prevent age-related pathologies. In stem cell niches, competition may determine which cells differentiate and which remain pluripotent.

A number of mechanisms of cell competition have been identified to date, involving either biochemical competition (for example, competition for pro-survival growth factors) or mechanical competition (for example, a fast-growing clone compresses cells in a slow-growing clone, which results in the extrusion of cells from the now denser slow-growing clone). While competition was initially thought to take place only at the interface between cell lineages, the discovery of mechanical competition revealed that this is not necessarily the case and that extrusion may take place several cell diameters away from this interface.

To date, the vast majority of studies have examined the biochemical mechanisms of competition in single cells and competition at the population level; however, it is becoming increasingly clear that the topology of the tissue plays a central role in determining the outcome of competition. Despite this, cell competition remains poorly understood: we do not know the interaction "rules" that determine each cell's fate. This is largely because most studies only quantify whole-population shifts at very few time points and for few cells.

One major obstacle to understanding how population shifts occur as a result of single cell behaviours is that it requires thousands of cells to be tracked over hundreds of time points. To address this challenge, we recently built the first deep learning and automated single-cell microscopy system to analyse cell competition. We used deep convolutional neural networks to analyse the cell cycle state of millions of single cells in mechanical competition, including cell division and death.

In this project, we will use the full scope of the information contained in our time-lapse data to determine the physical and topological parameters that govern cell competition. We will develop a deep learning approach to extract time-dependent features of a single cell's environment that predict its fate in biochemical and mechanical competition. We will use the ML model to determine which physical and topological features govern cell competition. We will combine ML and proteomics to identify proteins involved in the commitment pathway, determine their hierarchy in the signalling cascade, and identify convergent pathways.

Technical Summary

The aim of this project is to use state-of-the-art machine learning (ML), automated time-lapse microscopy, and proteomics to understand cell competition. Cell competition is a phenomenon that results in the elimination of less fit cells from a tissue - a critical process in development, homeostasis and disease. The viability of loser cells depends strongly on context: when they are cultured alone, they thrive, but when in a mixed population, they are eliminated by cells with greater fitness.

The proposed research will seek to understand how cell fate is determined by local interactions within heterogeneous cell populations and to determine the signalling pathways leading to fate commitment in competition. We will use our automated time-lapse microscopy platform, combined with deep neural networks (in particular LSTM recurrent neural networks), to extract time-dependent features which can be used as early markers of cell competition. We will use the RNN, combined with quantitative imaging of molecular markers and proteomic discovery, to identify proteins involved in the commitment pathway, determine their hierarchy in the signalling cascade, and identify convergent pathways.

Overall, these aims will allow us to develop the potential of ML for biological discovery. Our strategies will be applicable to any fate commitment such as during development, in stem cells, or cancer.

Planned Impact

The proposed research will seek to understand how cell fate is determined by local interactions within heterogeneous cell populations and to determine the signalling pathways leading to fate commitment in competition. For this, we will develop new experimental and analytical strategies to leverage the potential of deep learning approaches for biological discovery. This proposal will primarily benefit academics in the fields of cell biology, stem cell biology, and cancer biology but, in the longer term, through an understanding of the factors underlying fate commitment, we envisage that our research will benefit the pharmaceutical industry in the UK and clinical medicine. We plan a number of activities to achieve impact; these are set out in the following deliverables and impact activities.

Academic advancement and innovation:
We expect our research to attract interest from many fields in the global scientific community such as cell biology, developmental biology, stem cell biology, biophysics, proteomics and machine learning applied to fundamental biological questions.
To ensure our findings have the highest possible impact, we will present our preliminary results at high profile conferences that cover relevant topics including cell biology, biophysics, and machine learning throughout the duration of the grant. Where possible we will disseminate our findings in general audience journals.

Training and professional development:
Both GC and ARL are actively involved in interdisciplinary training activities at UCL. GC and ARL participate in teaching in the CoMPLEX DTP (UCL Centre for Computation, Mathematics and Physics in the Life Sciences and Experimental Biology) and are members of the LiDO interdisciplinary BBSRC CDT. Both PIs are involved in the development of an MSc and PhD programme in the newly founded Institute for the Physics of Living Systems at UCL. The project described here will be used to introduce students from different backgrounds to interdisciplinary research in the Life Sciences. Elements of the work will be used as projects for students in the CoMPLEX and LiDO CDTs.
Throughout the project, the PDRA involved will receive cross-disciplinary mentoring and benefit from regular interactions in both the LCN and ISMB. In addition, they will be involved in mentoring students and will develop their own mentoring and leadership skills. This will aid their progression towards an independent group leader position. We will also provide three short summer internships for undergraduate researchers, who will be involved in generating training data for the machine learning models. This will be valuable research experience for these young scientists.

Commercialisation and exploitation:
We envisage that, in the longer term, our integrated experimental, analytical and machine learning approach will be of interest to clinical medicine, bioengineering start-ups, and the pharmaceutical industry. Indeed, we anticipate that our approach could be utilised to design new therapeutic treatment courses. In addition, we anticipate interest from the stem cell community because our long-term imaging and analysis pipeline will allow identification of circulating stem cells. Should there be industrial interest, we will study the possibility of designing an approach suited to high throughput screening.
UCL has efficient mechanisms to assist academics in the development of commercial applications of their research outputs and in the management of intellectual property rights (see for instance UCL Business).

Increasing public engagement and understanding:
Previously, members of the team have engaged with the wider community through public discussions, news organisation interviews, and school visits. Through this type of outreach we expect this work to reach a wide audience, giving the public a better understanding of multidisciplinary research and an appreciation of the remarkable natural world in which we live.
 
Description
* We generated large new experimental image and proteomic datasets.
* We created new computational tools which enable analysis of these image datasets. In particular, we developed tools to track cells over time. We published these tools and they are in use by the broader scientific community.
* We explored the use of artificial intelligence to gain new insights from biological data. We developed new computational approaches that can learn explainable models of cell behaviour, and these models can be used in applications such as drug screening.
Exploitation Route
* Drug screening technologies may be applicable to the biopharmaceutical industry.
* The computational tools developed are in use by the broader scientific community.
* Experimental datasets have been used by other research groups to develop new computational tools.
* We have demonstrated that AI can be used to learn basic principles from a scientific dataset, paving the way for future efforts to use AI for science.
Sectors Chemicals; Digital/Communication/Information Technologies (including Software); Healthcare; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

 
Description Development of napari plugins to enable single-cell tracking
Amount $20,000 (USD)
Funding ID 2021-240313(5022) 
Organisation Chan Zuckerberg Initiative 
Sector Private
Country United States
Start 12/2021 
End 07/2022
 
Title Cell tracking reference dataset 
Description Raw microscopy images and associated data used for validation of cell tracking algorithms. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://rdr.ucl.ac.uk/articles/dataset/Cell_tracking_reference_dataset/16595978/1
 
Title cellX-predict datasets 
Description Machine learning dataset for cellX-predict 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://rdr.ucl.ac.uk/articles/dataset/cellX-predict_datasets/16578959/1
 
Description cellX-predict software 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
URL https://rdr.ucl.ac.uk/articles/software/cellX-predict_code/19207923
 
Description Invited talk at CellBio 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited talk at ASCB CellBio2022
Year(s) Of Engagement Activity 2022
URL https://www.ascb.org/cellbio2022/
 
Description Invited talk at CytoData 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited talk at CytoData symposium at the Allen Institute, USA.
Year(s) Of Engagement Activity 2022
URL https://alleninstitute.org/what-we-do/cell-science/events-training/cytodata-symposium-2022/
 
Description Talk at HHMI Janelia Research Campus 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited research seminar at HHMI Janelia Research Campus
Year(s) Of Engagement Activity 2022