Learning signalling pathways from single-cell RNA profiles of CRISPR perturbations

Lead Research Organisation: University of Cambridge
Department Name: Cancer Research UK Cambridge Institute

Abstract

How is information flow in the cell organised? How are outside signals transferred to the cell nucleus to turn transcriptional programs on or off? The proposed research addresses these questions by combining data from novel experimental techniques, which have only been published in the last few months, with an established computational approach pioneered by the applicant.

The novel experimental techniques use a gene editing method called CRISPR to perturb genes in a cell and then measure the cell's gene expression response using single cell RNA sequencing. By applying many perturbations across many cells, the data give a comprehensive picture of the effects of gene perturbations and thus of the function the genes have in the cell. The data are an ideal fit for a computational method the applicant has developed to infer gene interactions and pathways from the expression effects of gene perturbations. The method is called Nested Effects Models (NEMs). Over the last 12 years the method has been developed extensively, with many key ideas introduced in different applications (where genes were perturbed differently or effects were measured differently). These key ideas can now be translated to the new data from single cell RNA seq CRISPR screens.

The goals of the project are, first, to understand the features of the new type of data better and make sure that perturbation effects can be estimated robustly; second, to tailor NEMs to the specifics of these new data; third, to understand what effect different experimental parameters have and thus be able to design better experiments in the future; and finally, in collaboration with leading experimental scientists, to use the methodological advances to gain new insights into biology. Two case studies will address regulatory networks in T helper cells and how the JAK-STAT pathway shapes epigenetic landscapes.

Technical Summary

Several recent high-impact papers introduced experimental techniques for single-cell based genetic screens to understand gene function and cellular signalling pathways. These techniques combine single-cell RNA sequencing (scRNA-seq) and clustered regularly interspaced short palindromic repeats (CRISPR)-based perturbations to massively scale up the resolution and scope of previous genetic screening technologies. In a first step, CRISPR vectors deliver guide RNAs (gRNAs) targeted at particular genes to a pool of cells or to one well of an array. In a second step, the RNA of the cells carrying the different perturbations is sequenced to measure the transcriptional effects of the perturbations, which provides information on gene function and pathway activity. The technology is flexible and will most likely soon be used very widely across molecular biology. Because these technologies are so new, tailored computational analysis of the resulting data is lagging behind experimental advances.

Here I propose a machine learning approach to efficiently analyse scRNA-seq CRISPR screens and infer gene interaction networks and pathways of information flow in the cell. The approach is based on an established machine learning method called Nested Effects Models (NEMs), which was pioneered by the applicant. NEMs are built on inferring subset relations between the effects of different perturbations and are thus complementary to other graphical models such as Bayesian networks and Gaussian graphical models. Over the last twelve years NEMs have been refined, extended, and applied by a world-wide community of independent groups, and there now exists a substantial body of methodological developments and experience in applications, which we propose to leverage for the analysis of scRNA-seq CRISPR screens. Working with leading developers of scRNA-seq CRISPR screens, we will use our methodological advances to optimise the study design of future screens and showcase the power of our approach in collaborative case studies.
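The subset idea behind NEMs can be illustrated with a small, noise-free toy example. The sketch below is not the applicant's method: real NEMs score candidate networks with a likelihood that tolerates false positive and false negative effects, whereas this sketch simply assumes that the set of affected genes is known exactly for each perturbation. The gene names (S1, S2, S3, E1 to E4) and both function names are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def infer_subset_edges(effects):
    """Return edges (a, b) whenever the effects of perturbing b are a
    proper subset of the effects of perturbing a.

    Assumption: `effects` maps each perturbed gene to the exact set of
    downstream genes whose expression changed (no measurement noise).
    """
    edges = set()
    for a, b in permutations(effects, 2):
        if effects[b] < effects[a]:  # proper subset: a is placed upstream of b
            edges.add((a, b))
    return edges


def transitive_reduction(edges):
    """Drop every edge (a, c) that is implied by a path a -> b -> c."""
    reduced = set(edges)
    for a, c in edges:
        for b in {y for (x, y) in edges if x == a}:
            if (b, c) in edges:
                reduced.discard((a, c))
                break
    return reduced


# Hypothetical toy screen: three perturbed genes, four effect genes.
effects = {
    "S1": {"E1", "E2", "E3", "E4"},
    "S2": {"E2", "E3", "E4"},
    "S3": {"E4"},
}

edges = infer_subset_edges(effects)
hierarchy = transitive_reduction(edges)
print(sorted(hierarchy))
```

In this toy screen the effects are nested (S3 within S2 within S1), so the reduced graph is a linear chain from S1 through S2 to S3. With noisy single-cell data the subset relations hold only approximately, which is why a probabilistic scoring of candidate networks is needed in practice.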

Planned Impact

This proposal integrates cutting-edge experimental techniques (single cell genomics, CRISPR) with a powerful machine learning approach (Nested Effects Models). It uses novel experimental techniques and innovative computational analyses in an inter-disciplinary approach. Thus, a key group of beneficiaries are academic researchers and this proposal contributes to worldwide academic enhancement.

The proposed research combines machine learning and computational biology with applications in immunology and epigenomics. Another group of beneficiaries are early career researchers just after their PhD, because the proposed project helps to train them to become highly skilled researchers. The skills they will learn are highly sought after both in academia and in industry. The proposed project thus enhances the knowledge economy of the UK.

The research outcomes of the project are a better understanding of regulatory networks and signalling pathways on a single cell level, which can translate into improving health and well-being. Other potential beneficiaries are thus the biomedical community and patients, because the improved understanding of basic biological mechanisms provided by the research proposed here might translate into new drugs against these mechanisms in disease.

Other beneficiaries come from the general public and we will work on increasing public engagement with research by giving talks to lay people, for example charity supporters who visit our institute regularly.

Publications

Cheng Z (2022) The Genomic Landscape of Early-Stage Ovarian High-Grade Serous Carcinoma. in Clinical cancer research : an official journal of the American Association for Cancer Research

Hosseini SR (2019) Estimating the predictability of cancer evolution. in Bioinformatics (Oxford, England)

 
Description This grant funded our work into developing novel methods to analyse a particular type of data that shows what effect gene perturbations have on other genes. Perturbations were done by CRISPR and effects were measured by single cell RNA seq. The methods we developed integrate the different perturbation effects into a functional network that can tell us how the different parts of the perturbed biological process work together to achieve the observed outcome. Our first main achievement is a comprehensive method benchmarking to explore the power and limitations of existing approaches and to define where new developments are needed. We are currently writing this up and will submit soon. Based on this benchmarking, our second achievement is a new network inference method.
Exploitation Route We will apply our methods to experimental data generated in my own lab and by collaborators. We make all our code available so that every researcher in this field can apply our approaches to their own perturbation data.
Sectors Pharmaceuticals and Medical Biotechnology

 
Description CRUK Future Leaders Training Event
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Contributing to the event, for two elements of the day I am involved in organising:
• A speaking opportunity at 12pm on the day, for yourself and another senior researcher with 'CRUK heritage', telling the story of how CRUK has helped them. The session would be 1 hour, which includes a talk and then the opportunity for Q&A from the Philanthropy audience:
  ◦ Overview / introduction to your career / background
  ◦ What has CRUK funding done for you?
  ◦ What other support have you had?
  ◦ What sets CRUK apart in your view?
• The other opportunity is a small poster session from 2pm-3pm on the day of the event, where we would like to have a handful of PhD students speak informally with small groups of our fundraisers about their work and experiences as a PhD student. The majority of our fundraisers do not have a science background themselves but are looking to learn and be inspired by the work, so this would also be a good opportunity for the students to practise speaking about their work to a lay audience. Would you be happy to suggest a PhD student within your group, who I could ask to be involved with this part of the event?
Year(s) Of Engagement Activity 2023
 
Description Cambridge Computational Oncology Meeting 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Cambridge Computational Oncology Meeting with three talks presented at CRUK CI on November 4, 2022.
Year(s) Of Engagement Activity 2022
 
Description DiMeN: Responsible Research and Quantitative Methods: virtual workshop at Leeds University 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Florian gave the keynote talk, 'Five selfish reasons to work reproducibly', at this virtual two-day workshop. The aim was to help raise awareness of UKRN among early career researchers across the North of England. The workshop took place on 29th - 30th March 2022.
Year(s) Of Engagement Activity 2022
URL http://www.dimen.org.uk
 
Description Good Lab Practice talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Florian gave a talk on 'Five Selfish Reasons to Work Reproducibly' as part of the Good Lab Practice series, which is held mainly for postgraduate students. Undergraduate students are also always welcome at CI.
Year(s) Of Engagement Activity 2022
 
Description Postgraduate Talk on 'Critical Reading' on 8 November 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Florian gave a talk on 'Critical Reading' as a member of the Postgraduate Committee at CRUK CI on 8 November 2022.
Year(s) Of Engagement Activity 2022
 
Description Scientific Writing Day 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Florian participated in a Scientific Writing Day and presented 'Writing an Abstract' along with other colleagues on 17 May 2022.
Year(s) Of Engagement Activity 2022
 
Description Talk on Health-AI in cell and molecular biology 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact The Health-AI Summit, funded by the Technology Mission Fund, will be a 1-day meeting on the 20th February 2023 at The Alan Turing Institute, which aims to:
• identify critical challenges within the domains of cell and molecular biology;
• share knowledge on data assets and novel methodologies to address those challenges;
• identify areas where open data science and AI can make the greatest difference, towards improving health outcomes for all.

Collectively we will focus on identifying areas of alignment between health challenges and open data science, towards creating a shared vision for the future of Health-AI, aligned to areas where the potential additionality is greatest.

Your input on the day will help to shape a roadmap for Health-AI in cell and molecular biology, which will be published and inform potential future calls through subsequent Technology Mission Funds. More information is available on our website: Scoping the landscape for the future of AI and machine learning in health | The Alan Turing Institute

Attendees have been cordially invited to ensure we have a breadth of expertise and organizations represented.
Year(s) Of Engagement Activity 2023
URL https://www.turing.ac.uk/research/research-projects/scoping-landscape-future-ai-and-machine-learning...
 
Description Title: A pan-cancer compendium of chromosomal instability 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact ICCB Seminar Series on Computational Cancer Biology at the University Hospital Cologne on November 3, 2022. Our lab develops and applies algorithms and computational methods to understand how cellular and intra-tumour heterogeneity (ITH) arises and how it affects tissue and patient phenotypes in space and time. We are particularly interested in chromosomal instability (CIN) and somatic copy-number alterations (SCNA), a key characteristic that separates cancerous from healthy somatic tissue. In our methods we leverage statistical and machine learning approaches as well as classical computer science algorithms and simulations and develop these models in close collaboration with our experimental partners.
Year(s) Of Engagement Activity 2022
URL https://iccb-cologne.org/groups/schwarzlab/profile
 
Description Turing-Roche Predictive Modelling Workshop 11 October 2022 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact At the workshop we're also exploring 3 predictive modelling sub-themes
Year(s) Of Engagement Activity 2022
URL http://turing.ac.uk