Developing computational tools for variant calling in single cell RNA sequencing

Lead Research Organisation: University of Oxford
Department Name: Interdisciplinary Bioscience DTP

Abstract

The usual way to detect somatic mutations is to sequence healthy and diseased samples from the same patient and compare them. The difficulty lies in filtering out technical errors, especially when dealing with small samples. The emerging field of single-cell RNA sequencing is particularly prone to errors because information is obtained from individual cells rather than averaged across a bulk sample. However, it provides much better resolution of how single cells function in the context of their microenvironments, and it therefore requires specialised methods of analysis. As the tools currently available for mutation analysis from single-cell RNA sequencing are still ineffective, novel ideas are needed for this type of sequencing to reach its full potential. Throughout the DPhil, I will therefore combine machine learning, statistical approaches and existing bioinformatics tools to develop algorithms that detect bona fide mutations in single-cell RNA sequencing data. Once this is achieved, we hope to extend our software to single-cell "TAPS" sequencing, a new DNA sequencing technique currently under development at the Ludwig Institute. This will allow us to gain an even more detailed insight into the functioning of individual cells at particular stages of their cycle and in a variety of environmental conditions.

BBSRC priority area: Data Driven Biology
This priority area aims to encourage the development of the bioinformatics tools and computational approaches required to generate new biological understanding from the huge volume and diversity of data now available. The main challenges of broad data-driven research relevant to this DPhil project include the integration and analysis of large or complex datasets from a variety of sources, new data visualisation approaches, and the development of effective tools for analysing data produced using the newest experimental techniques. The Data Driven Biology priority area also encourages the exploitation of advanced technologies such as high-performance computing, and the combination of computational solutions with the development of data-generating biological technologies, so that raw data become more reliable and need less advanced pre-processing.

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
BB/M011224/1                                   01/10/2015  31/03/2024
2108183            Studentship   BB/M011224/1  01/10/2018  30/09/2022