Mathematical and bioinformatics based tools to explore the impact of gene editing on the geometric principles governing the 3D structure of the genome

Lead Research Organisation: University of Bath
Department Name: Biology and Biochemistry

Abstract

Gene editing can have unforeseen consequences on the expression of genes adjacent to the targeted gene. Given the prospect of clinical gene editing to repair disease mutations using CRISPR/Cas9 technology, there is a clear need to understand the impact of these technologies on the 3D structure of the genome. This project will therefore apply and develop mathematical and bioinformatics-based tools to explore the geometric principles governing the 3D structure of the genome, drawing on large-scale 'omic' data sets for human epigenetic profiles and state-of-the-art imaging technologies. We will then examine the impact of targeted gene editing on the 3D genome structure.
Specifically, this project will first investigate the topological structure of the genome using chromatin conformation analysis. We will apply an allele-specific chromatin conformation analysis strategy known as Capture HiC (C-HiC), a variation of the HiC methodology that allows the genomic region of interest to be enriched and mapped at a much greater resolution than standard HiC methodologies. The genomic region of interest is the DIRAS3/GNG12-AS1 system, which displays negative gene non-autonomy: the transcription levels of DIRAS3 and GNG12-AS1 are known to be inversely correlated during the cell cycle. This negative correlation may be explained by a promoter competition model, whereby the promoter activity of one gene reduces the accessibility of the promoter of the other gene. This model requires chromatin looping to bring the promoters into close proximity, which suggests that chromatin conformation may play an important role in transcriptional interference and the regulation of clusters of genes.
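As an illustration of what mapping a captured region at higher resolution involves, the following minimal sketch bins read-pair contacts that fall inside a single region of interest into a symmetric contact matrix. It is not taken from the project: the function name, the coordinates and the bin size are all hypothetical, and a real Capture HiC analysis would also need alignment, filtering and normalisation steps that are omitted here.

```python
from typing import List, Tuple


def capture_contact_matrix(
    pairs: List[Tuple[int, int]],
    region_start: int,
    region_end: int,
    bin_size: int,
) -> List[List[int]]:
    """Count contacts between fixed-size bins inside one capture region.

    `pairs` holds (pos1, pos2) coordinates of read pairs on one chromosome.
    Only pairs with both ends inside [region_start, region_end) are counted.
    """
    n_bins = -(-(region_end - region_start) // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in pairs:
        inside = (region_start <= pos1 < region_end
                  and region_start <= pos2 < region_end)
        if not inside:
            continue  # discard pairs with an end outside the capture region
        i = (pos1 - region_start) // bin_size
        j = (pos2 - region_start) // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # mirror the count to keep the matrix symmetric
    return matrix


# Toy example with made-up coordinates (assumption, not project data).
pairs = [(1_000, 1_500), (1_200, 4_800), (4_100, 4_900), (9_000, 1_000)]
matrix = capture_contact_matrix(
    pairs, region_start=1_000, region_end=5_000, bin_size=2_000
)
```

A smaller `bin_size` gives a finer matrix, which is only useful when the region has been enriched enough for each bin pair to collect reads.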
In addition, the structured environment around the DNA is what defines epigenetic factors of gene expression. There is growing public interest in epigenetics research and in how epigenetics can explain the effects of lifestyle choices on the DNA. There is also strong public and health interest in how gene editing will affect genetic health. In the first instance, the results of our study will form a platform that can be built upon to develop algorithms for predicting which regions of the genome can be targeted safely and more precisely by gene editing technologies. Data-driven biology and systems approaches to biomedical sciences provide the means for large-scale analysis of multiple cellular and biological features. The novelty of this project is that we will be among the first to use such data to understand how three-dimensional structures affect the working of our genome.

Publications

Title HiCFlow - A comprehensive HiC data analysis workflow in Snakemake
Description HiCFlow is a Snakemake workflow built to automate the analysis of HiC data, from raw sequencing reads to publication-ready figures and meaningfully interpretable results. The workflow is configured from a single user-provided configuration file. Upon execution, software installation is handled automatically and data is processed to completion. The workflow can also be scaled to software environments. The primary workflow can perform HiC, region capture HiC and allele-specific HiC analysis. Uniquely, the workflow can perform built-in SNP calling and haplotype phasing of Hi-C data to build diploid contact matrices even when phased genomes are not available. It incorporates a variety of other published tools to provide a complete pipeline. It is designed with usability and reproducibility in mind.
Type Of Material Data analysis technique 
Year Produced 2019 
Provided To Others? No  
Impact This pipeline has been used extensively to process in-house and published Hi-C datasets to further our understanding. Ultimately, we hope to publish this pipeline and make it accessible to inexperienced users wishing to attempt Hi-C analysis. The URL below will be publicly available once the pipeline is published.
URL https://github.com/StephenRicher/HiCFlow
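
The allele-specific analysis mentioned in the description relies on assigning read pairs to a haplotype before building diploid contact matrices. The sketch below shows one simple way this could be done once phased SNPs are available. It is an assumption-laden illustration, not HiCFlow's actual implementation: the data structures, function names and the rule for resolving conflicts are all hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

# Hypothetical phased SNPs: position -> (haplotype-1 base, haplotype-2 base).
PhasedSnps = Dict[int, Tuple[str, str]]


def assign_haplotype(read_bases: Dict[int, str], snps: PhasedSnps) -> Optional[int]:
    """Return 1 or 2 if all informative bases support one haplotype, else None."""
    votes: Counter = Counter()
    for pos, base in read_bases.items():
        if pos in snps:
            hap1_base, hap2_base = snps[pos]
            if base == hap1_base:
                votes[1] += 1
            elif base == hap2_base:
                votes[2] += 1
    if len(votes) == 1:
        return next(iter(votes))
    return None  # no informative SNP, or the SNPs disagree


def pair_haplotype(
    end1: Dict[int, str], end2: Dict[int, str], snps: PhasedSnps
) -> Optional[int]:
    """Assign a read pair to a haplotype; None if uninformative or conflicting.

    An end without a clear assignment is treated as uninformative, so the
    pair inherits the haplotype of the other end (an assumption of this sketch).
    """
    hap1 = assign_haplotype(end1, snps)
    hap2 = assign_haplotype(end2, snps)
    if hap1 is not None and hap2 is not None and hap1 != hap2:
        return None  # the two ends support different haplotypes
    return hap1 if hap1 is not None else hap2


# Toy data (made up): each read end maps covered positions to observed bases.
snps = {100: ("A", "G"), 250: ("C", "T")}
pair_a = pair_haplotype({100: "A"}, {250: "C"}, snps)  # both ends on haplotype 1
pair_b = pair_haplotype({100: "G"}, {300: "A"}, snps)  # only one end informative
pair_c = pair_haplotype({100: "A"}, {250: "T"}, snps)  # ends disagree
```

Pairs assigned to haplotype 1 or 2 would then be binned into separate contact matrices, while unassigned pairs are left out of the allele-specific maps.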