Modelling Human Genomic Variations Using Markov Random Field: A Feasibility Study

Lead Research Organisation: University of Manchester
Department Name: School of Medical Sciences

Abstract

Humans differ from one another, and those differences are encoded in our genomes. However, not all positions in the genome contain differences, and not all differences occur independently of each other. Evolution, migration and chance have made certain combinations of variants more frequent than others. We plan to demonstrate that a probabilistic approach called a Markov Random Field can be useful for modelling human genomic variation. First, we will computationally simulate "synthetic" data that resembles true genomic data: we will simulate how human populations expanded, migrated and mixed, and how survival and the possibility of reproduction depend on health and fitness. Then, we will use the Markov Random Field approach to model the simulated variation. The main aim will be to identify co-dependent genomic variants (i.e. combinations of variants that occur more often than expected if the variants were independent). Having identified the co-dependencies, we will aim to discriminate between types of combinations of variants. Some combinations might be frequent because the variants are close together in the genome, and hence are usually passed from parents to children together. Other combinations may simply reflect the distribution of variants across different subpopulations. Finally, other variants may be co-dependent because there is a synergistic fitness effect between those genomic positions; some combinations of variants would be beneficial, whereas others might be detrimental. After demonstrating the use of a Markov Random Field in modelling simulated human genomic variation, we will carry out a pilot project studying the genomic variation observed in a cohort of half a million individuals collected in the United Kingdom. The long-term aim of this project is to use this modelling approach to identify genomic variants associated with genetic diseases.
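
To make "combinations of variants that occur more often than expected" concrete, the minimal sketch below compares how often two variants are observed together with how often they would appear together if they were independent. It is an illustration only, not the Markov Random Field model proposed in the project: the function name `cooccurrence_excess` and the toy data are hypothetical, and each row is assumed to be one sequence with 1 marking the presence of a variant at a site.

```python
from itertools import combinations


def cooccurrence_excess(haplotypes):
    """Return {(i, j): D} with D = P(variant at i and j) - P(i) * P(j).

    D > 0 means the pair is seen together more often than independence
    would predict; D = 0 means no pairwise co-dependence is visible.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Frequency of the variant at each site
    freq = [sum(h[i] for h in haplotypes) / n for i in range(n_sites)]
    excess = {}
    for i, j in combinations(range(n_sites), 2):
        # Observed frequency of carrying both variants
        joint = sum(h[i] * h[j] for h in haplotypes) / n
        excess[(i, j)] = joint - freq[i] * freq[j]
    return excess


# Hypothetical toy data: 8 sequences x 3 variant sites.
# Sites 0 and 1 always travel together; site 2 is unrelated to both.
toy = [
    (1, 1, 0),
    (1, 1, 1),
    (1, 1, 0),
    (1, 1, 1),
    (0, 0, 0),
    (0, 0, 1),
    (0, 0, 0),
    (0, 0, 1),
]
excess = cooccurrence_excess(toy)
```

In this toy example the pair of sites (0, 1) has a positive excess while the pairs involving site 2 have none. A pairwise statistic like this cannot by itself tell apart the three explanations listed in the abstract (physical proximity, population structure, or a joint fitness effect); separating those is the part the project proposes to address with the Markov Random Field.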
