Fine mapping the results of a multiple sclerosis whole genome screen for association

Lead Research Organisation: University of Cambridge
Department Name: Clinical Neurosciences

Abstract

Multiple sclerosis is a disease of the brain and spinal cord that affects more than 60,000 people in the UK. After trauma, it is the commonest cause of chronic neurological disability in young adults. Although the cause of the disease is unclear, it is well known that genetic factors influence susceptibility to the condition. We believe that identifying the relevant genes will provide invaluable information about the nature of multiple sclerosis, which will ultimately lead to more effective treatments or perhaps even enable us to prevent the disease.
Finding the handful of genes that influence susceptibility to multiple sclerosis is a little like looking for a needle in a haystack: the effects of each relevant gene are modest, while the number of potentially relevant genes is large (there are around 30,000 human genes). In the past, researchers were only able to look at a few genes at a time and were therefore forced to try to predict which genes were most likely to be relevant. This strategy did not prove to be at all successful. Fortunately, advances in technology mean that it is now possible to be considerably more systematic and interrogate many thousands of genes simultaneously. Anticipating that this would become possible, the Medical Research Council (MRC) previously funded us to collect DNA from a set of patient volunteers. Using these and other samples, we joined the Wellcome Trust Case Control Consortium (WTCCC) and tested 15,430 so-called "single nucleotide polymorphisms (SNPs)" in these samples. An SNP is a change in the genetic code of a gene. These SNPs were carefully chosen and come from all over the genome. As part of the WTCCC team, we tested 994 patients and 1,476 controls with these variants, a total of 38 million individual analyses. Comparing cases and controls for each of the SNPs has enabled us to identify 729 showing evidence of potential relevance in multiple sclerosis. Reaching this stage has taken a huge effort.
In this proposal we are asking for support to test these 729 short-listed SNPs in a new set of cases (1,000) and controls (1,500) in order to refine the list and finally identify genuinely associated genes. We will then interrogate the most strongly implicated gene in fine detail in a third set of 1,000 trio families (an affected individual and their parents).

Technical Summary

Epidemiological analysis has consistently shown that genetic factors have a profound influence on susceptibility to multiple sclerosis. Association with the class II Human Leukocyte Antigen (HLA) gene DRB1 has been firmly established, and constitutes the largest single effect underlying susceptibility. Systematic screening for linkage has shown that other relevant susceptibility genes are likely to exert only modest individual effects (increasing risk by a factor of no more than 2.0) and will therefore require large association-based studies if they are to be identified.
Anticipating this requirement, the Medical Research Council (MRC) awarded us a grant to establish a collection of DNA samples from well-characterised patients for use in genetic studies (http://www.dna-network.ac.uk). During the period of this grant we joined the Wellcome Trust Case Control Consortium, through which we have been able to type 15,430 non-synonymous SNPs in 994 cases and 1,476 controls (38 million genotypes). As expected, the most strongly associated markers come from the Major Histocompatibility Complex (MHC) and have alleles in linkage disequilibrium (LD) with the multiple sclerosis associated DRB1*1501 allele. Excluding these expected MHC hits, there are a further 729 markers showing nominally significant evidence for association (p < 0.05), including rs6897932 from the IL7R gene, a SNP recently implicated and replicated as relevant in determining susceptibility to multiple sclerosis.
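The scale of the screen, and the reason a list of nominally significant markers needs replication, can be checked with a little arithmetic. The sketch below is illustrative only. It assumes that every one of the 15,430 SNPs was tested in every sample, that the nominal threshold is 0.05, and that the tests are independent under the null hypothesis; none of these assumptions is confirmed by the summary, and the function names are hypothetical.

```python
# Figures quoted in the summary.
N_SNPS = 15_430
N_CASES = 994
N_CONTROLS = 1_476

# Assumption: the nominal significance threshold is 0.05.
ALPHA = 0.05


def total_genotypes(n_snps: int, n_cases: int, n_controls: int) -> int:
    """Number of genotype calls if every SNP is typed in every sample."""
    return n_snps * (n_cases + n_controls)


def expected_null_hits(n_tests: int, alpha: float) -> float:
    """Expected number of tests passing `alpha` if no SNP is truly associated.

    Assumes independent tests with uniformly distributed null p-values.
    """
    return n_tests * alpha


genotypes = total_genotypes(N_SNPS, N_CASES, N_CONTROLS)
null_hits = expected_null_hits(N_SNPS, ALPHA)

print(f"{genotypes:,} genotypes in total")
print(f"{null_hits:.1f} nominal hits expected by chance alone")
```

Under these assumptions the product comes to roughly 38 million genotypes, matching the figure quoted, and several hundred nominal hits would be expected by chance alone. That is of the same order as the 729 observed outside the MHC, which suggests that most of the short list may be false positives and that an independent cohort is needed to separate them from genuine associations.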
We propose to refine our list of 729 potential associations by extending the typing of these variants into a second independent cohort of 1,000 cases and 1,500 controls. This extension analysis will enable us to select out the genuinely associated variants and exclude the false positive associations. We will then intensively interrogate the most strongly implicated gene. Initially we will resequence the gene in 50 cases from multiplex families in order to supplement the publicly available catalogue of variation in the gene. Informative variation from across the gene will then be typed in both original cohorts and an additional third cohort of 1,000 trio families (an affected individual and their parents) in order to construct an association map across the gene. By sharing our discoveries with colleagues from across Europe and the US we will be able to explore how associated variants are distributed between populations. We will also interrogate functional aspects of associated variants.
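As a rough illustration of how a single SNP might be assessed in the replication cohort, the sketch below runs an allelic chi-square test on a 2x2 table of allele counts. The summary does not state which test statistic or significance threshold will be used, so the choice of test, the Bonferroni threshold, the function names and the allele counts are all assumptions made for illustration; the counts are invented and are not study data.

```python
import math


def allelic_test(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    Returns (odds_ratio, chi_square, p_value).
    """
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    odds_ratio = (a * d) / (b * c)
    return odds_ratio, chi_square, p_value


# Hypothetical allele counts: 1,000 cases and 1,500 controls carry
# 2,000 and 3,000 alleles respectively (two per person).
odds_ratio, chi_square, p_value = allelic_test(
    case_minor=560, case_major=1_440,
    control_minor=750, control_major=2_250,
)

# Assumption: Bonferroni correction across the 729 short-listed SNPs.
bonferroni_threshold = 0.05 / 729

print(f"odds ratio = {odds_ratio:.2f}")
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
print(f"passes corrected threshold: {p_value < bonferroni_threshold}")
```

With these invented counts the SNP would reach nominal significance but not the corrected threshold, which shows why modest effects call for large cohorts. Trio families are normally analysed with family-based methods rather than a case-control table, so this sketch applies only to the second cohort.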
We believe this work will be extremely informative with regard to the pathogenesis of the disease and will thereby guide future research towards treatment and perhaps even prevention.
