Statistical genetics

Lead Research Organisation: MRC Biostatistics Unit


Most common diseases appear to run in families, but the actual cause of disease is a complex combination of multiple genetic and environmental factors. By identifying the genes that increase the risk of disease, it will become possible to predict how likely an individual is to develop disease, and then to intervene on lifestyle and environmental factors to reduce that risk. New drug targets can also be identified from genetics, and therapies can be directed to the individuals who are most likely to benefit. Identifying disease genes requires analysing thousands of individuals because the effect of each single gene is small. Current studies therefore require collaboration between multiple research groups, and statistical methods that make the best use of large-scale data. We are developing methods for identifying the most likely disease genes after testing every possible gene for a disease effect. We are also devising new ways to exploit data contained within families, and methods for combining information from disease and molecular studies. This work is mainly motivated by collaborative projects in coronary artery disease, schizophrenia, and osteoporosis.

Technical Summary

This programme is concerned with statistical methods for identifying and characterising the genetic risk factors for common disease. Recent technological advances now allow the whole genome to be interrogated for disease association, raising the possibility of personalised interventions and risk assessment, and identification of novel therapeutic targets. We will develop improved statistical methods for the large-scale studies that are becoming widespread in this field.

Multiple testing problems arise naturally in whole-genome scans. We are developing methods for improving the power to detect true associations in this context while controlling the false positive rate, and are comparing several frequentist and Bayesian approaches to ranking associations arising from a scan. We also address the selection bias that arises from estimating the effect sizes of the most significant tests. We are developing consistent methods for unbiased estimation and testing in both genome scan and meta-analysis contexts.

We will extend previous work on family-based association analysis to deal with extended sibships with missing parents, and missing genotypes or haplotype phase. This will include the imputation of marker data that have been genotyped in a publicly available reference panel but not in the main body of study data. We will develop methods and guidance for combining family data with case-control and other samples of unrelated individuals. We will implement robust estimation methods that allow extended haplotype information to be used to impute missing data.

Prior knowledge of biological pathways can add value to genetic analysis. We will develop methods for formally integrating pathway and annotation data with whole-genome association data, with the aim of improving power to simultaneously detect multiple variants acting in a related manner.

This work is motivated by several collaborative studies. In coronary artery disease we are finding genetic associations with molecular markers such as platelet response, lipids and C-reactive protein. In schizophrenia we are participating in a large international meta-analysis of thousands of individuals, and performing genome scans in extended pedigrees. In osteoporosis we are mapping the genetic causes of bone mineral density at multiple anatomical sites, using data from twin studies. Other collaborations are in place with groups studying multiple sclerosis, autism and epilepsy.
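
As a rough illustration of the multiple testing and selection bias problems described in the Technical Summary, the following Python sketch simulates a small genome scan and compares two standard thresholding rules: Bonferroni correction and the Benjamini-Hochberg procedure. It is not the programme's own methodology. The function names, the number of markers, the effect size and the random seed are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p_values, alpha):
    """Indices of tests with p <= alpha / m (controls family-wise error)."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


def benjamini_hochberg(p_values, q):
    """Indices rejected by the Benjamini-Hochberg step-up rule at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose ordered p-value is under its threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical scan: 10,000 null markers and 200 markers whose
# test statistic has true mean 3 (all values are assumptions).
TRUE_EFFECT = 3.0
rng = random.Random(1)
effects = [0.0] * 10_000 + [TRUE_EFFECT] * 200
z_scores = [rng.gauss(mu, 1.0) for mu in effects]
p_values = [two_sided_p(z) for z in z_scores]

bonf_hits = bonferroni(p_values, alpha=0.05)
bh_hits = benjamini_hochberg(p_values, q=0.05)

# Selection bias: among true signals that pass the Bonferroni
# threshold, the observed statistic overstates the true mean.
selected_true = [abs(z_scores[i]) for i in bonf_hits if effects[i] > 0]

print("Bonferroni hits:", len(bonf_hits))
print("Benjamini-Hochberg hits:", len(bh_hits))
if selected_true:
    print("Mean observed statistic among selected true signals:",
          round(mean(selected_true), 2), "vs true mean", TRUE_EFFECT)
```

In this sketch the Benjamini-Hochberg rule always rejects at least the tests that Bonferroni rejects, which shows the gain in power from controlling the false discovery rate instead of the family-wise error rate. The final comparison shows why effect estimates taken only from the most significant tests tend to be inflated: a true signal can pass a stringent threshold only when its observed statistic happens to be large.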


Title: Sibship association methods
Description: Methods and software for association analysis in sibships
Type Of Material: Model of mechanisms or symptoms - human
Year Produced: 2008
Provided To Others? Yes
Impact: Publication

Description: Statistical method for estimating genetic effects after a genomewide association scan
Type Of Material: Model of mechanisms or symptoms - human
Year Produced: 2009
Provided To Others? Yes
Impact: Publication awarded annual prize by International Genetic Epidemiology Society for best paper in its journal

Description: Genetics of multiple sclerosis
Organisation: University of Cambridge
Department: School of Clinical Medicine
Country: United Kingdom
Sector: Academic/University
PI Contribution: Statistical methods and analysis
Collaborator Contribution: Providing motivating applications for statistical research
Impact: Several publications
Start Year: 2008

Description: PGC
Organisation: Psychiatric GWAS Consortium (PGC)
Country: Global
Sector: Academic/University
PI Contribution: Statistical advice
Collaborator Contribution: Data analysis
Impact: Publications
Start Year: 2009

Description: UWA
Organisation: University of Western Australia
Country: Australia
Sector: Academic/University
PI Contribution: Expertise in statistical genetics
Collaborator Contribution: Genetic studies of osteoporosis and thyroid disease
Impact: Several publications

Description: Genetic association analysis in nuclear families and unrelated subjects, allowing for missing genotype data and uncertain haplotypes
Type Of Technology: Software
Year Produced: 2006
Open Source License? Yes
Impact: Over 1000 published applications by users

Description: Schizophrenia press release
Form Of Engagement Activity: A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? Yes
Geographic Reach: International
Primary Audience: Media (as a channel to the public)
Results and Impact: Authorship on a paper in Nature was accompanied by an MRC press release
Year(s) Of Engagement Activity: 2009