Exploiting genetic information in the estimation of disease risk

Lead Research Organisation: London Sch of Hygiene and Trop Medicine
Department Name: Epidemiology and Population Health


We would like to be able to predict disease risk before disease events occur. This would allow us to target preventative measures or treatment effectively at those at highest risk. However, for many diseases this is very difficult to do, and our estimates of an individual's risk are very uncertain. One way to improve prediction is to incorporate genetic information, since we know that much of the variation in risk between individuals is attributable to genetic factors. So far, attempts to do this have concentrated on genetic variants that cause relatively large increases in risk; however, we know that there are relatively few of these, and many more variants of small effect. Thus most of the variation in risk between individuals is due to many genetic variants, each of which, taken alone, has a rather small effect on risk. To account for this we need to incorporate many more genetic variants into our predictive models. It is not clear how to do this; the objective of this grant is to develop improved ways of making predictions that incorporate information on many genetic variants simultaneously, and to assess the improvements that such models give relative to current predictive tools. The methods will then be applied to three diseases (breast cancer, coronary heart disease and pre-eclampsia).

Technical Summary

For many diseases, improved predictive models would be of great value. In principle, incorporating genetic information into predictive models should allow improved estimation of individual risk, but to date such models have considered only the small number of (relatively) large effects that have been convincingly associated with disease. It is clear that these contribute only a small proportion of the total genetic variance: the rest is made up of a large number of smaller effects, together with interactions of various kinds. Incorporating these smaller effects should improve predictive models. Although current genome-wide association (GWA) studies are underpowered to detect such effects at genome-wide levels of significance, they do allow estimation of relatively small effects, albeit with considerable uncertainty. If properly weighted to account for this uncertainty, the estimates may then be used in individual-level estimation of risk, potentially giving substantial improvements in the accuracy of such estimates. This project will develop novel methodology to use information from high-density SNP data in the prediction of individual risk, primarily based around recent advances in Bayesian model selection/shrinkage techniques. The methods developed will be tested on simulated data, and applied to three diseases (breast cancer, coronary heart disease and pre-eclampsia) which differ greatly both in the number and size of identified genetic effects and in the utility and type of current predictive models. A key focus will be on determining the potential value of genetic information in clinical prediction, and on examining the additional benefit over more established risk factors, including confirmed large-effect genetic variants.
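
To illustrate what weighting effect estimates by their uncertainty can mean, the sketch below applies the simplest form of Bayesian shrinkage. It assumes each SNP effect has a normal prior with mean zero and variance `tau2`, so the posterior mean of an effect is its estimate multiplied by `tau2 / (tau2 + se^2)`. This is an illustrative assumption only, not the methodology the project proposes; the function names, the prior variance, and all numbers are hypothetical.

```python
import numpy as np


def shrink_effects(beta_hat, se, tau2):
    """Posterior mean of each SNP effect under an assumed N(0, tau2) prior.

    beta_hat: estimated per-allele effects (e.g. log odds ratios)
    se:       standard errors of those estimates
    tau2:     assumed prior variance of the true effects (hypothetical)
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    se = np.asarray(se, dtype=float)
    # Noisy estimates (large se) are pulled more strongly towards zero.
    weight = tau2 / (tau2 + se**2)
    return weight * beta_hat


def risk_score(genotypes, beta_hat, se, tau2):
    """Sum of risk-allele counts weighted by the shrunken effects.

    genotypes: rows are individuals, columns are SNPs (allele counts 0, 1, 2)
    """
    genotypes = np.asarray(genotypes, dtype=float)
    return genotypes @ shrink_effects(beta_hat, se, tau2)


# Hypothetical inputs: three SNPs, two individuals.
beta_hat = [0.20, 0.05, -0.10]
se = [0.05, 0.10, 0.10]
tau2 = 0.01
genotypes = [[0, 1, 2],
             [2, 0, 1]]

shrunk = shrink_effects(beta_hat, se, tau2)
scores = risk_score(genotypes, beta_hat, se, tau2)
print(shrunk)
print(scores)
```

In this toy example the first SNP, which is estimated most precisely, keeps 80% of its estimated effect, while the two noisier SNPs keep only 50%. A full analysis would also need to choose the prior from the data and to account for correlation between neighbouring SNPs, neither of which this sketch attempts.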


Warren H (2012) 9q31.2-rs865686 as a susceptibility locus for estrogen receptor-positive breast cancer: evidence from the Breast Cancer Association Consortium. in Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology